Coding with Qwen 2.5: An Overview


Image by Editor (Kanwal Mehreen) | Canva

 

Large language models (LLMs) are now used in many applications, and companies are racing to employ the best available models for a competitive advantage. This, in turn, has pushed many open-source initiatives to develop models that can stand alongside the best proprietary ones.

One family of models to watch is Qwen, developed by the Alibaba team. Their aim is to build open-source generalist models, including large language and multimodal models, that can compete even against closed-source models.

With the release of their latest model, Qwen2.5, the team has shown significant improvements in many areas. With so much potential, it is worth trying the model out ourselves.

This article will explore Qwen2.5 and how to use it in your work.

 

Qwen2.5 Model Family

 
As mentioned, Qwen2.5 is the latest series of Qwen models from the Alibaba group. It is available in seven parameter sizes (0.5B, 1.5B, 3B, 7B, 14B, 32B, and 72B), each in base and instruct variants. The models support contexts of up to 128K tokens, can generate up to 8K tokens, and cover 29 different languages.

The release is also accompanied by the specialized coding and mathematics models, Qwen2.5-Coder and Qwen2.5-Math, respectively. Both are valuable for specific use cases and are available immediately via Hugging Face.

Let’s try out the model ourselves. First, we need to install the libraries used in the code examples.

pip install transformers optimum auto-gptq qwen-vl-utils flash-attn --no-build-isolation

 

We will use the PyTorch framework, so select the build that is most appropriate for your environment.
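Before loading anything, it is worth confirming that PyTorch can see a GPU, since the GPTQ-quantized checkpoints used below are meant to run on one. Here is a minimal sanity check, assuming a standard PyTorch install:

import torch

# Check the installed PyTorch build and whether a CUDA GPU is visible;
# the GPTQ-quantized checkpoints used below expect a GPU.
print("PyTorch version:", torch.__version__)
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("Device:", torch.cuda.get_device_name(0))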

Let’s try using the transformers pipeline for text generation. In this example, we will use the quantized 7B Instruct variant.

from transformers import pipeline

# Chat-style input: a list of role/content dictionaries
messages = [
    {"role": "user", "content": "Who are you in one sentence?"}
]

pipe = pipeline("text-generation", model="Qwen/Qwen2.5-7B-Instruct-GPTQ-Int4", device=0)
result = pipe(
    messages,
    max_length=100,
    temperature=0.7,
    top_p=0.9,
    repetition_penalty=1.2
)

print(result)

 

Output:

'I am Qwen, an AI assistant created by Alibaba Cloud designed to help users generate various types of text content.'

 

We can see from the result above that the model can follow the prompt instructions well. Let’s try a prompt that pushes for more creativity. We will change the parameters a little, increasing both the token limit and the temperature.

messages = [
    {"role": "user", "content": "Write a short story about a robot who learns to paint landscapes."}
]
result = pipe(
    messages,
    max_length=200,        
    temperature=0.8,      
    top_p=0.9,            
    repetition_penalty=1.2  
)

print(result)

 

Output:

'In the bustling metropolis of Neo-Tokyo, there lived an unassuming robot named Pixel. Unlike his fellow robots in the factory where he was born, Pixel had always been fascinated with art and nature. His mechanical arms were often seen moving rhythmically as if painting invisible strokes on air'

 

As we can see from the output above, the model can generate a simple creative story. If we set max_length to a higher number, the story will become much longer.
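As a quick sketch, we could rerun the same messages with a larger budget; 500 is an arbitrary value chosen here for illustration:

# Same prompt, larger generation budget; 500 tokens is an
# arbitrary value chosen for illustration.
longer_result = pipe(
    messages,
    max_length=500,
    temperature=0.8,
    top_p=0.9,
    repetition_penalty=1.2
)

print(longer_result)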

Lastly, we can test with different roles, such as system and user, to better shape the result.

messages = [
    {"role": "system", "content": "You are a helpful and concise assistant who uses simple language."},
    {"role": "user", "content": "Explain how photosynthesis works."}
]

# Generate with the same pipeline as before
result = pipe(
    messages,
    max_length=200,
    temperature=0.7,
    top_p=0.9,
    repetition_penalty=1.2
)

print(result)

 

Output:

'Sure! Photosynthesis is the process by which plants, algae, and some bacteria make their own food using sunlight. Here’s a simplified breakdown of how it works:\n\n1. **Light Absorption**: Plants have green pigments called chlorophyll in structures called chloroplasts within their cells. Chlorophyll captures light energy from the sun'

 

The result is good, especially with the system role, which lets us shape the style of the response as we want.
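As a quick illustration, swapping only the system message steers the tone of the same answer; the persona below is just a made-up example:

# Same user question, different system persona; only the system
# message changes, which shifts the style of the reply.
messages = [
    {"role": "system", "content": "You are a patient teacher explaining things to a ten-year-old."},
    {"role": "user", "content": "Explain how photosynthesis works."}
]

result = pipe(messages, max_length=200, temperature=0.7)
print(result)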

Next, we will try out the various models in the Qwen2.5 family. To make this easier, we will create a function to call each model.

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

def qwen_model_testing(model_name, prompt):
    # Load the checkpoint and its tokenizer
    model = AutoModelForCausalLM.from_pretrained(
        model_name,
        torch_dtype="auto",
        device_map="auto"
    )
    tokenizer = AutoTokenizer.from_pretrained(model_name)

    messages = [
        {"role": "system", "content": "You are Qwen, created by Alibaba Cloud. You are a helpful assistant."},
        {"role": "user", "content": prompt}
    ]

    # Render the chat messages into the model's prompt format
    text = tokenizer.apply_chat_template(
        messages,
        tokenize=False,
        add_generation_prompt=True
    )
    model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

    # Generate a response, then strip the prompt tokens from the output
    generated_ids = model.generate(
        **model_inputs,
        max_new_tokens=512
    )
    generated_ids = [
        output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
    ]

    response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
    print(response)

 

First, we will use one of the smaller instruct variants to see how it performs. We will also try zero-shot prompting to use the model as a sentiment analyzer.

model_name = "Qwen/Qwen2.5-Coder-1.5B-Instruct-GPTQ-Int4"
prompt = """Classify the following sentence as either "Positive" or "Negative" in sentiment:

  "Overall, I wouldn't recommend this product to anyone."
  Output:"""

qwen_model_testing(model_name, prompt)

 

Output:

 
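Since the prompt is a plain string, the same zero-shot template can be reused across inputs. Here is a small sketch with made-up example sentences; note that qwen_model_testing reloads the model on every call, so for many inputs you would load it once instead:

# Reuse the same zero-shot template for several made-up sentences.
sentences = [
    "The battery lasts all day and the screen is gorgeous.",
    "It broke after two days and support never answered.",
]

for sentence in sentences:
    prompt = f"""Classify the following sentence as either "Positive" or "Negative" in sentiment:

  "{sentence}"
  Output:"""
    qwen_model_testing(model_name, prompt)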

Then, we will try out the Coder variant, a model trained specifically for code generation, code refinement, and code review. In this example, we will ask it to generate Python code while explaining the concept behind it.

model_name = "Qwen/Qwen2.5-Coder-7B-Instruct-GPTQ-Int4"
prompt = "Explain the time complexity of the binary search algorithm and provide a short Python code snippet that implements it."

qwen_model_testing(model_name, prompt)

 

Output:

The time complexity of the binary search algorithm is O(log n), where n is the number of elements in the sorted array on which the search is performed. This is because binary search works by repeatedly dividing the search interval in half. If the value of the search key is less than the item in the middle of the interval, narrow the interval to the lower half. Otherwise, narrow it to the upper half. Repeatedly check until the value is found or the interval is empty.

Here is a simple Python code snippet implementing binary search:

```python
def binary_search(arr, target):
    low = 0
    high = len(arr) - 1

    while low <= high:
        mid = (low + high) // 2
        if arr[mid] == target:
            return mid
        elif arr[mid] > target:
            high = mid - 1
        else:
            low = mid + 1

    return None

# Example usage:
arr = [1, 3, 5, 7, 9]
target = 5
result = binary_search(arr, target)
if result is not None:
    print(f"Element found at index {result}")
else:
    print("Element not found")
```

In this code, `binary_search` takes an array `arr` and a `target` value as inputs. It initializes two pointers, `low` and `high`, to represent the current search interval. The loop continues until `low` exceeds `high`. In each iteration, it calculates the middle point `mid`, checks if the middle element is the target, and adjusts the search interval accordingly. If the target is not found, it returns `None`.

 

The result is quite good, with an explanation that matches the given instruction.

Lastly, we will try the Math variant, the Qwen model specially trained for mathematical reasoning.

model_name = "Qwen/Qwen2.5-Math-7B-Instruct"
prompt = "Find the next number in this sequence and explain your reasoning: 2, 9, 30, 93, ..."

qwen_model_testing(model_name, prompt)

 

Output:

To find the next number in the sequence 2, 9, 30, 93, ..., we need to identify the pattern in the sequence. Let's examine the differences between consecutive terms:

- The difference between the second term (9) and the first term (2) is \(9 - 2 = 7\).
- The difference between the third term (30) and the second term (9) is \(30 - 9 = 21\).
- The difference between the fourth term (93) and the third term (30) is \(93 - 30 = 63\).

Now, let's look at the sequence of these differences: 7, 21, 63. We notice that each term in this new sequence is three times the previous term:

- \(21 = 7 \times 3\)
- \(63 = 21 \times 3\)

Following this pattern, the next term in the sequence of differences should be \(63 \times 3 = 189\).

To find the next term in the original sequence, we add this difference to the last term of the original sequence:

\[93 + 189 = 282\]

Therefore, the next number in the sequence is \(\boxed{282}\).

 

The model is able to reason well about the given mathematical sequence problem, which shows its potential to help with a wide range of problems in the future.
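We can also sanity-check the arithmetic in the model’s answer ourselves; this short snippet reproduces the difference pattern it described:

# Reproduce the model's reasoning: each difference between consecutive
# terms is three times the previous one, so the next term is 93 + 189.
sequence = [2, 9, 30, 93]
diffs = [b - a for a, b in zip(sequence, sequence[1:])]
print("Differences:", diffs)  # [7, 21, 63]

next_diff = diffs[-1] * 3     # 189
print("Next term:", sequence[-1] + next_diff)  # 282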

You can check out the Qwen Documentation for further reading.

 

Conclusion

 
Qwen 2.5 is the newest addition to the Qwen family from the Alibaba group. It is available in seven parameter sizes, each with base and instruct variants. Coder- and math-specialized models were also released as part of the Qwen 2.5 family.

In this article, we used example code to demonstrate how to use the Qwen 2.5 model and its variants.

I hope this has helped!
 
 

Cornellius Yudha Wijaya is a data science assistant manager and data writer. While working full-time at Allianz Indonesia, he loves to share Python and data tips via social media and writing media. Cornellius writes on a variety of AI and machine learning topics.
