Add fp16 support of Qwen1.5 models (0.5B to 72B) to DeepSpeed-FastGen (#5219)
Qwen1.5 is the beta version of Qwen2.
This PR adds fp16 support for Qwen1.5 models, from 0.5B to 72B, to DeepSpeed-FastGen.
### Test Code
For the MII pipeline:
```python
import mii
pipe = mii.pipeline("Qwen/Qwen1.5-0.5B")
responses = pipe("DeepSpeed is", max_new_tokens=128, do_sample=False)
if pipe.is_rank_0:
    print(responses[0])
```
For Hugging Face Transformers:
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from transformers.generation import GenerationConfig
import torch
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen1.5-0.5B")
model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen1.5-0.5B", device_map="auto", torch_dtype=torch.float16, trust_remote_code=True).eval()
inputs = tokenizer('DeepSpeed is', return_tensors='pt')
inputs = inputs.to(model.device)
pred = model.generate(**inputs, max_new_tokens=128, do_sample=False, repetition_penalty=1.0)
test = tokenizer.decode(pred.cpu()[0], skip_special_tokens=False)
print(test)
```
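To compare the two outputs programmatically rather than by eye, a small helper along these lines could be used. This is only a sketch: `strip_prompt` and `first_divergence` are hypothetical names, not part of MII or Transformers, and it assumes both generations have already been captured as plain strings. The Hugging Face decode includes the prompt tokens, so the sketch removes the prompt before comparing.

```python
from itertools import zip_longest


def strip_prompt(text: str, prompt: str) -> str:
    """Drop a leading prompt from decoded text, if it is present."""
    return text[len(prompt):] if text.startswith(prompt) else text


def first_divergence(a: str, b: str) -> int:
    """Return the index of the first differing character, or -1 if a == b."""
    for i, (ca, cb) in enumerate(zip_longest(a, b)):
        # zip_longest pads the shorter string with None, so a length
        # mismatch is reported at the end of the shorter string.
        if ca != cb:
            return i
    return -1


if __name__ == "__main__":
    prompt = "DeepSpeed is"
    # Placeholder strings; substitute the real outputs from the two scripts.
    hf_text = strip_prompt("DeepSpeed is a library", prompt)
    fastgen_text = " a library"
    idx = first_divergence(hf_text, fastgen_text)
    print("outputs match" if idx == -1 else f"outputs diverge at character {idx}")
```

With `do_sample=False` in both scripts, greedy decoding is deterministic, so a result of `-1` is the expected outcome when the two implementations agree.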
### Qwen1.5-0.5B
Hugging Face output with prompt "DeepSpeed is":
```
a new and innovative way to store and retrieve data in the cloud. It is a cloud-based data storage solution that allows you to store and retrieve data in the cloud. DeepSpeed is a cloud-based data storage solution that allows you to store and retrieve data in the cloud. It is a cloud-based data storage solution that allows you to store and retrieve data in the cloud. DeepSpeed is a cloud-based data storage solution that allows you to store and retrieve data in the cloud. It is a cloud-based data storage solution that allows you to store and retrieve data in the cloud. DeepSpeed is a cloud-based data storage solution that allows you
```
DeepSpeed-FastGen output with prompt "DeepSpeed is":
```
a new and innovative way to store and retrieve data in the cloud. It is a cloud-based data storage solution that allows you to store and retrieve data in the cloud. DeepSpeed is a cloud-based data storage solution that allows you to store and retrieve data in the cloud. It is a cloud-based data storage solution that allows you to store and retrieve data in the cloud. DeepSpeed is a cloud-based data storage solution that allows you to store and retrieve data in the cloud. It is a cloud-based data storage solution that allows you to store and retrieve data in the cloud. DeepSpeed is a cloud-based data storage solution that allows you
```
### Qwen1.5-72B-Chat
Hugging Face output with prompt "DeepSpeed is" (triple backticks in the
output are shown as ''' so that the block renders correctly):
```
为 PyTorch 提供的深度学习训练加速库,它集成了多种优化技术,包括混合精度训练、模型并行、数据并行、ZeRO 内存优化等,可以显著提高模型训练的速度和效率。以下是一个简单的使用 DeepSpeed 的例子:
首先,你需要安装 DeepSpeed。在你的终端中运行以下命令:
'''bash
pip install deepspeed
'''
然后,你可以使用 DeepSpeed 来训练你的 PyTorch 模型。以下是一个简单的例子,使用 ResNet-50 训练 CIFAR-10 数据集:
'''python
import torch
```
DeepSpeed-FastGen output with prompt "DeepSpeed is", using 8-way sharding:
```
为 PyTorch 提供的深度学习训练加速库,它集成了多种优化技术,包括混合精度训练、模型并行、数据并行、ZeRO 内存优化等,可以显著提高模型训练的速度和效率。以下是一个简单的使用 DeepSpeed 的例子:
首先,你需要安装 DeepSpeed。在你的终端中运行以下命令:
'''bash
pip install deepspeed
'''
然后,你可以使用 DeepSpeed 来训练你的 PyTorch 模型。以下是一个简单的例子,使用 ResNet-50 训练 CIFAR-10 数据集:
'''python
import torch
```
Co-authored-by: Michael Wyatt <michaelwyatt@microsoft.com>