DeepSpeed
bcc617a0 - Add fp16 support of Qwen1.5 models (0.5B to 72B) to DeepSpeed-FastGen (#5219)

Commit
1 year ago
Qwen1.5 is the beta version of Qwen2. This PR adds fp16 support for Qwen1.5 models from 0.5B to 72B.

### Test Code

For the mii pipeline:

```python
import mii

# Build a DeepSpeed-FastGen pipeline for the smallest Qwen1.5 checkpoint.
pipe = mii.pipeline("Qwen/Qwen1.5-0.5B")
# Greedy decoding, so the output can be compared with Hugging Face.
responses = pipe("DeepSpeed is", max_new_tokens=128, do_sample=False)
# Print from rank 0 only to avoid duplicate output when sharded.
if pipe.is_rank_0:
    print(responses[0])
```

For Hugging Face:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen1.5-0.5B")
# Load the reference model in fp16 to match the FastGen run.
model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen1.5-0.5B",
    device_map="auto",
    torch_dtype=torch.float16,
    trust_remote_code=True,
).eval()

inputs = tokenizer('DeepSpeed is', return_tensors='pt')
inputs = inputs.to(model.device)
# Greedy decoding with no repetition penalty, matching the mii pipeline call.
pred = model.generate(**inputs, max_new_tokens=128, do_sample=False, repetition_penalty=1.0)
test = tokenizer.decode(pred.cpu()[0], skip_special_tokens=False)
print(test)
```

### Qwen1.5-0.5B

Hugging Face output with prompt "DeepSpeed is":

```
a new and innovative way to store and retrieve data in the cloud. It is a cloud-based data storage solution that allows you to store and retrieve data in the cloud. DeepSpeed is a cloud-based data storage solution that allows you to store and retrieve data in the cloud. It is a cloud-based data storage solution that allows you to store and retrieve data in the cloud. DeepSpeed is a cloud-based data storage solution that allows you to store and retrieve data in the cloud. It is a cloud-based data storage solution that allows you to store and retrieve data in the cloud. DeepSpeed is a cloud-based data storage solution that allows you
```

DeepSpeed-FastGen output with prompt "DeepSpeed is":

```
a new and innovative way to store and retrieve data in the cloud. It is a cloud-based data storage solution that allows you to store and retrieve data in the cloud. DeepSpeed is a cloud-based data storage solution that allows you to store and retrieve data in the cloud. It is a cloud-based data storage solution that allows you to store and retrieve data in the cloud. DeepSpeed is a cloud-based data storage solution that allows you to store and retrieve data in the cloud. It is a cloud-based data storage solution that allows you to store and retrieve data in the cloud. DeepSpeed is a cloud-based data storage solution that allows you
```

### Qwen1.5-72B-Chat

Hugging Face output with prompt "DeepSpeed is" (for display, each ``` inside the model output is replaced with '''):

```
为 PyTorch 提供的深度学习训练加速库,它集成了多种优化技术,包括混合精度训练、模型并行、数据并行、ZeRO 内存优化等,可以显著提高模型训练的速度和效率。以下是一个简单的使用 DeepSpeed 的例子:

首先,你需要安装 DeepSpeed。在你的终端中运行以下命令:

'''bash
pip install deepspeed
'''

然后,你可以使用 DeepSpeed 来训练你的 PyTorch 模型。以下是一个简单的例子,使用 ResNet-50 训练 CIFAR-10 数据集:

'''python
import torch
```

DeepSpeed-FastGen output with prompt "DeepSpeed is" with 8-way sharding:

```
为 PyTorch 提供的深度学习训练加速库,它集成了多种优化技术,包括混合精度训练、模型并行、数据并行、ZeRO 内存优化等,可以显著提高模型训练的速度和效率。以下是一个简单的使用 DeepSpeed 的例子:

首先,你需要安装 DeepSpeed。在你的终端中运行以下命令:

'''bash
pip install deepspeed
'''

然后,你可以使用 DeepSpeed 来训练你的 PyTorch 模型。以下是一个简单的例子,使用 ResNet-50 训练 CIFAR-10 数据集:

'''python
import torch
```

Co-authored-by: Michael Wyatt <michaelwyatt@microsoft.com>
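The outputs above are compared by eye. As a minimal sketch of doing the same check programmatically, the helper below reports where two generated strings first differ. The function names `first_divergence` and `outputs_match` are hypothetical and not part of DeepSpeed, MII, or transformers. The sketch assumes the Hugging Face text may still carry the prompt as a prefix while the FastGen text does not.

```python
def first_divergence(a: str, b: str) -> int:
    """Return the index of the first differing character, or -1 if equal."""
    for i, (ca, cb) in enumerate(zip(a, b)):
        if ca != cb:
            return i
    # No mismatch in the shared prefix: either identical, or one is longer.
    return -1 if len(a) == len(b) else min(len(a), len(b))


def outputs_match(hf_text: str, fastgen_text: str, prompt: str = "") -> bool:
    """Compare two generations after dropping an optional prompt prefix."""
    # Assumption: only the Hugging Face string may start with the prompt.
    if prompt and hf_text.startswith(prompt):
        hf_text = hf_text[len(prompt):]
    # Ignore leading and trailing whitespace differences between backends.
    return first_divergence(hf_text.strip(), fastgen_text.strip()) == -1
```

With greedy decoding (`do_sample=False`) on both sides, an exact match is the expected result for a correct port, so a non-negative index from `first_divergence` points at the first character where the two backends disagree.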