DeepSpeed
bf60fc0c - Support safetensors export (#6579)

Commit
241 days ago
Support safetensors export (#6579)

## Feature

This commit implements the following features:

- [x] support saving checkpoints as safetensors (a more commonly used format)
- [x] support sharding checkpoints (which is important for very large models)

Most of the code is borrowed from https://github.com/huggingface/transformers/blob/v4.45.1/src/transformers/modeling_utils.py#L2490

## Usage

For `pytorch_model.bin` export:

```
python zero_to_fp32.py . output_dir/
```

For `model.safetensors` export:

```
python zero_to_fp32.py . output_dir/ --safe_serialization
```

---------

Co-authored-by: Masahiro Tanaka <81312776+tohtana@users.noreply.github.com>
Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>
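
To illustrate the sharding idea mentioned under Feature, here is a minimal sketch of greedy size-based sharding. It is not the code from this commit or from transformers: the function name `plan_shards`, the use of plain byte counts in place of real tensors, and the shard file-name pattern are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple


def plan_shards(
    tensor_bytes: Dict[str, int], max_shard_bytes: int
) -> Tuple[List[Dict[str, int]], Dict[str, str]]:
    """Greedily group tensors into shards of at most max_shard_bytes.

    tensor_bytes maps a parameter name to its size in bytes (hypothetical
    stand-in for a real state dict). A tensor larger than the limit ends
    up in a shard of its own.
    """
    shards: List[Dict[str, int]] = [{}]
    current_size = 0
    for name, size in tensor_bytes.items():
        # Open a new shard when this tensor would push the current one
        # over the limit, unless the current shard is still empty.
        if shards[-1] and current_size + size > max_shard_bytes:
            shards.append({})
            current_size = 0
        shards[-1][name] = size
        current_size += size

    # Build an index from parameter name to shard file name.
    # The file-name pattern below is an assumption for this sketch.
    total = len(shards)
    weight_map: Dict[str, str] = {}
    for i, shard in enumerate(shards, start=1):
        filename = f"model-{i:05d}-of-{total:05d}.safetensors"
        for name in shard:
            weight_map[name] = filename
    return shards, weight_map


if __name__ == "__main__":
    sizes = {"a": 4, "b": 4, "c": 6, "d": 20, "e": 1}
    shards, weight_map = plan_shards(sizes, max_shard_bytes=10)
    for name, filename in weight_map.items():
        print(name, "->", filename)
```

The index returned alongside the shards is what would let a loader find which file holds a given parameter without opening every shard.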
  • deepspeed/utils
    • File
      zero_to_fp32.py