DeepSpeed
f4cb866c - estimate_zero2_model_states_mem_needs: fixing memory estimation (#5099)

Commit
1 year ago
The estimate previously assumed 4 bytes per model parameter and 4 bytes per gradient. It now assumes 2 bytes for each, on the assumption that parameters and gradients are stored in FP16/BF16.

Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
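The arithmetic behind this change can be illustrated with a minimal sketch. This is not DeepSpeed's implementation: `param_and_grad_bytes` is a hypothetical helper, the 1-billion-parameter model is an arbitrary example, and the sketch covers only the parameter and gradient terms named in the commit message (optimizer states and buffers are left out).

```python
def param_and_grad_bytes(num_params: int,
                         bytes_per_param: int = 2,
                         bytes_per_grad: int = 2) -> int:
    """Bytes needed to hold one copy of the parameters and one copy of the gradients.

    Hypothetical helper for illustration only. The defaults of 2 bytes
    correspond to FP16/BF16 storage; 4 bytes would correspond to FP32.
    """
    return num_params * (bytes_per_param + bytes_per_grad)


# Arbitrary example size, chosen only to make the difference visible.
num_params = 1_000_000_000

# Old assumption: 4 bytes per parameter and 4 bytes per gradient.
old_bytes = param_and_grad_bytes(num_params, bytes_per_param=4, bytes_per_grad=4)

# New assumption: 2 bytes per parameter and 2 bytes per gradient (FP16/BF16).
new_bytes = param_and_grad_bytes(num_params)

gib = 2 ** 30
print(f"old assumption: {old_bytes / gib:.2f} GiB")
print(f"new assumption: {new_bytes / gib:.2f} GiB")
```

Under these assumptions the parameter and gradient terms shrink by half, so the earlier estimate overstated that part of the memory requirement by a factor of two for FP16/BF16 training.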