DeepSpeed
Support fp32 gradaccum for bf16 model
#2566
Merged

delock: allow bf16 model with fp32 gradient accumulation datatype (e1369363)
delock: allow fp32 gradient accumulation and bfloat16 model in amp mode (7c57bd98)
delock requested reviews from jeffra and tjruwase 3 years ago
tjruwase: Merge branch 'master' into gma/support_fp32_gradaccum_for_bf16_model (eda2f83c)
delock: alternative fix for grad accumulation type mismatch. In the case of … (62cad7a1)
tjruwase: Merge branch 'master' into gma/support_fp32_gradaccum_for_bf16_model (19d11394)
tjruwase approved these changes on 2022-12-05
tjruwase merged 06938835 into master 3 years ago
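
For reference, a minimal sketch of how a bf16 model with fp32 gradient accumulation is typically configured in DeepSpeed. The `data_types.grad_accum_dtype` key is assumed to be the config surface for this feature; the exact option name introduced or used by this PR may differ, so treat this as an illustration rather than the authoritative API, and consult the DeepSpeed config documentation.

```python
import torch
import deepspeed

# Hypothetical config: model and optimizer states in bfloat16,
# gradients accumulated in fp32 (assumed key: data_types.grad_accum_dtype).
ds_config = {
    "train_micro_batch_size_per_gpu": 8,
    "gradient_accumulation_steps": 4,
    "bf16": {"enabled": True},
    "data_types": {"grad_accum_dtype": "fp32"},
    "optimizer": {"type": "AdamW", "params": {"lr": 1e-4}},
}

# Placeholder model purely for illustration.
model = torch.nn.Linear(512, 512)

engine, optimizer, _, _ = deepspeed.initialize(
    model=model,
    model_parameters=model.parameters(),
    config=ds_config,
)
```

The intent of accumulating in fp32 while keeping the model in bf16 is to avoid precision loss when many micro-batch gradients are summed over the accumulation window.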