Support fp32 gradaccum for bf16 model #2566
allow bf16 model with fp32 gradient accumulation datatype
e1369363
allow fp32 gradient accumulation and bfloat16 model in amp mode
7c57bd98
Merge branch 'master' into gma/support_fp32_gradaccum_for_bf16_model
eda2f83c
alternative fix for grad accumulation type mismatch. In the case of …
62cad7a1
Merge branch 'master' into gma/support_fp32_gradaccum_for_bf16_model
19d11394
tjruwase approved these changes on 2022-12-05
tjruwase merged 06938835 into master 3 years ago