Bfloat16 zero2 #1398

tjruwase merged 26 commits into deepspeedai:master from raamjad:bfloat16_zero2
raamjad Changes for bfloat16 Zero2
888b52bb
raamjad Cleaned up additional comments and debugging code
8a6a8348
raamjad Initial merge of BFloat16 changes to latest DeepSpeed master, still n…
a14fe096
raamjad Adapted fp16_master_weights_and_grads option to cover BF16
0946a669
raamjad Reverted fp16_master_weights_and_gradients extension to BFloat16 and …
470b9ad1
raamjad Merge branch 'master' into bfloat16_zero2
042cd4d6
raamjad requested a review from awan-10 4 years ago
raamjad requested a review from cli99 4 years ago
raamjad requested a review from conglongli 4 years ago
raamjad requested a review from eltonzheng 4 years ago
raamjad requested a review from jeffra 4 years ago
raamjad requested a review from minjiaz 4 years ago
raamjad requested a review from niumanar 4 years ago
raamjad requested a review from RezaYazdaniAminabadi 4 years ago
raamjad requested a review from samyam 4 years ago
raamjad requested a review from ShadenSmith 4 years ago
raamjad requested a review from tjruwase 4 years ago
ghost
raamjad Fixed formatting and variable naming errors recognized in testing
897da429
raamjad Merge branch 'master' into bfloat16_zero2
967cdf79
szhengac
raamjad Merge branch 'master' into bfloat16_zero2
ddd59cb7
raamjad Merge branch 'master' into bfloat16_zero2
470aafec
raamjad Merge branch 'master' into bfloat16_zero2
7a08c607
raamjad Merge branch 'master' into bfloat16_zero2
2cfc6182
tjruwase Merge branch 'master' into bfloat16_zero2
24a4cdaf
tjruwase Merge branch 'master' into bfloat16_zero2
2f22e0cb
raamjad Added relevant unit tests for bfloat16 with ZeRO-2
f32658f3
raamjad Merge branch 'master' into bfloat16_zero2
3bfeb69c
tjruwase commented on 2021-10-27
tjruwase Merge branch 'master' into bfloat16_zero2
49479391
raamjad Updates conditions for skipping BFloat16 unit tests
e1a7f445
raamjad Merge branch 'bfloat16_zero2' of github.com:raamjad/DeepSpeed into bf…
f11dc8f0
tjruwase Merge branch 'master' into bfloat16_zero2
8a6ca1e8
tjruwase approved these changes on 2021-10-29
raamjad Added check for NCCL inconsistent version naming convention
6895d9ea
tjruwase Merge branch 'master' into bfloat16_zero2
f1368433
raamjad Merge branch 'bfloat16_zero2' of github.com:raamjad/DeepSpeed into bf…
11c6576e
tjruwase Merge branch 'master' into bfloat16_zero2
e4218dcd
tjruwase Merge branch 'master' into bfloat16_zero2
1025ac8d
raamjad Update skip message for Bfloat16 tests to mention additional checks
fbc4571c
tjruwase merged 648f7bfa into master 4 years ago
