DeepSpeed PR #4833 (Merged)
DeepSpeedZeroOptimizer: refactor bit16 flattening to support more accelerators

nelyahu ZeroOptimizer: avoid storage sharing when flatenning params
45d7a034
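
The commit above names the core idea of the refactor. A minimal, hypothetical sketch (not the PR's actual code) of what "avoid storage sharing when flattening" can mean in PyTorch terms: copy each bit16 parameter into a freshly allocated contiguous buffer, so the flat tensor owns its own storage and never aliases the originals. `flatten_params` and `alignment` are illustrative names, not DeepSpeed's API.

```python
import torch

def flatten_params(params, alignment=1):
    total = sum(p.numel() for p in params)
    padded = -(-total // alignment) * alignment  # round up to alignment
    flat = torch.zeros(padded, dtype=params[0].dtype, device=params[0].device)
    offset = 0
    for p in params:
        n = p.numel()
        # copy_ writes into the new buffer; nothing aliases the source param.
        flat.narrow(0, offset, n).copy_(p.data.view(-1))
        offset += n
    return flat

params = [torch.randn(3, 4).bfloat16(), torch.randn(5).bfloat16()]
flat = flatten_params(params, alignment=8)
assert flat.data_ptr() != params[0].data_ptr()  # independent storage
```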
nelyahu Merge branch 'microsoft:master' into zeroOptParamsFlatenning
324e33f9
nelyahu requested a review from tjruwase 2 years ago
nelyahu requested a review from mrwyattii 2 years ago
nelyahu changed the title from "DeepSpeedZeroOptimizer: refactor bit16 flatenning to support more accelerators" to "DeepSpeedZeroOptimizer: refactor bit16 flattening to support more accelerators" 2 years ago
tjruwase left 5 review comments on 2023-12-18
tjruwase self-assigned this 2 years ago
tjruwase Merge branch 'master' into zeroOptParamsFlatenning
2c794564
Use meta tensor instead of reconstructing CPU tensors
cbd6c70b
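
A hedged illustration of the "use meta tensor" commit: when only shape metadata is needed (for example, to size alignment padding), a tensor on PyTorch's meta device carries shape and dtype without allocating real memory on any device, so no CPU tensor has to be reconstructed. Variable names below are illustrative, not the PR's.

```python
import torch

group_numel = 21
alignment = 8
pad = (alignment - group_numel % alignment) % alignment

# A meta tensor records shape/dtype but allocates no storage on any device.
padding = torch.empty(pad, dtype=torch.bfloat16, device="meta")
print(padding.numel())   # 3
print(padding.is_meta)   # True
```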
tjruwase Merge branch 'master' into zeroOptParamsFlatenning
067e079f
tjruwase commented on 2024-01-02
fix orig_group_numel: missing accumulation
8b0d4cee
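
The commit message points at a missed accumulation when totaling a group's element count. A hypothetical sketch of that bug class, with illustrative names:

```python
def group_numel(params):
    orig_group_numel = 0
    for p in params:
        # Bug class named in the commit: a plain "=" here would keep only
        # the last parameter's size; totaling the group needs "+=".
        orig_group_numel += p.numel()
    return orig_group_numel
```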
tjruwase Merge branch 'master' into zeroOptParamsFlatenning
2e41d9ef
tjruwase approved these changes on 2024-01-03
tjruwase Merge branch 'master' into zeroOptParamsFlatenning
890ae610
mrwyattii Merge branch 'master' into zeroOptParamsFlatenning
5ade6863
fix orig_group_numel: move accumulation two lines above
c10b626f
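
A hedged reading of "move accumulation two lines above": the original-size counter should be bumped before any alignment padding is appended, so padding elements are never counted toward the group's real size. All names below are illustrative, not the PR's variables.

```python
import torch

def build_flat_group(params, alignment=8):
    orig_group_numel = 0
    tensors = []
    for p in params:
        orig_group_numel += p.numel()  # accumulate real elements first...
        tensors.append(p.data.view(-1))
    pad = (alignment - orig_group_numel % alignment) % alignment
    if pad:
        # ...then append padding, which must stay out of orig_group_numel.
        tensors.append(torch.zeros(pad, dtype=params[0].dtype,
                                   device=params[0].device))
    return torch.cat(tensors), orig_group_numel
```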
tjruwase Merge branch 'master' into zeroOptParamsFlatenning
57b01129
tjruwase merged ade98365 into master 2 years ago
nelyahu deleted the zeroOptParamsFlatenning branch 2 years ago
