DeepSpeed
541e423a - Enable tensor fragments for zero 2 & 3 (#2727)

Commit

2 years ago

Enable tensor fragments for zero 2 & 3 (#2727) * Enable tensor fragments for zero 2 * Update deepspeed/utils/tensor_fragment.py Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> * Update deepspeed/utils/tensor_fragment.py Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> * Support offload * Support multi-gpu * Cleanup * WIP * Update deepspeed/runtime/zero/stage3.py Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> * Support padding * Update deepspeed/runtime/zero/stage3.py Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> * z3 optimizer state support; aligned api * Support frozen z3 params * Unit tests * Check NVMe offload capability * Formatting * Docs * More docs * More docs * Update docs/code-docs/source/zero3.rst Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> * More docs * Update docs/code-docs/source/zero3.rst Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> * More docs * More docs * Update docs/code-docs/source/zero3.rst Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> * Update deepspeed/utils/tensor_fragment.py Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> * More docs * Support unsharded fp32 grad * Remove debug prints * Fix off-by-one detection of empty grads * Update deepspeed/utils/tensor_fragment.py Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> * Update deepspeed/utils/tensor_fragment.py Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> * Update deepspeed/utils/tensor_fragment.py Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> * Update deepspeed/runtime/zero/stage3.py Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> * Fix off-by-one error * Skip ranks with no gradient data * Formatting * Add license * Fix license --------- Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> Co-authored-by: Michael Wyatt <michaelwyatt@microsoft.com>

References

#2727 - Enable tensor fragments for zero 2 & 3

Author

tjruwase

Parents

da84e60d

Files10

deepspeed
- runtime
  - bf16_optimizer.py
  - zero
    - stage3.py
    - stage_1_and_2.py
- utils
  - __init__.py
  - mixed_precision_linkage.py
  - tensor_fragment.py
  - zero_to_fp32.py
docs/code-docs/source
- zero3.rst
tests/unit
- runtime/zero
  - test_zero_tensor_fragment.py
- util.py

DeepSpeed 541e423a - Enable tensor fragments for zero 2 & 3 (#2727)

DeepSpeed
541e423a - Enable tensor fragments for zero 2 & 3 (#2727)