DeepSpeed
Add ZenFlow code for Stage 1 & 2
#7391
Merged

Commits
  • Add ZenFlow optimizers (zero stage 1&2) for ZeRO integration
    Antlera committed 90 days ago
  • Add ZenFlowConfig for optimizer configuration
    Antlera committed 90 days ago
  • Add ZenFlow (zero stage 1&2) integration in DeepSpeedEngine
    Antlera committed 90 days ago
  • Add unit tests for ZenFlowConfig
    Antlera committed 90 days ago
  • Fix initialization and update logic for ZenFlow optimizers
    Antlera committed 90 days ago
  • Add unit tests for ZenFlowSelectiveAdamW optimizer
    Antlera committed 90 days ago
  • Add ZenFlow tutorial documentation
    Antlera committed 90 days ago
  • Format code
    Antlera committed 90 days ago
  • Fix check_grad_overflow parameter in ZenFlowZeroOptimizer
    Antlera committed 90 days ago
  • Refactor ZenFlowZeroOptimizer methods to include communication data type
    Antlera committed 90 days ago
  • Merge remote-tracking branch 'upstream/master' into zenflow_zero1_2
    Antlera committed 90 days ago
  • Refactor ZenFlow integration in DeepSpeedEngine
    Antlera committed 90 days ago
  • Refactor ZenFlow function calls in DeepSpeedEngine
    Antlera committed 90 days ago
  • Merge branch 'master' into zenflow_zero1_2
    tohtana committed 87 days ago
  • Fix bugs in ZenFlow + ZeRO Stage 1 and gradient reduction logic
    JoshWoo2003 committed 87 days ago
  • Add unit tests for ZenFlow with ZeRO Stage 1 and 2
    JoshWoo2003 committed 87 days ago
  • Merge branch 'zenflow_zero1_2' of github.com:Antlera/DeepSpeed into zenflow_z1_2
    Antlera committed 87 days ago
  • Refactor ZenFlow integration using separate engine file
    Antlera committed 86 days ago
  • Fix missing `[comm_dtype]` and format code
    Antlera committed 86 days ago
  • Merge branch 'master' into zenflow_zero1_2
    tohtana committed 85 days ago
  • Update CPUADAM core range calculation in zenflow_stage_1_and_2.py
    Antlera committed 85 days ago
  • Merge branch 'zenflow_zero1_2' of github.com:Antlera/DeepSpeed into clr_branch_zenflow_z1_2
    Antlera committed 85 days ago
  • Fix bugs in ZenFlow unit tests
    JoshWoo2003 committed 79 days ago
  • Merge branch 'master' into zenflow_zero1_2
    sfc-gh-truwase committed 75 days ago
  • Merge remote-tracking branch 'origin/zenflow_zero1_2' into clr_branch_zenflow_z1_2 to fix unit tests.
    Antlera committed 72 days ago
  • Merge branch 'zenflow_zero1_2' of github.com:Antlera/DeepSpeed into clr_branch_zenflow_z1_2
    Antlera committed 72 days ago
  • Fix: Add PyTorch version check for ZenFlow configuration
    Antlera committed 72 days ago
  • Merge branch 'master' into zenflow_zero1_2
    tohtana committed 71 days ago
  • Enhance ZenFlow compatibility checks for PyTorch version
    Antlera committed 71 days ago
  • Merge branch 'zenflow_zero1_2' of github.com:Antlera/DeepSpeed into clr_branch_zenflow_z1_2
    Antlera committed 71 days ago
  • Merge branch 'master' into zenflow_zero1_2
    loadams committed 56 days ago
  • Fix bugs in ZenFlow unit tests when using CPU Torch
    JoshWoo2003 committed 56 days ago
  • Merge branch 'master' into zenflow_zero1_2
    sfc-gh-truwase committed 55 days ago
  • Merge branch 'master' into zenflow_zero1_2
    tjruwase committed 55 days ago
  • Added TODO comments to indicate the need for removing ZenFlow-specific calls from the vanilla ZeroOptimizer.
    Antlera committed 55 days ago
  • Merge branch 'zenflow_zero1_2' of github.com:Antlera/DeepSpeed into clr_branch_zenflow_z1_2
    Antlera committed 55 days ago
  • Fix formatting in test_zf.py
    Antlera committed 55 days ago
  • Update docs/_tutorials/zenflow.md
    Antlera committed 55 days ago
  • Merge branch 'master' into zenflow_zero1_2
    sfc-gh-truwase committed 52 days ago
  • Merge branch 'master' into zenflow_zero1_2
    Antlera committed 47 days ago
  • Merge branch 'master' into zenflow_zero1_2
    sfc-gh-truwase committed 47 days ago
  • Fix copyrights.
    Antlera committed 47 days ago
  • Remove CUDA specific code.
    Antlera committed 47 days ago
  • Merge branch 'master' into zenflow_zero1_2
    Antlera committed 46 days ago
  • Merge branch 'master' into zenflow_zero1_2
    sfc-gh-truwase committed 45 days ago
  • Merge branch 'master' into zenflow_zero1_2
    Antlera committed 45 days ago
  • Merge branch 'master' into zenflow_zero1_2
    tjruwase committed 45 days ago
  • Merge branch 'master' into zenflow_zero1_2
    sfc-gh-truwase committed 42 days ago
  • Merge branch 'master' into zenflow_zero1_2
    sfc-gh-truwase committed 42 days ago