DeepSpeed
sequence parallel with communication overlap
#5691
Merged

Commits
  • fix ds-sp grad scale for zero0
    inkcherry committed 1 year ago
  • enable o compute async
    inkcherry committed 1 year ago
  • enable qk bwd async all2all
    inkcherry committed 1 year ago
  • fwd optimi
    inkcherry committed 1 year ago
  • fix1 remove linear arg, remove note
    inkcherry committed 1 year ago
  • async qkv fwd, optimi cpu ,make fwd call fast
    inkcherry committed 1 year ago
  • update
    inkcherry committed 1 year ago
  • refine code
    inkcherry committed 1 year ago
  • refine code
    inkcherry committed 1 year ago
  • Revert "fix ds-sp grad scale for zero0"
    inkcherry committed 1 year ago
  • Merge remote-tracking branch 'upstream/master' into sp_overlap_comm
    inkcherry committed 1 year ago
  • fix format
    inkcherry committed 1 year ago
  • fix format
    inkcherry committed 1 year ago
  • refine code
    inkcherry committed 1 year ago
  • add register for v, ensuring they launch on a single thread.
    inkcherry committed 1 year ago
  • Merge branch 'master' into sp_overlap_comm
    tjruwase committed 1 year ago
  • remove v
    inkcherry committed 1 year ago
  • remove v
    inkcherry committed 1 year ago
  • fix notes and format
    inkcherry committed 1 year ago
  • Merge branch 'master' into sp_overlap_comm
    HeyangQin committed 1 year ago