DeepSpeed
bc76b04e - Add the missing view operations from sequence parallel(async). (#6750)

Commit
140 days ago
Add the missing view operations from sequence parallel(async). (#6750)

FYI @loadams

a view operation was missing in some updates compared to the original version
https://github.com/microsoft/DeepSpeed/blob/17ed7c77c58611a923a6c8d2a3d21d359cd046e8/deepspeed/sequence/layer.py#L56

add missing view operation. The shape required for the view cannot be easily obtained in the current function, so refactor layout params code.

---------

Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>
Co-authored-by: Masahiro Tanaka <81312776+tohtana@users.noreply.github.com>
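To illustrate why a final view matters after a sequence-parallel all-to-all, here is a minimal single-process sketch. It is not the code from `layer.py`: NumPy arrays stand in for torch tensors, `simulated_all_to_all` is a hypothetical helper that mimics the equal-split semantics of `torch.distributed.all_to_all_single`, and the sizes and the name `scatter_heads_gather_seq` are assumptions chosen for illustration.

```python
import numpy as np

WORLD = 2                  # hypothetical sequence-parallel group size
SEQ, HEADS, DIM = 4, 4, 3  # full (unsharded) sizes, chosen for illustration


def simulated_all_to_all(send_buffers):
    """Hypothetical stand-in for an equal-split all-to-all.

    send_buffers[src] has shape [WORLD, ...]; chunk `dst` of rank `src`
    is delivered to rank `dst`, stacked in source-rank order.
    """
    world = len(send_buffers)
    return [np.stack([send_buffers[src][dst] for src in range(world)])
            for dst in range(world)]


def scatter_heads_gather_seq(full):
    """Scatter the head dimension and gather the sequence dimension."""
    seq_local = SEQ // WORLD
    heads_local = HEADS // WORLD

    # Each rank starts with a contiguous slice of the sequence.
    shards = [full[r * seq_local:(r + 1) * seq_local] for r in range(WORLD)]

    # Group heads by destination rank and move that group axis to the front.
    send = [np.ascontiguousarray(
                s.reshape(seq_local, WORLD, heads_local, DIM).transpose(1, 0, 2, 3))
            for s in shards]

    # Raw result per rank: [WORLD, seq_local, heads_local, DIM].
    raw = simulated_all_to_all(send)

    # The "view" step: fold the source-rank axis into the sequence axis.
    merged = [r.reshape(WORLD * seq_local, heads_local, DIM) for r in raw]
    return raw, merged


full = np.arange(SEQ * HEADS * DIM).reshape(SEQ, HEADS, DIM)
raw, merged = scatter_heads_gather_seq(full)
print(raw[0].shape, merged[0].shape)
```

In this sketch the raw all-to-all output keeps an extra leading source-rank axis. Only after the final reshape does each rank hold the full sequence for its own slice of heads. The target shape depends on the local sequence and head sizes, which is consistent with the commit's note that the shape for the view has to be made available to the function.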
  • deepspeed/sequence
    • File
      layer.py