Add the missing view operations from sequence parallel (async). (#6750)
FYI @loadams
Some recent updates dropped a view operation that is present in the
original version:
https://github.com/microsoft/DeepSpeed/blob/17ed7c77c58611a923a6c8d2a3d21d359cd046e8/deepspeed/sequence/layer.py#L56
Add the missing view operation.
The shape required for the view cannot easily be obtained inside the
current function, so the layout params code is refactored to provide it.
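
For illustration only, the shape bookkeeping behind this kind of refactor can be sketched as below. This is a minimal sketch, not the code in this PR: `LayoutParams`, `compute_layout_params`, and the assumed chunk layout (a leading per-rank dimension before the all-to-all) are hypothetical names and simplifications. The idea is to compute the target shape of the view once, alongside the other layout parameters, so later steps do not have to re-derive it.

```python
from dataclasses import dataclass
from math import prod


@dataclass(frozen=True)
class LayoutParams:
    """Hypothetical container for illustration; not DeepSpeed's API."""
    pre_all2all_shape: tuple  # assumed shape handed to the all-to-all
    post_view_shape: tuple    # shape to restore with a view afterwards


def compute_layout_params(shape, scatter_idx, gather_idx, world_size):
    """Derive both shapes up front from the input shape and the indices."""
    if shape[scatter_idx] % world_size != 0:
        raise ValueError("scatter dimension must be divisible by world_size")

    # Each rank keeps 1/world_size of the scatter dimension.
    local = list(shape)
    local[scatter_idx] //= world_size

    # Assumption: one chunk per rank, stacked along a new leading dimension.
    pre = (world_size, *local)

    # After the exchange, the chunks are merged into the gather dimension.
    post = list(local)
    post[gather_idx] *= world_size
    post = tuple(post)

    # A view never changes the element count, so both shapes must agree.
    if prod(pre) != prod(post):
        raise AssertionError("view shape does not match element count")
    return LayoutParams(pre, post)


params = compute_layout_params((512, 2, 16, 64),
                               scatter_idx=2, gather_idx=0, world_size=4)
print(params.post_view_shape)
```

Carrying `post_view_shape` with the other layout parameters means the function that applies the view only has to read it.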
---------
Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>
Co-authored-by: Masahiro Tanaka <81312776+tohtana@users.noreply.github.com>