Fix sequence parallelism for the dense model architecture (#4530)

Co-authored-by: Masahiro Tanaka <mtanaka@microsoft.com>
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
Co-authored-by: Sam Ade Jacobs <samjacobs@microsoft.com>
Co-authored-by: Michael Wyatt <michaelwyatt@microsoft.com>