DeepSpeed
Use odd shape tensor to represent parameter data in partitioned state
#981
Merged
