DeepSpeed
separate add and mul flops compute function
#1745
Merged
