DeepSpeed
fix num_kv_heads sharding in autoTP for the new in-repo Falcon-40B
#4654
Merged
