DeepSpeed
1787673e - fix num_kv_heads sharding in uneven autoTP for Falcon-40b (#4712)

Commit
1 year ago
fix num_kv_heads sharding in uneven autoTP for Falcon-40b (#4712)

Falcon-40b fails under uneven autoTP. Add 'num_kv_heads' to the kv_head_names list.

Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
Co-authored-by: Michael Wyatt <michaelwyatt@microsoft.com>
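The sketch below illustrates why a missing attribute name matters for uneven sharding. It is not DeepSpeed's code: `KV_HEAD_NAMES`, `find_num_kv_heads`, and `shard_sizes` are hypothetical names, the list contents are assumptions, and the config values (128 attention heads, 8 KV heads) are chosen for illustration only. The idea is that a lookup walks a list of candidate config attributes; if 'num_kv_heads' is absent from that list, the lookup can fall through to a different attribute and the per-rank split is computed from the wrong head count.

```python
from types import SimpleNamespace

# Hypothetical list of config attribute names that may hold the KV head count.
# Illustrative only; not copied from DeepSpeed.
KV_HEAD_NAMES = ["num_kv_heads", "num_key_value_heads", "num_attention_heads", "n_heads"]


def find_num_kv_heads(config, names=KV_HEAD_NAMES):
    """Return the first matching attribute value on config, or None."""
    for name in names:
        value = getattr(config, name, None)
        if value is not None:
            return value
    return None


def shard_sizes(num_kv_heads, world_size):
    """Split heads across ranks; the first `extra` ranks get one more head."""
    base, extra = divmod(num_kv_heads, world_size)
    return [base + (1 if rank < extra else 0) for rank in range(world_size)]


# Assumed Falcon-style config values, for illustration.
config = SimpleNamespace(num_attention_heads=128, num_kv_heads=8)

with_fix = find_num_kv_heads(config)
without_fix = find_num_kv_heads(config, names=["num_key_value_heads", "num_attention_heads"])

print(with_fix, shard_sizes(with_fix, 3))
print(without_fix, shard_sizes(without_fix, 3))
```

With 'num_kv_heads' in the list, the split across 3 ranks is based on the 8 KV heads; without it, this sketch falls back to the attention head count and produces shard sizes that no longer match the KV projection.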