DeepSpeed
240c2a7b - Fix fused_qkv print model ValueError (#7109)

Commit
260 days ago
Fix fused_qkv print model ValueError (#7109)

Suppose qkv_linear_weight_shape = [in_features, out_features]. With the fused_qkv GEMM optimization, the qkv linear weight shape is instead [3, in_features, out_features]. Unpacking that three-dimensional shape into two values fails with "ValueError: too many values to unpack (expected 2)" when printing the model.

Solution: Take the last two dimensions of the weight shape as in_features and out_features.

Signed-off-by: Lai, Yejing <yejing.lai@intel.com>
Co-authored-by: Hongwei Chen <33092912+hwchen2017@users.noreply.github.com>
Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>
Signed-off-by: Logan Adams <loadams@microsoft.com>
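
The failure and the fix described in the commit message can be illustrated with a minimal sketch. This is not the DeepSpeed code itself: the function names `naive_unpack` and `shape_from_last_two` are hypothetical, and plain tuples stand in for tensor shapes, following the [in_features, out_features] convention stated above.

```python
def naive_unpack(weight_shape):
    """Hypothetical pre-fix behavior: assumes the shape has exactly two dims."""
    in_features, out_features = weight_shape  # raises ValueError for 3-D shapes
    return in_features, out_features


def shape_from_last_two(weight_shape):
    """Hypothetical post-fix behavior: read only the last two dimensions."""
    in_features, out_features = weight_shape[-2:]
    return in_features, out_features


plain_shape = (8, 16)      # [in_features, out_features]
fused_shape = (3, 8, 16)   # [3, in_features, out_features] with fused qkv

# Both layouts yield the same feature sizes when only the last two dims are read.
print(shape_from_last_two(plain_shape))
print(shape_from_last_two(fused_shape))

# The naive unpacking breaks on the fused layout.
try:
    naive_unpack(fused_shape)
except ValueError as err:
    print(f"ValueError: {err}")
```

Slicing with `[-2:]` keeps the two-dimensional case unchanged while ignoring any leading dimension, which is why the same line handles both the plain and the fused layout.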