fix falcon-40b accuracy issue (#4895)

2 years ago

This [PR](https://github.com/microsoft/DeepSpeed/pull/4721) added the "DecoderLayer": 'glmtype' entry, which causes the Falcon model to select the 'glmtype' fused_qkv_type. The Falcon model (including FalconDecoderLayer) needs to select 'bloomtype' explicitly.

Co-authored-by: Michael Wyatt <michaelwyatt@microsoft.com>
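
A minimal sketch of how this kind of mis-selection can arise. It assumes the fused QKV type is picked by substring-matching layer names against a table, which the commit message does not state. The table contents and the helper `pick_fused_qkv_type` are hypothetical illustrations, not DeepSpeed's actual code.

```python
# Hypothetical helper and tables for illustration only; not DeepSpeed's code.
def pick_fused_qkv_type(module_repr, table):
    """Return the type of the first table key found as a substring of module_repr."""
    for key, fused_type in table.items():  # dicts preserve insertion order
        if key in module_repr:
            return fused_type
    return None


# Only the generic key is present: "DecoderLayer" is a substring of
# "FalconDecoderLayer", so Falcon layers fall through to 'glmtype'.
generic_only = {
    "BloomBlock": "bloomtype",
    "DecoderLayer": "glmtype",
}

# An explicit Falcon entry listed before the generic key wins the match.
with_falcon = {
    "BloomBlock": "bloomtype",
    "FalconDecoderLayer": "bloomtype",
    "DecoderLayer": "glmtype",
}

falcon_repr = "FalconDecoderLayer(...)"  # stand-in for a layer's string form
print(pick_fused_qkv_type(falcon_repr, generic_only))  # glmtype
print(pick_fused_qkv_type(falcon_repr, with_falcon))   # bloomtype
```

Under this assumption, the more specific key has to be checked before the generic one; with first-match lookup, entry order decides which type a Falcon layer receives.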