DeepSpeed
6ba96289 - Fixing inference api for FP32 and non-masking GPT-based models (#1204)

Commit
4 years ago
Fixing inference api for FP32 and non-masking GPT-based models (#1204)

* fixing inference api for FP32 and non-masking GPT-based models
* use a dummy tensor if input_mask is none
* fix input_mask
* minor fix
* send input_mask to compute_attn func for checking
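
The "dummy tensor if input_mask is none" item can be illustrated with a minimal, dependency-free sketch. This is not the DeepSpeed code: the names `resolve_input_mask` and `masked_softmax` are hypothetical, plain Python lists stand in for tensors, and the additive-mask convention (0.0 keeps a position, -inf hides it) is an assumption made for the example.

```python
import math
from typing import List, Optional

# Assumed convention: additive mask, 0.0 keeps a position, -inf hides it.
Mask = List[List[float]]


def resolve_input_mask(input_mask: Optional[Mask], batch: int, seq_len: int) -> Mask:
    """Return the caller's mask, or an all-zero placeholder when none is given.

    Hypothetical helper: the placeholder lets downstream code always receive
    a mask argument, while changing no attention score.
    """
    if input_mask is None:
        return [[0.0] * seq_len for _ in range(batch)]
    return input_mask


def masked_softmax(scores: List[float], mask_row: List[float]) -> List[float]:
    """Softmax over scores after adding the mask row element-wise."""
    shifted = [s + m for s, m in zip(scores, mask_row)]
    peak = max(shifted)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in shifted]
    total = sum(exps)
    return [e / total for e in exps]
```

Under these assumptions, a model that supplies no mask gets the same probabilities as an unmasked softmax, while a model that does supply one has its mask passed through unchanged.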