DeepSpeed
wrap include cuda_bf16.h with ifdef BF16_AVAILABLE
#6520
Merged
