Megatron-DeepSpeed
bdc6ad6d - Fix glu activation (#148)

Commit
4 years ago
Fix glu activation (#148)

* Make sure to use glu activation when specified
* Woops forgot DS config
* Upsample ffn_hidden_size when glu is used
* Woops
* Replace assert with raising exception instead
* fix bug

Co-authored-by: Stas Bekman <stas@stason.org>
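
The "Upsample ffn_hidden_size when glu is used" item is easier to follow with a small example. A GLU-style activation splits the projected tensor into two halves and gates one with the other, so the first feed-forward projection has to be twice as wide to keep the post-activation width at ffn_hidden_size. The sketch below is a pure-Python illustration, not the repository's code: the function names, the set of variants, and the choice of which half acts as the gate are all assumptions made for the example. It also raises an exception on an unknown variant rather than asserting, in the spirit of the "Replace assert with raising exception instead" item.

```python
import math

# Hypothetical table of gate functions, keyed by variant name.
# These names and this table are illustrative assumptions only.
GLU_VARIANTS = {
    "geglu": lambda x: 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0))),
    "liglu": lambda x: x,
    "reglu": lambda x: max(x, 0.0),
    "swiglu": lambda x: x / (1.0 + math.exp(-x)),
}


def first_projection_width(ffn_hidden_size, glu_activation=None):
    """Return the output width needed for the first FFN projection.

    Without a GLU variant the width is ffn_hidden_size. With one, the
    projection must be doubled because the activation halves it again.
    """
    if glu_activation is None:
        return ffn_hidden_size
    if glu_activation not in GLU_VARIANTS:
        # Raise instead of assert so the check survives `python -O`.
        raise ValueError(f"unknown GLU activation: {glu_activation!r}")
    return 2 * ffn_hidden_size


def apply_glu(projected, glu_activation):
    """Split `projected` in half and gate the first half with the second.

    Which half is the gate is an assumption of this sketch.
    """
    if glu_activation not in GLU_VARIANTS:
        raise ValueError(f"unknown GLU activation: {glu_activation!r}")
    if len(projected) % 2 != 0:
        raise ValueError("GLU input width must be even")
    half = len(projected) // 2
    value, gate = projected[:half], projected[half:]
    act = GLU_VARIANTS[glu_activation]
    return [v * act(g) for v, g in zip(value, gate)]
```

With ffn_hidden_size set to 8 and a GLU variant selected, first_projection_width returns 16, and apply_glu on a 16-wide vector returns 8 values, so the second projection sees the same width as in the non-GLU case.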