Megatron-DeepSpeed
b5a029d9 - Expose GLU activations as arguments (#69)

Expose GLU activations as arguments (#69)

* feat: expose glu activations as argument
* chore: rename activations -> glu_activations
* refactor: use lookup dict instead of `getattr()`
* refactor: mv lookup dict to `glu_activations.py`
* chore: rm unnecessary default arg
* test: add bf16 test; gelu in `test_training_all()`
* Update megatron/testing_utils.py

  Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

* refactor: use `require_torch_bf16` decorator
* chore: comment out bf16 test; uncomment in the future when torch supports gelu kernels for bf16
* consistent style
* fix look up table
* better grouping
* fix: replace hard coded options with `GLU_ACTIVATIONS`

Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
Co-authored-by: Stas Bekman <stas@stason.org>
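The commit replaces `getattr()`-based activation resolution with a `GLU_ACTIVATIONS` lookup dict housed in `glu_activations.py`. A minimal sketch of what such a module might look like, assuming the standard GLU variants from the literature (geglu, liglu, reglu, swiglu); the exact function set and signatures in the real `megatron/model/glu_activations.py` may differ:

```python
import torch
import torch.nn.functional as F


def liglu(x):
    # Bilinear variant: split the last dim in half, multiply the halves.
    a, b = x.chunk(2, dim=-1)
    return a * b


def geglu(x):
    # Gate one half with GELU of the other half.
    a, b = x.chunk(2, dim=-1)
    return a * F.gelu(b)


def reglu(x):
    # Gate with ReLU.
    a, b = x.chunk(2, dim=-1)
    return a * F.relu(b)


def swiglu(x):
    # Gate with SiLU (a.k.a. Swish).
    a, b = x.chunk(2, dim=-1)
    return a * F.silu(b)


# Lookup dict keyed by the command-line argument value; callers can
# validate `--glu-activation` against GLU_ACTIVATIONS.keys() instead of
# reaching into a module with getattr().
GLU_ACTIVATIONS = {
    "geglu": geglu,
    "liglu": liglu,
    "reglu": reglu,
    "swiglu": swiglu,
}
```

Because each variant halves the last dimension, an MLP using a GLU activation typically doubles its intermediate projection size so the gated output matches the original hidden size.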