DeepSpeed
6f77da1b - Add `scale_attn_by_inverse_layer_idx` feature (#2486)

Commit
3 years ago
Add `scale_attn_by_inverse_layer_idx` feature (#2486)

* Add scale_attn_by_inverse_layer_idx feature
* Fix layer_id bug
* Fix scaling value

Co-authored-by: Connor Holmes <connorholmes@microsoft.com>
Co-authored-by: Reza Yazdani <44502768+RezaYazdaniAminabadi@users.noreply.github.com>
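The commit message names the feature but does not describe its behavior. As a rough illustration only, the sketch below assumes the flag has the same meaning as the Hugging Face GPT-2 configuration option of the same name, where attention scores are additionally divided by `layer_idx + 1`. The helper name `attention_scores` and its arguments are hypothetical and are not taken from DeepSpeed's code.

```python
import math


def attention_scores(q, keys, layer_idx, scale_attn_by_inverse_layer_idx=False):
    """Pre-softmax attention scores for one query vector against several keys.

    Hypothetical helper for illustration; not DeepSpeed's implementation.
    layer_idx is assumed to be zero-based.
    """
    head_dim = len(q)
    # Standard scaled dot-product factor: 1 / sqrt(head_dim).
    scale = 1.0 / math.sqrt(head_dim)
    if scale_attn_by_inverse_layer_idx:
        # Assumed semantics: divide by (layer_idx + 1), so the first layer
        # is unchanged and deeper layers get progressively smaller scores.
        scale /= float(layer_idx + 1)
    return [scale * sum(qi * ki for qi, ki in zip(q, key)) for key in keys]


q = [1.0, 0.0, 0.0, 0.0]
keys = [[2.0, 0.0, 0.0, 0.0], [0.0, 2.0, 0.0, 0.0]]

baseline = attention_scores(q, keys, layer_idx=1)
scaled = attention_scores(q, keys, layer_idx=1, scale_attn_by_inverse_layer_idx=True)
print(baseline, scaled)
```

Under this assumption the extra factor is applied before the softmax, so it flattens the attention distribution in deeper layers rather than rescaling the layer output.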