transformers
cefb819f - Mamba `slow_forward` gradient fix (#29563)

Commit
1 year ago
Mamba `slow_forward` gradient fix (#29563)

* FIX: Cached slow forward in mamba

- additionally added mamba cached test
- added unused test (mamba causal lm forward and backward)
- fixed typo: "causl" --> "causal"

* formatting

* fix: use real `slow_forward` call instead of torch module's

* add shape assertion for mixer block test

* adjust shape assertion