transformers
860b898d
- fix: astronomical loss with ModernBERT when using gradient checkpointing (#38982) (#38983)
Commit
187 days ago
fix: astronomical loss with ModernBERT when using gradient checkpointing (#38982) (#38983)

* fix: astronomical loss with ModernBERT when using gradient checkpointing
* update the modling fix

---------

Co-authored-by: Arthur <arthur.zucker@gmail.com>
References
#38983 - fix: astronomical loss with ModernBERT when using gradient checkpointing (#38982)
Author
umarbutler
Parents
a2eb75c8