fix: astronomical loss with ModernBERT when using gradient checkpointing (#38982) #38983
fix: astronomical loss with ModernBERT when using gradient checkpointing
50e8ff13
SunMarc approved these changes on 2025-06-23
update the modeling fix
0a583d55
Merge branch 'main' of github.com:huggingface/transformers into patch-3
6a62df7c