transformers
06378d40 - fix: Initialize ApertusMLP's xielu activation using `torch_dtype` (#42864)

* Fix Apertus model crash on float16 hardware: initialize the XIELU activation with the correct dtype from the config (using `config.dtype` instead of the default bfloat16) to prevent promotion to float32 and subsequent crashes on Turing/float16 GPUs.
* refactor: Move the `ACT2CLS` import to top level in the Apertus models.
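
The gist of the fix, as a minimal sketch rather than the actual transformers/Apertus code: build the activation module in the dtype the config asks for, instead of letting it fall back to its default (bfloat16) and drag float16 inputs up to float32. The class names `SketchMLP` and `DummyXIELU` below are illustrative stand-ins, not the real model classes.

```python
import torch
import torch.nn as nn


class DummyXIELU(nn.Module):
    """Stand-in for the xIELU activation (hypothetical): its learnable
    parameter must live in the model's dtype, otherwise mixing it with
    float16 hidden states promotes the computation to float32."""

    def __init__(self, dtype: torch.dtype = torch.bfloat16):
        super().__init__()
        # Default dtype mirrors the pre-fix behaviour (bfloat16).
        self.alpha = nn.Parameter(torch.ones(1, dtype=dtype))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return torch.nn.functional.elu(x) * self.alpha


class SketchMLP(nn.Module):
    """Illustrative MLP that threads the configured dtype through every
    submodule, including the activation."""

    def __init__(self, hidden_size: int, intermediate_size: int, dtype: torch.dtype):
        super().__init__()
        self.up_proj = nn.Linear(hidden_size, intermediate_size, dtype=dtype)
        self.down_proj = nn.Linear(intermediate_size, hidden_size, dtype=dtype)
        # The point of the fix: pass the configured dtype to the activation
        # instead of relying on its bfloat16 default.
        self.act_fn = DummyXIELU(dtype=dtype)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.down_proj(self.act_fn(self.up_proj(x)))


# On float16-only hardware (e.g. Turing GPUs) the whole module, activation
# parameters included, stays in float16 instead of being promoted.
mlp = SketchMLP(hidden_size=64, intermediate_size=256, dtype=torch.float16)
assert mlp.act_fn.alpha.dtype == torch.float16
```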