transformers
8b3b9b48 - exclude fsdp from delay_optimizer_creation (#34140)

Commit
189 days ago
exclude fsdp from delay_optimizer_creation (#34140)

* exclude fsdp from delay_optimizer_creation
* add a test case for the Trainer: FSDP mode with fp8 as mixed precision
* rearrange imports
* ruff formatting
* adapt _init_fsdp to fp8
* use _init_fsdp only when resume_from_checkpoint is set
* in case of FSDP, self.layer will be a CheckpointWrapper, which has no len() method
* delete _init_fsdp
* resolve conflict
* fix conflict
* make fixup
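The core of the change is the condition that decides whether the Trainer defers optimizer creation until after the model is wrapped. The sketch below is illustrative, not the actual Trainer source: the function name, flag names, and the `after_fix` switch are assumptions used to contrast the old behavior (any FSDP run delayed optimizer creation, which broke fp8 mixed-precision setups) with the new one (plain, non-XLA FSDP creates the optimizer up front).

```python
def delay_optimizer_creation(sagemaker_mp_enabled: bool,
                             fsdp_xla_enabled: bool,
                             fsdp_enabled: bool,
                             after_fix: bool = True) -> bool:
    """Hypothetical sketch: return True if optimizer creation should be
    deferred until after model wrapping. Names are illustrative, not the
    real transformers API."""
    if after_fix:
        # After the fix: non-XLA FSDP no longer delays optimizer creation.
        return sagemaker_mp_enabled or fsdp_xla_enabled
    # Before the fix: any FSDP run delayed optimizer creation.
    return sagemaker_mp_enabled or fsdp_xla_enabled or fsdp_enabled

# A plain FSDP run (no SageMaker MP, no XLA):
delay_optimizer_creation(False, False, True, after_fix=False)  # → True
delay_optimizer_creation(False, False, True, after_fix=True)   # → False
```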
Files changed:
  • src/transformers/testing_utils.py
  • src/transformers/trainer.py
  • tests/trainer/test_trainer_fsdp.py
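The commit message notes that under FSDP `self.layer` becomes a CheckpointWrapper with no `len()` method. The reason is a Python quirk worth knowing: special methods like `__len__` are looked up on the *type*, bypassing instance-level `__getattr__` forwarding. The `CheckpointWrapper` below is a simplified stand-in for the real activation-checkpointing wrapper, written only to demonstrate that quirk.

```python
class Layers:
    """Toy container with a working __len__."""
    def __init__(self, n):
        self._n = n
    def __len__(self):
        return self._n

class CheckpointWrapper:
    """Simplified stand-in for an activation-checkpoint wrapper.
    It forwards ordinary attribute access to the wrapped module, but
    dunder lookups like len() go through the type and bypass
    __getattr__, so len(wrapper) raises TypeError."""
    def __init__(self, module):
        self._module = module
    def __getattr__(self, name):
        # Called only for attributes not found on the wrapper itself.
        return getattr(self._module, name)

layers = CheckpointWrapper(Layers(3))

try:
    len(layers)  # fails: CheckpointWrapper defines no __len__
    wrapped_has_len = True
except TypeError:
    wrapped_has_len = False

# Workaround: unwrap (or track the layer count separately).
n = len(layers._module)  # → 3
```

This is why code that previously called `len(self.layer)` had to be reworked once FSDP wrapped the layers.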