transformers
c21e1071 - [deepspeed / m2m_100] make deepspeed zero-3 work with layerdrop (#16717)

  • src/transformers/models/m2m_100/modeling_m2m_100.py