transformers
9cf4f2aa - Enhancing SFT Training Efficiency Using Packing and FlashAttention2 with Position IDs (#31629)

* add DataCollatorBatchFlattening
* Update data_collator.py
* change name
* new FA2 flow if position_ids is provided
* add comments
* minor fix
* minor fix data collator
* add test cases for models
* add test case for data collator
* remove extra code
* formatting for ruff check and check_repo.py
* ruff format tests src utils
* custom_init_isort.py
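The squashed messages above describe a data collator that packs several tokenized examples into one flattened sequence and emits position_ids that restart at each example boundary; the new FlashAttention2 flow consumes those position_ids instead of a padded attention mask. Below is a minimal sketch of how that collator behaves. It assumes the post-rename class is `DataCollatorWithFlattening` (the commit only says "change name"; this is the name exported by recent transformers releases) and uses toy token IDs.

```python
# Sketch: packing two variable-length examples with the flattening collator.
# Token IDs are arbitrary toy values; exact printed tensors are illustrative.
from transformers import DataCollatorWithFlattening

# Two pre-tokenized examples of different lengths; no padding is required.
features = [
    {"input_ids": [101, 7592, 2088, 102]},
    {"input_ids": [101, 2224, 3793, 2005, 17463, 102]},
]

collator = DataCollatorWithFlattening()
batch = collator(features)

# All examples are concatenated into a single row, and position_ids restart
# at 0 for each example, so FlashAttention2 can keep attention from crossing
# packed-example boundaries without building a block-diagonal attention mask.
print(batch["input_ids"])    # one flattened row of 10 tokens
print(batch["position_ids"]) # e.g. tensor([[0, 1, 2, 3, 0, 1, 2, 3, 4, 5]])
print(batch["labels"])       # first token of each example masked with -100
```

To use the new FA2 path during SFT, the model would be loaded with `attn_implementation="flash_attention_2"` so that the generated position_ids, rather than an attention mask, delimit the packed examples during training.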