transformers
Enhancing SFT Training Efficiency Using Packing and FlashAttention2 with Position IDs
#31629
Merged
