transformers
Enhancing SFT Training Efficiency Using Packing and FlashAttention2 with Position IDs
#31629
Merged


ArthurZucker commented on 2024-07-10
RhuiDih force-pushed from cf6271fa to c3451dbc
ArthurZucker commented on 2024-07-15
ArthurZucker approved these changes on 2024-07-19
fxmarty commented on 2024-07-19
Commits:
- df3c9b29 add DataCollatorBatchFlattening
- dfe08de7 Update data_collator.py
- 8120b3a3 change name
- 0598510e new FA2 flow if position_ids is provided
- 1ff23436 add comments
- f97ab716 minor fix
- 08e1f2cb minor fix data collator
- 48ce3d20 add test cases for models
- 6c6b1688 add test case for data collator
- 00e7abf6 remove extra code

RhuiDih force-pushed to 00e7abf6

- 3a872933 formating for ruff check and check_repo.py
- c60b720d ruff format
- 90305596 custom_init_isort.py
ArthurZucker approved these changes on 2024-07-23
ArthurZucker merged 9cf4f2aa into main