Give example on how to handle gradient accumulation with cross-entropy #3193
Add cross-entropy example in the gradient accumulation docs
2eb684e9
add example of logs
4d4ed806
correct skeleton code
3b8c8872
muellerzr
approved these changes
on 2024-10-24
replace gather_for_metrics with gather
c01827c1
batch_size -> per_device_batch_size
22cbf9c5
remove main_process_only=True
395c572d
add autoregressive example in examples/
2e80bf03
Update docs/source/usage_guides/gradient_accumulation.md
5e3e8118
ruff format
c56c7802
add grad accum test
80c720a9
update docs
e5d2c50b
Update examples/by_feature/gradient_accumulation_for_autoregressive_m…
0e1bb896
update tests
cc8bcc88
muellerzr
approved these changes
on 2024-12-11
SunMarc
approved these changes
on 2024-12-24
SunMarc
merged
acfbf72a
into main 1 year ago