Megatron-DeepSpeed
Reweighting strat for prefix lm
#190
Merged

thomasw21
thomasw21 First test to unbias the loss for prefix lm
c8d3243f
thomasw21 Woops
226cf712
thomasw21 requested a review from ibeltagy 4 years ago
thomasw21 requested a review from TevenLeScao 4 years ago
thomasw21 Add same code for not deepspeed mode
a58c0413
thomasw21 Improve testing
3ce2154b
thomasw21 Woops
74fabfb7
thomasw21 Test moving it inside?
53e64030
thomasw21 This changes the normalization factor in loss computation
2981380b
thomasw21 Fix
8a83121f
thomasw21 Woops
de8c56d4
thomasw21 Better refactoring of loss normalization
48953683
thomasw21 merged b3cf1755 into main 4 years ago
thomasw21 deleted the thomas/reweight_tokens_depending_on_their_position branch 3 years ago
