Megatron-DeepSpeed
Add option to normalize loss per target
#326
Merged


Muennighoff merged 11 commits into t0loading from lossseq
Muennighoff Tmp lossseq
462efd99
Muennighoff Efficient loss normalization
992446c8
Muennighoff Reuse variable
616cfe86
Muennighoff Simplify division
900c8356
thomasw21 commented on 2022-08-10
Muennighoff Add norm_target_loss arg
7bc1dd20
Muennighoff changed the title from "TMP: Lossseq" to "Add option to normalize loss per target" 3 years ago
Muennighoff requested a review from thomasw21 3 years ago
thomasw21 commented on 2022-08-16
Muennighoff Clarify loss on targets & remove kwarg
fce1a98e
Muennighoff Loss mask is already float
2e7554d7
Muennighoff Move norm to batch pipe
a6b26240
Muennighoff requested a review from thomasw21 3 years ago
thomasw21 commented on 2022-08-17
thomasw21 commented on 2022-08-17
Muennighoff Reshape loss mask
549f4993
Muennighoff Move view
d9a91feb
Muennighoff Merge branch 't0loading' into lossseq
456327c1
Muennighoff merged 1e77844c into t0loading 3 years ago
Muennighoff deleted the lossseq branch 3 years ago
