Megatron-DeepSpeed
Add multiple evaluation compat
#336
Open

Commits
  • Tmp lossseq
    Muennighoff committed 3 years ago
  • Efficient loss normalization
    Muennighoff committed 3 years ago
  • Reuse variable
    Muennighoff committed 3 years ago
  • Simplify division
    Muennighoff committed 3 years ago
  • Add norm_target_loss arg
    Muennighoff committed 3 years ago
  • Clarify loss on targets & remove kwarg
    Muennighoff committed 3 years ago
  • Loss mask is already float
    Muennighoff committed 3 years ago
  • Move norm to batch pipe
    Muennighoff committed 3 years ago
  • Reshape loss mask
    Muennighoff committed 3 years ago
  • Move view
    Muennighoff committed 3 years ago
  • Merge branch 't0loading' into lossseq
    Muennighoff committed 3 years ago
  • Add multiple evaluation compat
    Muennighoff committed 3 years ago
  • Set iteration to args by default
    Muennighoff committed 3 years ago