Commits
  • Better
    thomasw21 committed 4 years ago
  • Force synchronize the layer norms parameters across all TP
    thomasw21 committed 4 years ago
  • import mpu
    stas00 committed 4 years ago
  • use the bf16 branch for testing
    stas00 committed 4 years ago
  • `torch.testing.assert_equal` didn't make it (#273)
    stas00 committed 4 years ago
  • Merge remote-tracking branch 'origin/main' into thomas/fix_layer_norm
    stas00 committed 4 years ago
  • bf16 comms require pt-1.11
    stas00 committed 4 years ago
  • already part of the function
    stas00 committed 4 years ago
  • reproduce the crashing on resume
    stas00 committed 4 years ago
  • run just the test we want for now
    stas00 committed 4 years ago
  • all_reduce is an in_place operation
    thomasw21 committed 4 years ago
  • Make a test that TP reshaping works
    thomasw21 committed 4 years ago
  • Woops
    thomasw21 committed 4 years ago
  • Woops
    thomasw21 committed 4 years ago
  • Woops
    thomasw21 committed 4 years ago
  • Woops
    thomasw21 committed 4 years ago
  • Woops
    thomasw21 committed 4 years ago
  • Woops
    thomasw21 committed 4 years ago
  • Woops
    thomasw21 committed 4 years ago
  • Woops
    thomasw21 committed 4 years ago
  • Woops
    thomasw21 committed 4 years ago
  • Fix load issue
    thomasw21 committed 4 years ago
  • Woops
    thomasw21 committed 4 years ago
  • Fix checkpoint path
    thomasw21 committed 4 years ago
  • Test that force sync will allow TP changes
    thomasw21 committed 4 years ago
  • Nit
    thomasw21 committed 4 years ago
  • Now that we have a force sync mechanism, let's try to reproduce
    thomasw21 committed 4 years ago
  • Compare model_states_rank
    thomasw21 committed 4 years ago
  • test
    thomasw21 committed 4 years ago
  • Row column bias should be synchronized as well
    thomasw21 committed 4 years ago
  • New list of matching embeddings
    thomasw21 committed 4 years ago
  • Figure out why state differs
    thomasw21 committed 4 years ago
  • Test for final weight
    thomasw21 committed 4 years ago
  • Test that torch_rng_state
    thomasw21 committed 4 years ago
  • Fix non matching torch_rng_state for tp_rank=0
    thomasw21 committed 4 years ago
  • Update test
    thomasw21 committed 4 years ago
  • I'm surprised one can apply inplace operation here
    thomasw21 committed 4 years ago
  • Test out the loss from the fp32 weights and optimizer states
    thomasw21 committed 4 years ago