Commit
3 years ago
sync layer norms (#272)

* sync layer norms
* all_reduce is an in_place operation
* Make dataloader use another random generator (#276)
* do all_reduce op.AVG directly
* add eval dataloader deadlock workaround
* revert generator sync
* make auto-sync configurable; basic test; cleanup
* test with updated AMI image
* fix unrelated test

Co-authored-by: thomasw21 <24695242+thomasw21@users.noreply.github.com>