Add a single GPU variant of modded-nanogpt to torchbench (#2660)
Summary:
X-link: https://github.com/pytorch/pytorch/pull/169505
X-link: https://github.com/pytorch/pytorch/pull/169502
## Tests
Standalone: `python -m torchbenchmark.models.modded_nanogpt.main`
Through dynamo benchmarks: `python benchmarks/dynamo/torchbench.py --performance --training --amp --backend inductor --device cuda --only modded_nanogpt --disable-cudagraphs`
This PR adds a tweaked version of the Aug 23rd record for the nanogpt speedrun (GPT-2 small variant): https://github.com/KellerJordan/modded-nanogpt/blob/9d9dc969c451c87b7ad3c84f807db2c2d9109f41/train_gpt.py.
The later records cannot be run without building FA3 from source, so we omit them until the dynamo FA3 PR is merged.
The tweaks library-ify the script: everything other than the model class definitions is commented out, the process group initialization is changed to use a fake process group, and some hyperparameters are turned into constants.
The tests run locally, but this model specifically requires H100. I wasn't sure how to filter for that, so I skipped all the tests. This will be tested on the dynamo benchmark side: https://github.com/pytorch/pytorch/pull/169449.
Pull Request resolved: https://github.com/pytorch/benchmark/pull/2660
Reviewed By: BoyuanFeng
Differential Revision: D88233265
Pulled By: xmfan
fbshipit-source-id: 6894823c4593e68d048f59fd05a091d67bf03756