Add FBGEMM submodule (#2293)
Summary:
This PR does the following:
- Upgrade the default CUDA version to 12.4.
- Pre-install the fbgemm_gpu GenAI kernels in the nightly Docker image.
Pull Request resolved: https://github.com/pytorch/benchmark/pull/2293
Test Plan:
Build base image: https://github.com/pytorch/benchmark/actions/runs/9476276319
Build nightly Docker image: https://github.com/pytorch/benchmark/actions/runs/9486161032
Reviewed By: aaronenyeshi
Differential Revision: D58471717
Pulled By: xuzhao9
fbshipit-source-id: 9d2e0b45b7cba4af1cb7578daec001605ee03985