Add all operators to H100 RE unit tests
Summary:
Test all operators in the unit test by default.
Also fixes two test errors:
* fp8_gemm (check whether the TMA kernels exist and disable them if they do not)
* int4_gemm (update to the new interface defined by D59661613)
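A minimal sketch of the check-and-disable pattern used for fp8_gemm, assuming each optional kernel is provided by a module that may be missing from the build. The function name, kernel labels, and module names here are hypothetical and are not the benchmark's actual API:

```python
import importlib


def available_kernels(candidates):
    """Return the kernel labels whose backing module can be imported.

    `candidates` maps a kernel label to the name of the module that
    provides it (both hypothetical, for illustration only).
    """
    enabled = []
    for label, module_name in candidates.items():
        try:
            importlib.import_module(module_name)
        except ImportError:
            # The kernel is not present in this build, so leave it out
            # of the test run instead of failing the whole operator.
            continue
        enabled.append(label)
    return enabled
```

With this shape, the unit test iterates only over the returned labels, so a missing TMA kernel is skipped instead of raising an import error.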
Operators that are still not tested:
* jagged_mean (CUDA OOM)
* jagged_layernorm (CUDA OOM)
Differential Revision: D59876967
fbshipit-source-id: 4d75353bfa7c9cb4bfa106adaf16f3b75183b226