onnxruntime
d3463337 - [CUDA] Add build flag onnxruntime_USE_FPA_INTB_GEMM (#25802)

Commit
208 days ago
### Description

Add a build flag to enable or disable the mixed GEMM CUTLASS kernel. To disable the kernel, append the following to the end of the build command line:

`--cmake_extra_defines onnxruntime_USE_FPA_INTB_GEMM=OFF`

### Motivation and Context

The FpA IntB GEMM kernels take a long time to compile. With this option, developers can speed up the build, especially on build machines with limited memory.
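As a rough illustration of where the flag goes, the sketch below assembles a full build command and prints it as a dry run. Only `--cmake_extra_defines onnxruntime_USE_FPA_INTB_GEMM=OFF` comes from this commit; the `FPA_INTB_GEMM` shell variable is a hypothetical helper, and the other `build.sh` options (`--config Release`, `--use_cuda`) are assumed typical settings that may differ from your setup.

```shell
#!/bin/sh
# Hypothetical helper variable: set to ON to keep the FpA IntB GEMM kernels,
# or leave at OFF to skip compiling them.
FPA_INTB_GEMM="${FPA_INTB_GEMM:-OFF}"

# Assumed baseline options for a CUDA build; adjust to match your environment.
BUILD_CMD="./build.sh --config Release --use_cuda --cmake_extra_defines onnxruntime_USE_FPA_INTB_GEMM=${FPA_INTB_GEMM}"

# Dry run: print the command so it can be inspected before launching a long build.
echo "$BUILD_CMD"

# To run it for real from the repository root, uncomment the next line:
# $BUILD_CMD
```

Keeping the value in one variable makes it easy to switch between a fast local build and a full build without editing the rest of the command.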