onnxruntime
11b23ad2 - [CUDA] replace 90a-virtual by 90-virtual for forward compatible (#26230)

Commit

267 days ago

[CUDA] replace 90a-virtual by 90-virtual for forward compatible (#26230)

Users with RTX 5090 GPUs are experiencing runtime errors when using onnxruntime-gpu:

```
[ONNXRuntimeError] : 1 : FAIL : Non-zero status code returned while running Slice node. Name:'Slice_34' Status Message: CUDA error cudaErrorNoKernelImageForDevice: no kernel image is available for execution on the device
```

This occurs because the RTX 5090 has CUDA compute capability 12.0 (SM 12.0), while `onnxruntime-gpu` 1.23 was built with `90a-virtual`. The `90a` architecture is a specialized, non-forward-compatible version of the Hopper architecture, so code built for it cannot run on later GPU generations such as Blackwell.

This change reverts `90a-virtual` back to `90-virtual`, as used in 1.22, which restores compatibility with Blackwell GPUs.

FPA_INTB_GEMM is disabled by default. It needs some extra work to be compatible with a build that uses `90-virtual` and has no `90a-real`.

Related:
https://github.com/microsoft/onnxruntime/pull/26002
https://github.com/microsoft/onnxruntime/pull/26226
https://github.com/microsoft/onnxruntime/issues/26181
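The compatibility rule behind this change can be sketched in a few lines of Python. This is a simplified model, not code from onnxruntime or CUDA: the helper names `parse_arch` and `runs_on` are hypothetical, the entry syntax follows CMake's `CMAKE_CUDA_ARCHITECTURES` convention, and the rules assumed are that plain virtual (PTX) targets can be JIT-compiled for any equal or newer compute capability, real (SASS) targets run only within the same major version, and architecture-specific `a` targets run only on the exact capability they name.

```python
import re


def parse_arch(entry):
    """Split an entry such as '90a-virtual' into (capability, arch_specific, kind).

    Hypothetical helper for illustration; 'kind' is 'real', 'virtual',
    or 'both' when the entry has no suffix.
    """
    m = re.fullmatch(r"(\d+)(a?)(?:-(real|virtual))?", entry)
    if m is None:
        raise ValueError(f"unrecognized architecture entry: {entry!r}")
    number, suffix, kind = m.groups()
    capability = (int(number) // 10, int(number) % 10)  # '120' -> (12, 0)
    return capability, suffix == "a", kind or "both"


def runs_on(entry, device_cc):
    """Return True if code built for 'entry' can run on a device with
    compute capability 'device_cc' under the simplified rules above."""
    built_cc, arch_specific, kind = parse_arch(entry)
    if arch_specific:
        # '90a' style targets are tied to exactly one compute capability.
        return device_cc == built_cc
    # PTX can be JIT-compiled for any equal or newer compute capability.
    ptx_ok = kind in ("virtual", "both") and device_cc >= built_cc
    # SASS is only compatible within the same major version.
    sass_ok = (
        kind in ("real", "both")
        and device_cc[0] == built_cc[0]
        and device_cc[1] >= built_cc[1]
    )
    return ptx_ok or sass_ok


rtx_5090 = (12, 0)  # SM 12.0, as stated in the commit message
print(runs_on("90a-virtual", rtx_5090))
print(runs_on("90-virtual", rtx_5090))
```

Under this model, `90a-virtual` is rejected for an SM 12.0 device while `90-virtual` is accepted, which matches the failure and the fix described above.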

References

#26230 - [CUDA] replace 90a-virtual by 90-virtual for forward compatible

Author

tianleiwu

Parents

ffe16931
