onnxruntime
8fceca0d - [Build] update cuda 13 package: fatbin compress mode and cuda archs (#26516)

Commit

169 days ago

[Build] update cuda 13 package: fatbin compress mode and cuda archs (#26516)

### Changes

Update the CUDA 13 Python packaging pipeline:

(1) Use fatbin compress mode = size, which can significantly reduce package size.
(2) Update CMAKE_CUDA_ARCHITECTURES for CUDA 13. Because the package is now smaller, more architectures can be added.
(3) Fix the CUDA 13 packaging pipeline:
  - Use the correct manylinux docker image (cuda13 instead of cuda12). The new Linux docker image has CUDA 13.0.2 and cuDNN 9.14.
  - Pass the CUDA version properly when running build_linux_python_package.sh in docker. (CUDA_VERSION in docker was 12.8.1, and now we pass "12.8" from yml to be consistent.)

Note that the compress mode and CUDA architecture settings are not changed for CUDA 12.8, so the CUDA 12 wheel is larger than the CUDA 13 wheel. We can update them in a separate PR if needed.

The nuget pipeline for CUDA 13 needs extra code changes; this PR only fixes the Python packaging pipeline.

### Python GPU Wheel Size (CUDA Architectures + PTX)

CUDA | Windows | Linux
----|---|---
12.8 | 221 MB (52;61;75;86;89+90) | 271 MB (60;70;75;80;86;90a+90)
13.0 | 186 MB (75;80;86;89;90a;100a;120a+120) | 191 MB (75;80;86;89;90a;100a;120a+120)
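
To make the table's `A;B;C+P` notation concrete, here is a minimal Python sketch that expands such a string into nvcc code-generation flags. It assumes the listed architectures get native code (SASS) and the value after `+` gets PTX only, as the table heading suggests. The helper name `gencode_flags` is hypothetical, and the actual flags produced by the onnxruntime build scripts are not shown in this commit message and may differ.

```python
def gencode_flags(spec: str) -> list[str]:
    """Expand 'A;B;C+P' into nvcc flags: SASS for A, B, C and PTX for P.

    Hypothetical helper for illustration; not part of the onnxruntime build.
    """
    sass_part, _, ptx_part = spec.partition("+")
    # One native-code (SASS) target per listed architecture.
    flags = [
        f"--generate-code=arch=compute_{arch},code=sm_{arch}"
        for arch in sass_part.split(";")
        if arch
    ]
    # The architecture after '+' is embedded as PTX for forward compatibility.
    if ptx_part:
        flags.append(
            f"--generate-code=arch=compute_{ptx_part},code=compute_{ptx_part}"
        )
    return flags


# Architecture list from the CUDA 13.0 row of the table above.
cuda13_archs = "75;80;86;89;90a;100a;120a+120"

# Assumption: "fatbin compress mode = size" corresponds to nvcc's
# --compress-mode=size option.
flags = gencode_flags(cuda13_archs) + ["--compress-mode=size"]
print("\n".join(flags))
```

Each extra SASS target adds another copy of every kernel to the fat binary, which is why a stronger compression mode leaves room for more architectures in the wheel.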

References

#26516 - [Build] update cuda 13 package: fatbin compress mode and cuda archs

Author

tianleiwu

Parents

a4e44a47
