onnxruntime
6625856b - Add support for CUDA architecture family codes (#27278)

Commit
32 days ago
This change extends CUDA architecture handling to support family-specific codes (suffix 'f') introduced in CUDA 12.9, aligning with updates made to Triton Inference Server repositories (backend and onnxruntime_backend).

Changes:

1. Added CUDAARCHS environment variable support (standard CMake variable)
   - Allows users to override architecture list via environment variable
   - Takes precedence when CMAKE_CUDA_ARCHITECTURES is not set

2. Extended regex patterns to recognize family code suffix 'f'
   - Supports codes like 100f, 110f, 120f for CC 10.x, 11.x, 12.x families
   - Preserves 'f' suffix during parsing phase

3. Updated normalization logic to handle family codes
   - Family codes (ending with 'f') preserved without adding -real suffix
   - Traditional codes continue to receive -real or -a-real suffixes
   - Architecture-specific codes (with 'a') remain unchanged

4. Extended architecture support lists
   - Added SM 110 to ARCHITECTURES_WITH_KERNELS
   - Added SM 110 to ARCHITECTURES_WITH_ACCEL

Family-specific codes (introduced in CUDA 12.9/Blackwell) enable forward compatibility within a GPU family. For example, 100f runs on CC 10.0, 10.3, and future 10.x devices, using features common across the family.

Usage examples:
- CUDAARCHS="75;80;90;100f;110f;120f" cmake ..
- cmake -DCMAKE_CUDA_ARCHITECTURES="75-real;80-real;90-real;100f;120f" ..
- python build.py --cmake_extra_defines CMAKE_CUDA_ARCHITECTURES="100f;110f"

The implementation supports mixed formats in the same list:
- Traditional: 75-real, 80-real, 90-real
- Architecture-specific: 90a-real (CC 9.0 only)
- Family-specific: 100f, 110f, 120f (entire family)

Note: Current defaults still use bare numbers (75;80;90;100;120) which normalize to architecture-specific codes with 'a' suffix. Users who want family-specific behavior should explicitly use the 'f' suffix via CUDAARCHS environment variable or CMAKE_CUDA_ARCHITECTURES.

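
The selection and normalization rules described in this message can be sketched in Python roughly as follows. This is an illustration only, not the CMake code from this commit: the names `normalize_arch` and `resolve_arch_list`, the regular expression, and the threshold at which a bare number receives the `a-real` suffix (assumed here to be 90 and above) are all assumptions made for the example.

```python
import re

# Hypothetical pattern: a numeric code, an optional 'a' (architecture-specific)
# or 'f' (family-specific) letter, and an optional -real/-virtual suffix.
_ARCH_RE = re.compile(r"^(\d+)([af]?)(-real|-virtual)?$")

# Assumption for this sketch: bare numbers at or above this value are treated
# as having accelerated ('a') variants; lower ones only get '-real'.
ACCEL_MIN = 90


def normalize_arch(entry: str) -> str:
    """Normalize one architecture entry following the rules in the message."""
    entry = entry.strip()
    match = _ARCH_RE.match(entry)
    if match is None:
        raise ValueError(f"unrecognized CUDA architecture entry: {entry!r}")
    number, letter, suffix = match.groups()
    if letter == "f":
        # Family-specific code: preserved, no -real suffix is added.
        return entry
    if letter == "a" or suffix:
        # Already explicit (e.g. 90a-real, 75-real): left unchanged.
        return entry
    # Bare number: receives -real or a-real depending on the assumed threshold.
    if int(number) >= ACCEL_MIN:
        return f"{number}a-real"
    return f"{number}-real"


def resolve_arch_list(cmake_value, env, defaults="75;80;90;100;120"):
    """Pick the source list, then normalize every entry.

    An explicit CMAKE_CUDA_ARCHITECTURES value wins; otherwise CUDAARCHS
    from the environment is used; otherwise the defaults apply.
    """
    raw = cmake_value or env.get("CUDAARCHS") or defaults
    return [normalize_arch(e) for e in raw.split(";") if e.strip()]
```

For instance, under these assumptions `resolve_arch_list(None, {"CUDAARCHS": "75;90;100f"})` keeps `100f` as written while the bare `75` and `90` entries pick up their suffixes, and an explicit `CMAKE_CUDA_ARCHITECTURES` value causes `CUDAARCHS` to be ignored.
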
References:
- NVIDIA Blackwell and CUDA 12.9 Family-Specific Architecture Features: https://developer.nvidia.com/blog/nvidia-blackwell-and-nvidia-cuda-12-9-introduce-family-specific-architecture-features/
- Triton Inference Server backend updates (commit f5e901f)