onnxruntime
8a08225c - [Build] Fix clang build issues for CPU and CUDA builds (#27669)

Commit

33 days ago

[Build] Fix clang build issues for CPU and CUDA builds (#27669)

## Description

This PR fixes clang-specific build failures that show up in both the standalone clang build and the CUDA clang build. It keeps the build-system changes targeted, prefers source fixes where the warnings indicate real type or declaration issues, and avoids broader warning suppression than necessary for the CUDA provider target.

## Summary of Changes

### Build System

| File | Change |
|------|--------|
| `cmake/CMakeLists.txt` | Stop forwarding `-Wshorten-64-to-32` through CUDA host compilation where the GNU host compiler does not recognize it. |
| `cmake/onnxruntime_providers_cuda.cmake` | Add targeted clang `-Wno-error` handling for warning classes that are currently triggered by CUDA provider code and third-party CUDA headers under clang. |

### CPU / Common clang fixes

| File | Change |
|------|--------|
| `onnxruntime/core/common/cpuid_info.cc` | Replace the clang-incompatible `__builtin_cpu_supports("waitpkg")` path with the CPUID-bit check for TPAUSE detection. |
| `onnxruntime/test/framework/allocation_planner_test.cc` | Refactor `typeid` assertions to avoid clang's potentially-evaluated-expression warning while keeping test coverage unchanged. |

### CUDA provider and contrib fixes

| File | Change |
|------|--------|
| `onnxruntime/contrib_ops/cuda/utils/dump_cuda_tensor.h` | Mark the `IConsoleDumper` overrides explicitly while leaving CUDA-only overloads unchanged. |
| `onnxruntime/contrib_ops/cuda/bert/group_query_attention.cc` | Use `template` on the dependent `GetAttrOrDefault` call so clang parses it correctly. |
| `onnxruntime/contrib_ops/cuda/bert/flash_attention/flash_api.cc` | Make narrowing conversions to flash-attention parameter fields explicit. |
| `onnxruntime/contrib_ops/cuda/quantization/matmul_nbits.cc` | Make the `nbits_` conversion explicit when calling the CUDA helper. |
| `onnxruntime/contrib_ops/cuda/quantization/moe_quantization.cc` | Restrict the GCC-only warning pragma so clang does not treat it as an unknown warning option. |
| `onnxruntime/contrib_ops/cuda/transformers/generation_device_helper.cc` | Fix explicit state-field assignments to use the actual `int` field type. |
| `onnxruntime/core/providers/cuda/cuda_mempool_arena.h` | Remove an unused private field that clang flagged in the CUDA provider build. |

## Testing

Tested CPU and CUDA 12.8 builds in Azure Linux with

- clang 18.1.8
- gcc 13.2
- cmake 4.2.3

Example for CPU build:

```
export CC=clang
export CXX=clang++
bash build.sh --config RelWithDebInfo --parallel --cmake_extra_defines onnxruntime_BUILD_UNIT_TESTS=ON
```

## Motivation and Context

Clang is stricter than GCC/MSVC in a few areas that affect this tree: CUDA host flag forwarding, explicit narrowing, dependent template parsing, warnings emitted from third-party CUDA headers, and RTTI/typeid expressions in tests. The goal here is to keep the staged fix minimal and maintainable by correcting real source issues where practical and confining warning downgrades to the CUDA provider target where third-party header noise is currently unavoidable.
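Two of the source-level patterns named above, dependent template parsing and `typeid` expressions, can be illustrated with a minimal standalone sketch. This is not the code from the commit: `FakeKernelInfo`, `ReadNumHeads`, `Base`, `Derived`, and `IsDerived` are hypothetical names invented for illustration, and the actual changes in `group_query_attention.cc` and `allocation_planner_test.cc` may look different.

```cpp
#include <cstdint>
#include <memory>
#include <string>
#include <typeinfo>

// Hypothetical stand-in for a kernel-info type that exposes a member template.
// It is NOT the real onnxruntime class; it always returns the default value.
struct FakeKernelInfo {
  template <typename T>
  T GetAttrOrDefault(const std::string& /*name*/, const T& default_value) const {
    return default_value;
  }
};

// `info` has a dependent type here, so the call to a member template must be
// written with the `template` keyword. Without it, `<` after the member name
// is parsed as a less-than operator.
template <typename Info>
int64_t ReadNumHeads(const Info& info) {
  return info.template GetAttrOrDefault<int64_t>("num_heads", int64_t{8});
}

struct Base {
  virtual ~Base() = default;
};
struct Derived : Base {};

// For a polymorphic glvalue, the operand of typeid is evaluated at run time.
// clang warns when that operand contains a call such as `*holder.get()`.
// Binding the result to a reference first keeps the call out of the operand.
bool IsDerived(const std::unique_ptr<Base>& holder) {
  const Base& ref = *holder.get();
  return typeid(ref) == typeid(Derived);
}
```

Under these assumptions, `ReadNumHeads(FakeKernelInfo{})` yields the default of 8, and `IsDerived` reports `true` only when the held object's dynamic type is exactly `Derived`. Both rewrites change how the expression is spelled, not what it computes, which matches the stated goal of keeping test coverage and behavior unchanged.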

References

#27669 - [Build] Fix clang build issues for CPU and CUDA builds

Author

tianleiwu

Parents

f3cc7fff
