llvm-project
fab5b185 - Reland "[NVPTX][AtomicExpandPass] Complete support for AtomicRMW in NVPTX (#176015)" (#179553)

Commit
4 days ago
This PR adds full support for atomicrmw in NVPTX. This includes:

- Memory order and syncscope support (changes in AtomicExpandPass.cpp, NVPTXIntrinsics.td)
- Script-generated tests for integer and atomic operations (atomicrmw.py, atomicrmw-sm*.ll in tests/CodeGen/NVPTX). Existing atomics tests which are subsumed by these have been removed (atomics-sm*.ll, atomics.ll, atomicrmw-expand.ll).
- ~~Changes shouldExpandAtomicRMWInIR to take a constant argument: This is to allow some other TargetLowering constant-argument functions to call it. This change touches several backends. An alternative solution exists, but to me, this seems the "right" way.~~ Has been split out into https://github.com/llvm/llvm-project/pull/176073. Rebased.
- NOTE: The initial load issued for atomicrmw emulation loops (and cmpxchg emulation loops) must be a strong load. Currently, AtomicExpandPass issues a weak load. Fixing this breaks several backends. I'm planning to follow up with a separate PR.

Initially failed due to error: ptxas fatal : Value 'sm_60' is not defined for option 'gpu-name'. Updated RUN lines in atomicrmw-sm*.py to skip the ptxas-verify check if ptxas does not support that SM version.
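For readers unfamiliar with the emulation loops mentioned in the NOTE, the general shape of an atomicrmw emulated with a compare-and-swap loop can be sketched in portable C++. This is only an illustration under stated assumptions, not the code AtomicExpandPass emits: `emulated_atomic_rmw` and `fetch_nand` are hypothetical names, and `std::atomic` is used as a stand-in for the target's atomics. A `std::atomic` load is always atomic, which roughly corresponds to the strong initial load the NOTE asks for; a plain non-atomic read would be the analogue of the weak load.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical helper (not part of LLVM): emulates an atomic
// read-modify-write using a compare-and-swap loop.
template <typename T, typename Op>
T emulated_atomic_rmw(std::atomic<T>& target, T operand, Op op,
                      std::memory_order order = std::memory_order_seq_cst) {
    // Initial load. It is an atomic (relaxed) load, which is the rough
    // analogue of a "strong" load; it seeds the first CAS attempt.
    T expected = target.load(std::memory_order_relaxed);

    // Retry until the CAS succeeds. On failure, compare_exchange_weak
    // writes the value it observed into `expected`, so the new value is
    // recomputed from fresh data on the next iteration.
    while (!target.compare_exchange_weak(expected, op(expected, operand),
                                         order, std::memory_order_relaxed)) {
    }

    // Like atomicrmw, return the value that was in memory before the update.
    return expected;
}

// Example operation: NAND, which std::atomic has no fetch_* member for.
inline int fetch_nand(std::atomic<int>& target, int operand) {
    return emulated_atomic_rmw(target, operand,
                               [](int a, int b) { return ~(a & b); });
}
```

The requested memory order is applied only to the successful exchange; the failure path uses relaxed ordering because a failed attempt publishes nothing.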