codegen: add a pass for late conversion of known modify ops to call atomicrmw (#57010)
The ExpandAtomicModify pass recognizes our pseudo-intrinsic
julia.atomicmodify and converts it into one of the known atomicrmw
operations, or simplifies it with more inlining, as applicable. Ideally
we could get this pass upstreamed, since nothing about it is specific to
Julia, and LLVM IR cannot express this quite correctly without making it
a new intrinsic.

This ensures that our `@atomic` modify operations are now as fast as
`Threads.Atomic`!
```llvm
julia> @code_llvm Threads.atomic_add!(r, 10)
; Function Signature: atomic_add!(Base.Threads.Atomic{Int64}, Int64)
; @ atomics.jl:307 within `atomic_add!`
define i64 @"julia_atomic_add!_2680"(ptr noundef nonnull align 8 dereferenceable(8) %"x::Atomic", i64 signext %"v::Int64") #0 {
top:
; ┌ @ Base_compiler.jl:94 within `modifyproperty!`
   %0 = atomicrmw add ptr %"x::Atomic", i64 %"v::Int64" acq_rel, align 8
; └
; ┌ @ Base_compiler.jl:54 within `getproperty`
   ret i64 %0
; └
}
```
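
For context, here is a minimal sketch of the kind of source-level code this pass is meant to help. It assumes Julia 1.7 or later (per-field atomics), and the `Counter` type and variable names are hypothetical, chosen only for illustration. It shows a modify operation written three ways: the `@atomic` macro, an explicit `modifyproperty!` call, and the older `Threads.Atomic` API. It does not show or guarantee any particular generated IR.

```julia
# Hypothetical example type with one atomic field (not part of this change).
mutable struct Counter
    @atomic n::Int
end

c = Counter(0)

# `@atomic ... +=` lowers to a `modifyproperty!` call with a
# sequentially consistent ordering and evaluates to the new value.
new = @atomic c.n += 10

# The explicit form takes the op, the argument, and an ordering,
# and returns an `old => new` Pair.
pair = modifyproperty!(c, :n, +, 5, :acquire_release)

# The older API, for comparison: `atomic_add!` returns the old value.
a = Threads.Atomic{Int}(0)
old = Threads.atomic_add!(a, 10)
```

Inspecting a small wrapper function around the `@atomic` line with `@code_llvm` is one way to check whether the modify has been lowered to a single `atomicrmw` on a given build.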