[SPARC] Weaken emitted barriers for atomic ops (#154950)
Weaken the barriers emitted for atomic ops to the weakest form that still
enforces the memory model constraints.
In particular, we try to avoid emitting expensive #StoreLoad barriers
whenever possible.
The emitted barriers conform to V9's RMO and V8's PSO memory models, and
are compatible with GCC's lowering.
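
As a rough illustration of why most orderings do not need #StoreLoad, the sketch below maps a standalone C++ fence ordering to the `membar` mask bits it would need under a fully relaxed model such as RMO. This is the textbook acquire/release mapping, not a copy of the lowering in this patch; the `MembarBit` enum and the `fenceMask` helper are hypothetical names used only for this example.

```cpp
#include <atomic>

// Hypothetical names for illustration. The values follow the SPARC V9
// `membar` mmask bit assignments.
enum MembarBit : unsigned {
  LoadLoad = 1u,   // #LoadLoad
  StoreLoad = 2u,  // #StoreLoad (the expensive one)
  LoadStore = 4u,  // #LoadStore
  StoreStore = 8u  // #StoreStore
};

// Assumed mapping: the membar bits a standalone fence of the given ordering
// needs when the hardware model gives no ordering for free (RMO-like).
constexpr unsigned fenceMask(std::memory_order MO) {
  switch (MO) {
  case std::memory_order_relaxed:
    return 0u;
  case std::memory_order_consume:
  case std::memory_order_acquire:
    // Earlier loads must complete before later loads and stores.
    return LoadLoad | LoadStore;
  case std::memory_order_release:
    // Earlier loads and stores must complete before later stores.
    return LoadStore | StoreStore;
  case std::memory_order_acq_rel:
    return LoadLoad | LoadStore | StoreStore;
  case std::memory_order_seq_cst:
    // Only sequential consistency also orders earlier stores
    // against later loads.
    return LoadLoad | StoreLoad | LoadStore | StoreStore;
  }
  return LoadLoad | StoreLoad | LoadStore | StoreStore; // conservative
}

// Only the seq_cst case carries the #StoreLoad bit.
static_assert((fenceMask(std::memory_order_acquire) & StoreLoad) == 0u, "");
static_assert((fenceMask(std::memory_order_release) & StoreLoad) == 0u, "");
static_assert((fenceMask(std::memory_order_seq_cst) & StoreLoad) != 0u, "");
```

Under a stronger hardware model, some of these bits are already implied and can be dropped from the mask, which is where further weakening comes from.
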
A quick test with `pgbench` on a T4-1 shows a small (up to about 4%) but
consistent speedup.