[AMDGPU][InsertWaitCnts] Make HWEvent a BitMask (#203864)
Follow up from comments on
https://github.com/llvm/llvm-project/pull/202886
Make HWEvent a bitmask by default instead of having both the enum and a
separate HWEventSet. This streamlines the code a bit and opens the
possibility of adding "modifiers" to events; e.g. I imagine we could
now fold "VMemType" into the events.
We already do this with things like SMEM_GROUP. At least now it's baked
into the design.
I opted for a bit more verbosity by taking inspiration from
FastMathFlags (FMF): instead of exposing a raw enum, I wrap it in a
class with helper functions. The downside is having to reimplement all
the little bitwise ops, but the result is a cleaner, simpler interface
than a raw enum (class) with many free helper functions. I initially
tried that, but I recoiled at the sight of things like `contains(A, B)`,
which isn't very clear, while `A.contains(B)` is self-explanatory.
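For readers unfamiliar with the pattern, here is a minimal sketch of a
bitmask wrapped in a class with member helpers. It is not the code in this
patch: the class name `EventMask`, the event names, and the bit values are
all hypothetical and chosen only for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Illustrative only: the class name, event names, and bit values are
// assumptions and do not mirror the actual HWEvent definition.
class EventMask {
  uint32_t Bits = 0;
  constexpr explicit EventMask(uint32_t Raw) : Bits(Raw) {}

public:
  constexpr EventMask() = default;

  // Named single-event masks (hypothetical events).
  static constexpr EventMask vmemAccess() { return EventMask(1u << 0); }
  static constexpr EventMask ldsAccess() { return EventMask(1u << 1); }
  static constexpr EventMask smemAccess() { return EventMask(1u << 2); }

  constexpr bool empty() const { return Bits == 0; }

  // True if every event set in Other is also set in *this.
  constexpr bool contains(EventMask Other) const {
    return (Bits & Other.Bits) == Other.Bits;
  }

  // True if *this and Other share at least one event.
  constexpr bool intersects(EventMask Other) const {
    return (Bits & Other.Bits) != 0;
  }

  // The "little bitwise ops" that have to be written by hand.
  constexpr EventMask operator|(EventMask O) const {
    return EventMask(Bits | O.Bits);
  }
  constexpr EventMask operator&(EventMask O) const {
    return EventMask(Bits & O.Bits);
  }
  EventMask &operator|=(EventMask O) {
    Bits |= O.Bits;
    return *this;
  }
  constexpr bool operator==(EventMask O) const { return Bits == O.Bits; }
  constexpr bool operator!=(EventMask O) const { return Bits != O.Bits; }
};

// The member form reads naturally at the call site.
static_assert((EventMask::vmemAccess() | EventMask::ldsAccess())
                  .contains(EventMask::ldsAccess()),
              "a union contains each of its parts");
```

Keeping the raw constructor private means callers can only build masks from
the named events and the operators, which is one way to get the "cleaner
interface" trade-off described above.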
Considering HWEvent is a bitmask, I also implemented a simple iterator
to iterate over all set bits of the mask, which is a useful thing to
have as some APIs in InsertWaitCnts rely on handling one event at a time.
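As a rough illustration of iterating over the set bits of a mask, the sketch
below yields one single-bit mask per step. It is an assumption about the
general technique, not the iterator added by this patch; `BitSetMask`,
`raw()`, and `singleEvents()` are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Illustrative only: a bare mask type with a forward iterator over its
// set bits. Names are hypothetical, not taken from the patch.
class BitSetMask {
  uint32_t Bits = 0;

public:
  constexpr BitSetMask() = default;
  constexpr explicit BitSetMask(uint32_t Raw) : Bits(Raw) {}
  constexpr uint32_t raw() const { return Bits; }

  class iterator {
    uint32_t Remaining; // Bits not yet visited.

  public:
    explicit iterator(uint32_t R) : Remaining(R) {}

    // Yields a mask holding only the lowest remaining set bit.
    BitSetMask operator*() const {
      return BitSetMask(Remaining & (~Remaining + 1));
    }

    // Clears the lowest remaining set bit.
    iterator &operator++() {
      Remaining &= Remaining - 1;
      return *this;
    }

    bool operator==(const iterator &O) const {
      return Remaining == O.Remaining;
    }
    bool operator!=(const iterator &O) const {
      return Remaining != O.Remaining;
    }
  };

  iterator begin() const { return iterator(Bits); }
  iterator end() const { return iterator(0); }
};

// Collects the single-bit masks of M in ascending bit order, so that a
// caller can handle one event at a time.
inline std::vector<uint32_t> singleEvents(BitSetMask M) {
  std::vector<uint32_t> Out;
  for (BitSetMask E : M)
    Out.push_back(E.raw());
  return Out;
}
```

With this shape, an API that expects a single event can be driven from a
range-based `for` loop over a combined mask.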