[NVPTX] Fixup EXT_LOAD lowering for i128 values (#138049)
Ensure that when custom lowering a vector load/store to a multi-output
load/store node we confirm that the memory value type matches the type
used by the node. Also add some asserts for basic sanity checking of
load size.
Fixes https://github.com/llvm/llvm-project/issues/138034