[AMDGPU] Generalize global.load.lds to buffer fat pointers

A direct load to LDS can also be implemented for buffer fat pointers
by using the offset part of the pointer as the offset operand of
raw.ptr.buffer.load.lds. This commit generalizes the existing
global.load.lds intrinsic to support that usage.