[AMDGPU] Use true16 loads with +real-true16 and sram-ecc (#161256)
When sram-ecc is enabled, a 16-bit load clobbers the full 32-bit VGPR,
so a load into only a 16-bit VGPR is not possible. In this situation,
do a 16-bit extending load and extract a 16-bit subreg from the result.
This also fixes the lack of 16-bit store patterns with this combination.
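
The load lowering described above can be pictured with a small
stand-alone C++ model. This is only an illustrative sketch, not code
from the backend: Vgpr32, eccLoad16 and load16ViaExtend are
hypothetical names, and the model assumes the extending load is a
zero-extending one.

```cpp
#include <cstdint>

// Hypothetical model of one 32-bit VGPR viewed as two 16-bit halves.
struct Vgpr32 {
  uint16_t lo;
  uint16_t hi;
};

// Models a 16-bit load with sram-ecc enabled: the whole 32-bit
// register is written, so the previous contents of the high half are
// lost. Zero-extension is an assumption made for this sketch.
inline void eccLoad16(Vgpr32 &dst, const uint16_t *mem) {
  dst.lo = *mem;
  dst.hi = 0; // the high half is clobbered by the load
}

// Models the lowering: perform the extending load into a full 32-bit
// register, then extract the low 16-bit subregister as the result.
inline uint16_t load16ViaExtend(const uint16_t *mem) {
  Vgpr32 tmp{0, 0};
  eccLoad16(tmp, mem); // 16-bit extending load into a 32-bit register
  return tmp.lo;       // extract the 16-bit subreg
}
```

In this model, loading directly into a register whose high half holds
a live value would destroy that value, which is why the result is
taken from a separate 32-bit register instead.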
Fixes: SC1-6072