[AArch64] Make width of stack protector guard value load configurable. (#195379)
Certain embedded targets store the value of the stack protector global
in an MMIO register, which requires a load of a specific width. Allow
changing the backend to emit a 4-byte load for the value of the stack
protector, instead of an 8-byte load. (Or vice versa for an ilp32
target.)
The current version of the patch has a limitation: it still allocates a
pointer-sized stack slot for the guard. This could be fixed in the
future, if it becomes relevant.