Fix ccall return value boxing on ARM/AArch64
We previously relied on the extra space in GC allocations to keep the stores in bounds.
This is broken by the allocation optimization, since a stack allocation has only
the requested number of bytes and no more.
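
A rough C illustration of the hazard and one safe pattern follows. It is only a
sketch, not the codegen change itself: `fake_ccall_ret` and `box_return_value`
are hypothetical names, and the 64-bit register width is an assumption.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a callee whose small return value arrives in a
   full 64-bit register; only some of those bytes are meaningful. */
static uint64_t fake_ccall_ret(void)
{
    return 0x1122334455667788ULL;
}

/* Unsafe pattern (not shown as code): storing the whole register directly
   into a box of exactly `nbytes` bytes writes past its end whenever
   nbytes < sizeof(uint64_t). That only appears to work while the box comes
   from an allocator that rounds the size up. */

/* Safe pattern: spill the register to a register-sized temporary, then copy
   only the requested bytes into the box. */
static void box_return_value(void *box, size_t nbytes)
{
    uint64_t tmp = fake_ccall_ret();
    assert(nbytes <= sizeof tmp);   /* sketch handles register-sized values only */
    memcpy(box, &tmp, nbytes);      /* never touches bytes past box + nbytes */
}
```

With this pattern the box can be exactly `nbytes` long, so it no longer matters
whether it lives on the GC heap or on the stack.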