[AArch64] Fix codegen for histograms with i64 increments (#181808)

Histograms don't do any legalisation on the loaded data type, so if the
'add' needs to be performed on a vector of i64s, we can't use the more
efficient addressing mode with i32 offsets: that would return an
nxv4i32 vector, which wouldn't get widened.
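
For illustration only, a source loop of roughly the following shape is
the kind of code that may be turned into a histogram operation with
32-bit indices and 64-bit bucket values. The function name and
signature are hypothetical and are not taken from the linked issue or
from the patch's tests.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical sketch: histogram update with 32-bit indices and 64-bit
// buckets. The increment is added to an i64 element, while the offsets
// into the bucket array are only 32 bits wide.
void histogram_add(std::vector<int64_t> &buckets,
                   const std::vector<uint32_t> &indices, int64_t inc) {
  for (std::size_t i = 0; i < indices.size(); ++i) {
    // Guard against out-of-range indices in this standalone sketch.
    assert(indices[i] < buckets.size() && "index out of range");
    buckets[indices[i]] += inc;
  }
}
```
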
This fixes https://github.com/llvm/llvm-project/issues/181764