[AArch64] Replace uaddlv with addv for popcount operation (#121934)
Replace `uaddlv` with `addv` for the popcount operation, as `addv` is the
simpler instruction.
On certain platforms, such as Cortex-A510, `addv` has a latency of 3 cycles,
whereas `uaddlv` has a latency of 4 cycles.
GCC generates `addv` as well:
https://godbolt.org/z/MnYG9jcEo
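
For illustration, a minimal C sketch of the kind of source this lowering
affects. The function names are hypothetical and not taken from the patch or
its tests; `__builtin_popcountll` is the GCC/Clang builtin. The second function
is a scalar model of the `cnt` + `addv` sequence: the per-byte counts of a
64-bit value sum to at most 64, so the horizontal add fits in a single byte and
the widening done by `uaddlv` is not needed.

```c
#include <stdint.h>

/* Hypothetical helper: with this change, the AArch64 lowering is expected
 * to be roughly of the shape
 *     fmov d0, x0
 *     cnt  v0.8b, v0.8b
 *     addv b0, v0.8b      (previously: uaddlv h0, v0.8b)
 *     fmov w0, s0
 * The exact output depends on the compiler version and flags. */
unsigned popcount64(uint64_t x) {
    return (unsigned)__builtin_popcountll(x);
}

/* Scalar model of cnt + addv: count the set bits in each byte, then sum
 * the eight counts in an 8-bit accumulator. The sum never exceeds 64, so
 * the accumulator cannot overflow. */
unsigned popcount64_model(uint64_t x) {
    uint8_t sum = 0;
    for (int i = 0; i < 8; ++i) {
        uint8_t byte = (uint8_t)(x >> (8 * i));
        uint8_t count = 0;
        while (byte) {
            count = (uint8_t)(count + (byte & 1u));
            byte >>= 1;
        }
        sum = (uint8_t)(sum + count);
    }
    return sum;
}
```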