Add meta registration for _foreach_norm (2nd try) (#119927)
The first try reused TensorListMetadata, which caused illegal memory access issues when there were too many tensors in the list. We just launch multiple kernels with a simpler version of the struct (to minimize kernels launched).
Pull Request resolved: https://github.com/pytorch/pytorch/pull/119927
Approved by: https://github.com/albanD