[MPS] Fix the ChannelsLast memory format in cat_out_mps() (#91786)
- Fixed the memory leak with the `malloc()`
- Introduced shortened data type strings (optional) to avoid getting extra long cached graph string keys with ops such as cat_out()
- Fixed data type issues in Monterey
- Removed the unused `use_scalar_value` argument from `getTensorsStringKey()`
- Clean up and refactoring
Fixes #89353
Pull Request resolved: https://github.com/pytorch/pytorch/pull/91786
Approved by: https://github.com/kulinseth