MPS cherry picks for 1.12.1 (#81976)
* MPS: Fixes (#78930)
Cast integer to float in UnaryOps
Add tensor dtype in key generation
Enable FP16 scalars and use placeholder for alpha tensor in add/sum ops
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78930
Approved by: https://github.com/albanD
* MPS: Binary cast fix by proper type promotion and remove spurious copy warning (#79185)
Fixes #78019, #78020
Pull Request resolved: https://github.com/pytorch/pytorch/pull/79185
Approved by: https://github.com/albanD, https://github.com/razarmehr
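The fix above relies on standard binary type promotion. A minimal pure-Python sketch of the idea (a simplified promotion lattice, not the actual c10 implementation) looks like:

```python
def promote_types(a, b):
    """Pick the wider of two dtypes for a binary op, using a simplified
    bool < int < float ordering. This is an illustrative sketch only;
    the real rules live in c10's type promotion logic."""
    order = {"bool": 0, "int32": 1, "int64": 2, "float16": 3, "float32": 4}
    return a if order[a] >= order[b] else b
```

Both inputs are cast to the promoted type before the op runs, so e.g. an `int32`/`float32` pair computes in `float32`.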
* MPS: add exponential op (#79188)
Add exponential distribution
Pull Request resolved: https://github.com/pytorch/pytorch/pull/79188
Approved by: https://github.com/razarmehr, https://github.com/albanD
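The standard way to generate exponential variates from uniform random numbers is inverse-transform sampling; a pure-Python sketch (not the actual Metal kernel) of that recipe:

```python
import math
import random

def sample_exponential(lambd, u=None):
    """Inverse-transform sampling: if U ~ Uniform(0, 1), then
    -log(1 - U) / lambd is exponentially distributed with rate lambd.
    log1p(-u) is used for numerical accuracy when u is small."""
    if u is None:
        u = random.random()
    return -math.log1p(-u) / lambd
```

For example, `u = 0.5` with rate 1 yields `log(2) ≈ 0.693`, the distribution's median.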
* [MPS] Delete unused vars from OperationUtils.mm
Pull Request resolved: https://github.com/pytorch/pytorch/pull/79514
Approved by: https://github.com/kulinseth, https://github.com/albanD
* [MPS] Fix getDefaultGenerator and copy_kernel_mps
Returning a reference to stack memory is really bad
Pull Request resolved: https://github.com/pytorch/pytorch/pull/79515
Approved by: https://github.com/albanD
* [MPS][BE] Do not use `new/delete[]` in `chainViewOperation`
`std::array` will do just fine
Pull Request resolved: https://github.com/pytorch/pytorch/pull/79516
Approved by: https://github.com/albanD
* [MPS] Support stride of stride
Fixes https://github.com/pytorch/pytorch/issues/79181
Pull Request resolved: https://github.com/pytorch/pytorch/pull/79521
Approved by: https://github.com/kulinseth
* MPS: TopK raises an error if k > 16 (#79677)
* Error out in TopK when k > 16.
* Add a test case too.
Fixes #78915
Pull Request resolved: https://github.com/pytorch/pytorch/pull/79677
Approved by: https://github.com/albanD
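A minimal sketch of the kind of guard this entry describes (the constant and function names here are hypothetical, not the actual ones in the MPS backend):

```python
MPS_TOPK_MAX_K = 16  # limit described in the commit; name is illustrative

def check_topk_k(k):
    """Raise instead of silently returning wrong results when k exceeds
    what the MPS TopK kernel currently supports."""
    if k > MPS_TOPK_MAX_K:
        raise ValueError(
            f"Currently topk on MPS works only for k <= {MPS_TOPK_MAX_K}, got k={k}"
        )
```

Failing loudly here is preferable to the previous behavior, where an unsupported `k` could produce incorrect output.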
* [MPS]: Add fix for squeezed input axes handling in BCE loss (#79676)
Fixes #79527
Pull Request resolved: https://github.com/pytorch/pytorch/pull/79676
Approved by: https://github.com/razarmehr, https://github.com/albanD
* MPS: Add amax and amin Ops with tests (#79682)
* Add amax and amin with tests
Pull Request resolved: https://github.com/pytorch/pytorch/pull/79682
Approved by: https://github.com/albanD
* [MPS] Fix torch.uint8 support (#80049)
`ScalarType.Byte` should be cast to `MPSDataTypeUInt8`.
Also add support for `torch.int8` and test those conversions in `TestMPS.test_to`.
Fixes #80006
Pull Request resolved: https://github.com/pytorch/pytorch/pull/80049
Approved by: https://github.com/albanD
* [MPS] Fix binary ops between int32 tensor with int64 scalar (#80220)
For some reason, tensor *op* scalar does not follow the normal binary promotion rules,
so cast the output tensor to the expected type if needed.
Ideally the input tensors would be cast to the expected output type instead, but that does not work for boolean binary ops.
Add output tensor type/shape to cached graph key
Extend `TestMPS.test_add_scalars` to test for this regression
Fixes #79835
Pull Request resolved: https://github.com/pytorch/pytorch/pull/80220
Approved by: https://github.com/albanD
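The promotion wrinkle this fix works around can be sketched in pure Python (a simplified model of PyTorch's tensor-scalar rule, not the backend code):

```python
def tensor_scalar_result_dtype(tensor_dtype, scalar_is_float):
    """Simplified tensor-scalar promotion: a Python int scalar never widens
    an integer tensor, so int32_tensor + int64_scalar stays int32, while a
    float scalar promotes an integer tensor to the default float dtype."""
    if scalar_is_float and tensor_dtype.startswith("int"):
        return "float32"  # default float dtype; simplified
    return tensor_dtype
```

Because the result dtype is the tensor's dtype rather than the promoted pair, the MPS graph output must be cast back to it after the op runs.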
* [MPS] Add equal operator (#80195)
Which is, in essence, a composite of `eq`->`all`->`item`
`native/mps/operators/Equal.cpp` is an almost verbatim copy of `native/cuda/Equal.cpp`
Fix codegen by generating MPSFunctions headers
Pull Request resolved: https://github.com/pytorch/pytorch/pull/80195
Approved by: https://github.com/albanD
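The composite described above can be sketched in pure Python over plain lists (an illustration of the `eq`->`all`->`item` chain, not the Equal.cpp code):

```python
def tensors_equal(a, b):
    """torch.equal-style check as a composite: shapes must match, then
    elementwise eq -> all. (The trailing `item` just extracts the bool.)"""
    if len(a) != len(b):                 # stand-in for the shape check
        return False
    eq = [x == y for x, y in zip(a, b)]  # eq
    return all(eq)                       # all
```

Short-circuiting on a shape mismatch avoids launching any elementwise kernel at all.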
* [MPS] add `aten::normal.Tensor_float` `aten::normal.float_Tensor` `aten::normal.Tensor_Tensor` (#80297)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/80297
Approved by: https://github.com/albanD, https://github.com/kulinseth
* [MPS] Add flip (#80214)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/80214
Approved by: https://github.com/DenisVieriu97, https://github.com/albanD
* [MPS] Add logical ops (#80216)
This PR adds `logical_not`, `logical_and`, `logical_or`, `logical_xor`.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/80216
Approved by: https://github.com/albanD, https://github.com/kulinseth
* [MPS] Add glu (#79866)
Adds mps op for `aten::glu.out`.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/79866
Approved by: https://github.com/kulinseth, https://github.com/albanD
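GLU has a simple closed form: split the input in half along a dimension and gate one half with the sigmoid of the other. A pure-Python sketch over a 1-D list (not the MPSGraph implementation):

```python
import math

def glu(x):
    """Gated Linear Unit over a 1-D list: split in half, then a * sigmoid(b),
    mirroring glu's definition along its last dimension."""
    assert len(x) % 2 == 0, "glu needs an even-sized split dimension"
    half = len(x) // 2
    a, b = x[:half], x[half:]
    return [ai * (1.0 / (1.0 + math.exp(-bi))) for ai, bi in zip(a, b)]
```

Note the output is half the size of the input along the split dimension, which is why the op requires that dimension to be even.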
* [MPS] Fix std/var cache issue (#80502)
Use `getTensorsStringKey`, which has tensor shape info added as part of the key, to prevent cache lookup issues when the shape of the input tensor changes.
Fixes #80499
Pull Request resolved: https://github.com/pytorch/pytorch/pull/80502
Approved by: https://github.com/malfet, https://github.com/kulinseth
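The cache-key idea behind this fix can be sketched in a few lines of Python (a hypothetical stand-in for `getTensorsStringKey`, not the actual implementation):

```python
def graph_cache_key(op_name, tensors):
    """Build a string cache key that includes each tensor's dtype and shape,
    so a shape change produces a different key and a fresh graph instead of
    a stale cache hit."""
    parts = [op_name]
    for dtype, shape in tensors:
        parts.append(f"{dtype}[{','.join(map(str, shape))}]")
    return ":".join(parts)
```

With the shape in the key, `std` over a `(2, 3)` input and over a `(4, 3)` input map to different cached graphs.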
* Add scatter support for view operations (#79939)
* Add scatter support for view operations; #78074, #78886, #79672
* Update test_slicing_replace_column to properly test different sizes
* Handle in-place changes for binary ops; add new testcase
* Add new view ops testing scatter; add MPSDebugConfig.h config file for debugging purposes
* Merge gatherViewTensor and scatterViewTensor into a generic function
* Add scatter on demand in scatterViewOperation instead of caching it into a generic graph
* Create separate graphs for scatter and gather;
* Create scatter graph at scatter time
Pull Request resolved: https://github.com/pytorch/pytorch/pull/79939
Approved by: https://github.com/razarmehr
* MPS: Fix handling of 1D tensors in linear backward (#80759)
Fixes https://github.com/pytorch/pytorch/issues/79784
Pull Request resolved: https://github.com/pytorch/pytorch/pull/80759
Approved by: https://github.com/ezyang
* [MPS] Move the View ops to a separate file and reduce the number of graphs created (#80491)
This is dependent on the PR to go in first: https://github.com/pytorch/pytorch/pull/79939
Remove the data_ptr from the View Graph key which reduces the number of
graphs created significantly.
Don't wait when copying from MPS to MPS tensors
Pull Request resolved: https://github.com/pytorch/pytorch/pull/80491
Approved by: https://github.com/malfet
* [MPS] Add softplus backward (#79873)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/79873
Approved by: https://github.com/malfet
* [MPS] Add argmin (#80828)
This PR
1. adds argmin
2. refactors `reduction_type` in `ReduceOps.mm` with enum.
Co-authored-by: Kulin Seth <kulinseth@gmail.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/80828
Approved by: https://github.com/malfet
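Argmin is just a reduction that tracks an index alongside the running minimum; a pure-Python sketch (not the Metal kernel) of that reduction:

```python
def argmin(values):
    """Index of the first minimal element, as a running-minimum reduction.
    Tie-breaking toward the first occurrence is a choice of this sketch,
    not a guarantee of the backend."""
    best_i = 0
    for i, v in enumerate(values):
        if v < values[best_i]:
            best_i = i
    return best_i
```

Refactoring `reduction_type` into an enum, as the PR does, lets min/max/argmin/argmax share one reduction code path while switching only the comparison and whether an index is returned.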
* [MPS] Fix LSTM batch_first output transposed (#80597)
The output of LSTM with `batch_first` should be transposed back to batch first format.
Fixes #80306
Pull Request resolved: https://github.com/pytorch/pytorch/pull/80597
Approved by: https://github.com/kulinseth
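The transpose this fix restores is a plain axis swap; a pure-Python sketch over nested lists (an illustration of the `(seq, batch, feature)` -> `(batch, seq, feature)` swap, not the actual kernel):

```python
def to_batch_first(output):
    """Swap the first two axes of a nested-list 'tensor', turning
    seq-major LSTM output into the batch-major layout that
    batch_first=True promises the caller."""
    seq_len, batch = len(output), len(output[0])
    return [[output[t][b] for t in range(seq_len)] for b in range(batch)]
```

Without this swap, callers using `batch_first=True` silently receive seq-major output with the wrong shape.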
* [MPS][BE] Introduce MPSUnaryCachedGraph (#81033)
I.e. a CachedGraph that has input and output tensors.
Also add the `MPSGraphCache::LookUpAs` template, which combines `LookUp` with a
`static_cast` to the target type.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/81033
Approved by: https://github.com/kulinseth
* [MPS] Add test consistency from OpInfo based tests from PR 78504 (#79532)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/79532
Approved by: https://github.com/albanD, https://github.com/malfet
* [MPS] Add huber loss (#80163)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/80163
Approved by: https://github.com/kulinseth, https://github.com/malfet
* Remove two tests dependent on the MPS serialization checkin.
* Fix lint error (FLAKE8) F401
* Remove the serialization test from test_mps, as serialization support is not present in 1.12.1.
Co-authored-by: Kulin Seth <kulinseth@gmail.com>
Co-authored-by: Nikita Shulga <nikita.shulga@gmail.com>
Co-authored-by: Kulin Seth <kulin_seth@apple.com>
Co-authored-by: Abhishek Pathak <abhipathak97@gmail.com>
Co-authored-by: Nikita Shulga <nshulga@fb.com>
Co-authored-by: qqaatw <qqaatw@gmail.com>
Co-authored-by: Ramin Azarmehr <razarmehr@apple.com>