fix mlu device longTensor bugs (#2887)
* Add Cambricon MLU accelerator support
* update mlu support for tests
* fix mlu device MULTI_MLU
* Update src/accelerate/utils/imports.py
it's beautiful!
Co-authored-by: Zach Mueller <muellerzr@gmail.com>
* update mlu for quality check
* fix mlu device longTensor error
* fix mlu device tensor dtype check
* fix mlu device send_to_device with torch dynamo error
* Refactor AcceleratorState
* Should be near complete now
* Last missing piece
* Make my way to the acceleratorstate
* Include update to global var
* Don't use global
* gpu -> cuda
* Don't use update for dict, easier to read
* Fix tests
* stash
* Getting closer...
* Needed to spawn at the very end after env was setup
* Explain set_device before deepspeed
* Make docstring more accurate
* Early return instead
* Delineate blocks
* Make prepare_backend return state + backend for clarity/less magic
* fix mlu longtensor.to() bugs.
---------
Co-authored-by: Zach Mueller <muellerzr@gmail.com>