Experimental support for single-host TPU in PjRt (#3550)
* Experimental support for single-device TPU in PjRt.
* Distribute TPU execution across multiple replicas.
* Fix compilation device.
* Fix formatting.
* Rename string_to_device to string_to_device_.
* Fix issues found in model testing.
* Disable strict shape checking.
* Use absl::AsciiStrToUpper instead of std::transform.