xla
Cherry-pick 2.1 release branch into XRT branch through 9/14
#5574
Merged

Commits
  • Sharding should be per output of IR Node, instead of per IR Node (#5330)
    will-cromar committed 2 years ago
  • Update Python device API for SPMD (#5129)
    will-cromar committed 2 years ago
  • Check out the release branch instead of origin/master in ansible (#5344)
    will-cromar committed 2 years ago
  • Also dump output sharding on HLO file (#5339)
    will-cromar committed 2 years ago
  • Make all-reduce a no-op when world size is 1 (#5342)
    will-cromar committed 2 years ago
  • add fs linker flag (#5347)
    will-cromar committed 2 years ago
  • Add py3.10 whl path to doc, refactor whl table (#5354)
    will-cromar committed 2 years ago
  • fix amp dtype setting for GPU (#5337)
    will-cromar committed 2 years ago
  • Add python test for SPMD+Runtime Python API (#5349)
    will-cromar committed 2 years ago
  • Check the actual device instead of query env var for virtual device (#5352)
    will-cromar committed 2 years ago
  • [BE] use self.assertEquals instead of str equality in test_zero1.py (#5364)
    will-cromar committed 2 years ago
  • Revert "[BE] use self.assertEquals instead of str equality in test_zero1.py (#5364)" (#5366)
    will-cromar committed 2 years ago
  • [Dynamo|TPU] Tweak `atol` and `rtol` for `test_dynamo.py` (#5363)
    will-cromar committed 2 years ago
  • [Dynamo|TPU] Skip `DynamoTrainingBasicTest.test_resnet18` on TPU (#5362)
    will-cromar committed 2 years ago
  • Add a script for running stablehlo tests. (#5360)
    will-cromar committed 2 years ago
  • Don't rewrite index hints in global save planning (#5348)
    will-cromar committed 2 years ago
  • [Dynamo|TPU] Skip `DynamoInferenceBasicTest.test_resnet18` on TPU (#5361)
    will-cromar committed 2 years ago
  • [BE] use self.assertEquals instead of str equality in test_zero1.py (#5367)
    will-cromar committed 2 years ago
  • Fix ReplicateShardedData for int type (#5374)
    will-cromar committed 2 years ago
  • Update dynamo.md (#5378)
    will-cromar committed 2 years ago
  • Revert "Fix ReplicateShardedData for int type (#5374)" (#5380)
    will-cromar committed 2 years ago
  • Remove the mention of XRT_TPU_CONFIG in the CONTRIBUTING.md (#5379)
    will-cromar committed 2 years ago
  • [Dynamo|TPU] Tweak `atol` and `rtol` for `test_simple_model_with_different_input_shape` on TPU (#5373)
    will-cromar committed 2 years ago
  • Rectify test_zero1.py once optim.load_state_dict doesn't guarantee immutability (#5382)
    will-cromar committed 2 years ago
  • Add gpu doc for how to build PyTorch/XLA from source with GPU support. (#5384)
    will-cromar committed 2 years ago
  • clear pending ir should also clear the cc op tokens (#5385)
    will-cromar committed 2 years ago
  • Port resnet data loading optimizations to SPMD test script (#5386)
    will-cromar committed 2 years ago
  • Add support for in-place ops with self tensors in dynamo bridge (#5309)
    will-cromar committed 2 years ago
  • Add dynamo test in TPU CI (#5381)
    will-cromar committed 2 years ago
  • Add manual seed in multihost checkpoint (#5392)
    will-cromar committed 2 years ago
  • Fix change_id type in coverage uploading (#5394)
    will-cromar committed 2 years ago
  • Update dynamo cpu fallback op to aten::_foobar (#5393)
    will-cromar committed 2 years ago
  • Run single host multi GPU tests in the CI. (#5387)
    will-cromar committed 2 years ago
  • [PJRT] Separate collective ops test from TPU runtime test. (#5396)
    will-cromar committed 2 years ago
  • Fix ReplicateShardedData for int type (#5404)
    will-cromar committed 2 years ago
  • Update the dynamo backend name to `openxla` (#5402)
    will-cromar committed 2 years ago
  • [SPMD] Multi-host batch sharded data loading (#5331)
    will-cromar committed 2 years ago
  • Refactor to share code between export_torch_model and save_as_stablehlo (#5388)
    will-cromar committed 2 years ago
  • Fix TPU collective ops test for multi-host TPUs (#5408)
    will-cromar committed 2 years ago
  • Partially replicate lower-rank tensors (#5409)
    will-cromar committed 2 years ago
  • Revert "Partially replicate lower-rank tensors (#5409)" (#5412)
    will-cromar committed 2 years ago
  • SPMD cross slice-replication using partial_replication sharding (#5411)
    will-cromar committed 2 years ago
  • Fix the incorrect clone arg condition in dynamo bridge (#5414)
    will-cromar committed 2 years ago
  • [SPMD] named partition spec support (#5415)
    will-cromar committed 2 years ago
  • [PJRT|TPU] Update `test_xla_devices_single_process_all_chips` for expected device number (#5421)
    will-cromar committed 2 years ago
  • Add repo for libcudnn8=8.7.0.84 and CUDA 11.8 (#5425)
    will-cromar committed 2 years ago
  • Update fix_includes.sh (#5441)
    will-cromar committed 2 years ago
  • [PJRT] Support `torchrun` with `pjrt://` `init_method` (#5438)
    will-cromar committed 2 years ago
  • Bugfix + add more test for llama (#5439)
    will-cromar committed 2 years ago
  • Move the C++ test build to CI build job instead of test job (#5442)
    will-cromar committed 2 years ago
  • Update gcc to 10. (#5445)
    will-cromar committed 2 years ago
  • Update the random seed for every dynamo execution (#5444)
    will-cromar committed 2 years ago
  • Revert "Update gcc to 10. (#5445)" (#5449)
    will-cromar committed 2 years ago
  • Install gcc-10 (#5450)
    will-cromar committed 2 years ago
  • Revert "Install gcc-10 (#5450)" (#5452)
    will-cromar committed 2 years ago
  • parallelize SPMD inputhandler and GetDataShards (#5447)
    will-cromar committed 2 years ago
  • Remove base image override from TPU CI build (#5453)
    will-cromar committed 2 years ago
  • Update to GCC 10 (#5451)
    will-cromar committed 2 years ago
  • Cache sharded placeholder for dynamo execution (#5446)
    will-cromar committed 2 years ago
  • Remove Docker image override from dev image (#5456)
    will-cromar committed 2 years ago
  • hack: implement (unimplement?) GetDataShard for XRT
    will-cromar committed 2 years ago
  • skip flaky test (#5459)
    will-cromar committed 2 years ago
  • Neuron import hook (#5429)
    will-cromar committed 2 years ago
  • Add missing includes (#5434)
    will-cromar committed 2 years ago
  • [GPU]Update README.md with wheel/docker for CUDA12.0 and deprecate CUDA11.7 (#5443)
    will-cromar committed 2 years ago
  • update remote cache key in ansible (#5463)
    will-cromar committed 2 years ago
  • Fix data type in Pow with Scalar base and Tensor exponent (#5467)
    will-cromar committed 2 years ago
  • bump the timeout for CI (#5470)
    will-cromar committed 2 years ago
  • Fix the input sharding for dynamo (#5469)
    will-cromar committed 2 years ago
  • Enabling sharding device data IR (#5475)
    will-cromar committed 2 years ago
  • Introduce `torch_xla.runtime.use_spmd()` (#5474)
    will-cromar committed 2 years ago
  • Enable PJRT C API Client and other changes for Neuron (#5428)
    will-cromar committed 2 years ago
  • Don't move full tensor to device in deferred_init (#4819)
    will-cromar committed 2 years ago
  • [SPMD] Fix HybridMesh ordering (#5478)
    will-cromar committed 2 years ago
  • [SPMD] Properly skip tests on TPU V2 (#5479)
    will-cromar committed 2 years ago
  • Add @yeounoh to .github CODEOWNERS (#5482)
    will-cromar committed 2 years ago
  • Add Python API to execute StableHLO bytecode (#5476)
    will-cromar committed 2 years ago
  • [SPMD] Fix TPU CI after #5478 (#5487)
    will-cromar committed 2 years ago
  • [SPMD] Fix XLA_DUMP_POST_OPTIMIZATIONS test (#5485)
    will-cromar committed 2 years ago
  • [Dist] Refactor ZeRO-1 (#5145)
    will-cromar committed 2 years ago
  • Update artifacts.auto.tfvars for 2.1 release (#5483)
    will-cromar committed 2 years ago
  • Add ShardingSpec to XLATensor when it is created with a PJRTShardedData (#5489)
    will-cromar committed 2 years ago
  • Add topological sorting to dynamo partitions (#5472)
    will-cromar committed 2 years ago
  • [SPMD] Patch nn.Linear (#5491)
    will-cromar committed 2 years ago
  • [original author: mrnikwaws] Neuron operator support (#5471)
    will-cromar committed 2 years ago
  • [SPMD] Make IR sharding custom sharding op (#5433)
    will-cromar committed 2 years ago
  • Support input sharding changed after first dynamo tracing (#5477)
    will-cromar committed 2 years ago
  • Always use ExecuteReplicated with SPMD (#5494)
    will-cromar committed 2 years ago
  • Skip a couple tests on TPU due to precision issue (#5496)
    will-cromar committed 2 years ago
  • Refactor stablehlo API and put them in official location. (#5493)
    will-cromar committed 2 years ago
  • Support tuples in partition spec (#5488)
    will-cromar committed 2 years ago
  • Add an API to explicitly init runtime (#5500)
    will-cromar committed 2 years ago
  • Add explicit error message when tensor is on CPU for dynamo backend (#5499)
    will-cromar committed 2 years ago
  • remove torchvision in stablehlo.py (#5501)
    will-cromar committed 2 years ago
  • Fix tupled partition spec test on v3 (#5503)
    will-cromar committed 2 years ago
  • Update dynamo doc (#5506)
    will-cromar committed 2 years ago
  • Update dynamo.md (#5509)
    will-cromar committed 2 years ago
  • Get original_traced_args as example_inputs. (#5511)
    will-cromar committed 2 years ago
  • mark_sharding over a replicated tensor is allowed. (#5513)
    will-cromar committed 2 years ago
  • [SPMD] Propagate replicated output (#5508)
    will-cromar committed 2 years ago
  • + more commits ...