[rebase] rebase fori_loop_simple_case_test #7165
Add doc explaining how it works (#6937)
f44e1dfc
Fix profiling in benchmark script (#6934)
310f08dc
[Pallas] Integrate FlashAttention with SPMD (#6935)
9f2b82dc
Use TPU build for CPU and GPU Python tests (#6921)
b2556d6c
[Fori_loop|While_loop] Create fori_loop.md (#6942)
0417d4d5
Update XLA pin, 04/19/2024 (#6944)
2ec77062
Update jinja and sphinx versions to address the vulnerability (#6946)
9ba844a4
Make `nms` fallback by default. (#6933)
b06c9c77
revert expand test with dynamo (#6950)
62a2b11c
Lower embedding bag forward only (#6951)
46919a47
Cleanup the code example in the torch_xla2 README. (#6939)
6fd448dc
[torch_xla2] Simplify developer setup steps (#6905)
89efd178
Create rc13 trigger (#6956)
fa090a24
update torch deps to 2.3 (#6959)
b5574d83
update rc14 (#6962)
69eeace3
[Doc] Update Pallas user guide (#6961)
a7749fa1
Add final 2.3 trigger (#6963)
5369e7d6
Temporarily ignore torch commit in CI test (#6964)
76f7dd06
[Doc] Improve docker instructions (#6969)
af74c349
Enable PagedAttention through Pallas (#6912)
6ed20260
Update readme for 2.3 release (#6967)
0a204a6b
Update GPU readme (#6968)
abe090ad
Write `torch_xla.version.__torch_gitrev__` to file directly (#6966)
0054ec08
fix pytorch CI after pin update, change test to use assertLessEqual (…
2a204e9b
Update Openxla-pin to 04/24 (#6975)
4b481349
Move test_grad_checkpoint.py to tpu test list (#6976)
2bf59e0c
Revert "Update Openxla-pin to 04/24" (#6980)
023e2c83
Update CODEOWNERS for build infrastructure (#6953)
3f5ff0f5
Move `.torch_pin` and handle in ansible (#6920)
b3be775a
Update dynamo test to be less constrained (#6981)
b9a9449f
Build CPP tests in new CI workflow (#6947)
b834e499
Run TPU CI when label is on PR (#6984)
174f4077
Add readme to call a model (lost due to merge conflicts) (#6986)
6443e593
Fix permission issues during CI checkout (#6985)
6d01bb6e
[Revert Revert] Update OpenXLA-pin update to Apr25 (#6982)
971ebe1f
Update test_export_fx_passes.py (#6972)
75278161
Change name of CI documentation (#6994)
7cc78a68
Rework docs push (#6954)
c91171d5
`sudo rm` leftover files in GHA (#6995)
0e032b17
Fixes to dynamic_shapes args in test_unbounded_dynamism.py (#6999)
73b915b5
Manually push to `gh-pages` branch (#6996)
d25f4752
Re-land: dynamo expand test with view-replay. (#6958)
42db7096
Move the nightly whl instruction out of the hide area (#7000)
87329ce0
Don't fail docs push if there's nothing to commit (#7001)
5f75290e
Complain when TensorFlow is installed (#7004)
4a5e238e
Clean up workspace before test (#7005)
b8f8fa9e
Tag CI build with git hash (#7003)
77bbf7f3
fix addbmm opinfo (#6993)
2399e10f
Fix more opinfo tests (#7008)
2907ab30
Fix q dtype in paged attention kernel (#7011)
865836ad
Build upstream CI image on push to master (#6952)
4883f6fe
[Pallas] Support segment ids in flash attention (#6943)
400bd0c9
Support pin pr number in new .torch_pin (#6998)
cbbefa2c
Expose python unsafe buffer pointer (#7006)
9d84df2a
Fix torch.full scalar type (#7010)
0a54b2b1
Update doc push workflow guide and requirement (#7002)
93ce0541
Update XLA/JAX/libtpu pins to 2024/05/02 (#7020)
213b72b9
Support export custom op to stablehlo custom call (#7017)
666eccb4
Add CI overview to `ci.md` (#7015)
aad6a12a
Update non_xla attention to properly support paged_attention dynamo c…
2bce3f83
pass TPU_ML_PLATFORM_VERSION env to libtpu (#7021)
d1235858
Handle multiple inplace update input output aliasing (#7023)
e3fc0331
Add a missing case for _unsafe_buffer_pointer tests. (#7026)
a0063724
Move op dispatching logic into an `Environment` class; and use Mode t…
825ba0da
[Pallas] Support FA sm_scale (#7035)
1c31cde0
Remove date tag for dev images (#7036)
b543dc0e
[Pallas] Improve FlashAttention segment_ids test case (#7034)
887d3446
[Pallas] Improve segment_ids API UX (#7037)
5bbe5c82
add a metric to track persistent cache loading time (#7039)
c1b745e5
Adding megablox gmm standalone (#6940)
40f7e1f5
[FSDPv2] Support MultiSlice (#7044)
6f0b61e5
[FSDPv2] Fix test_fsdp_v2_multi_slice (#7055)
4a1588c3
[Pallas] Add test_pallas_spmd.py to tpu ci (#7045)
5cb473a2
Add option to export FX Node metadata to StableHLO (#7046)
6f392ccc
Pin update 20240513 (#7052)
2b28ae29
Add example to support pytorch lightning; misc bug fixes (#7054)
b64d8a2a
Add simple example for how to use torch_xla (#7048)
ae63cd1e
Update runner and runner-container-hooks versions (#7058)
f26c35c2
Support megacore_mode in paged_attention (#7060)
cbb9e213
reenable disabled pt2e test (#7059)
df0d147e
add amp example (#7062)
1fa1f858
Update TPU CI debugging tips (#7066)
a8eae0d9
add DDP with SPMD example (#7063)
c6074abd
Support torch_xla2 benchmarking using torchbench (#7013)
9e189350
[benchmarks] Fix AMP setup for torchbench models. (#7067)
aeed89eb
format the output model input (#6869)
68daf61f
Map jnp.int4 to torch.int8 (#7071)
a6ee8a50
Add a unit test for MoE layer. (#7069)
8247aec1
Add VSCode devcontainer instructions to CONTRIBUTING.md (#7072)
56c2368c
Add `xla.step` context manager (#7068)
3c59087e
[benchmarks] Add default value to `move_to_device`. (#7080)
8990f1b9
Fix overflow for `div` arguments. (#7081)
a2540acb
Fix usage of extract_jax (#7075)
e0d5a49a
Add example for a decoder only model (#7082)
e0fb8782
add example for fsdp (#7061)
961c22ae
Implement `ComputationClient::GetMemoryInfo` (#7086)
5409cd5b
Dump HLO HBM usage info (#7085)
206f1b7f
Add data-type promotion to `gelu_backward`. (#7090)
8d35eb05
add missing aten op (#7078)
5e1d454e
Add dlpack support (#7025)
60238557
Add torch_xla2 `export_program_to_stablehlo` API with unbounded dynam…
baf08aea
Add FSDPv2 example for the decoder only model (#7088)
f336317e
Update spmd doc (#7096)
8a1ada88
Add examples for how to benchmark a PyTorch/XLA model (#7089)
0ce06eca
reorganize the example dir (#7097)
c294625d
[Pallas] Refactor the gmm kernel (#7099)
5327033b
add example for flash attention (#7098)
8ae4c769
chore(doc): fix typos in FSDPv2 doc (#7104)
7350b702
Add data-type promotion to `stack`. (#7091)
a299f337
fix jax dependency bug (#7105)
cb8533be
[Pallas] Introduce _make_group_metadata (#7107)
22e912ea
implement Repeat with fixed output shape (#7114)
3369bf7d
[Pallas] Support _histogram (#7115)
cb805837
Update XLA pin to 2024/05/24 and fix Hermetic Python integration (#7110)
7d31f7de
[Pallas] Make gmm functional (#7117)
a9b4fadf
Only use remote cache for main repository, not forks (#7112)
1a8c2fe2
[Pallas] Make gmm output a tensor (#7120)
65b5ace8
[Pallas] Set a better tiling for gmm (#7119)
fd4900ce
`index`: fix index of 0-element tensor by 0-element tensor. (#7113)
be3b08e6
Use remote cache for `push` events too (#7124)
7770a494
remove torch_xla2 jax dependency in native install (#7126)
ed90be1e
[MoE] Test sorting lhs for gmm (#7121)
8531d1c5
[Pallas] Make gmm support bf16 (#7133)
fb373129
[FSDPv2] Shard on the maximal dim of weights (#7134)
15fc0f1c
Update XLA pin as of 20240528 (#7131)
6f406b77
update example dir's README (#7136)
c7bbdfbb
`upsample_bilinear`: fix output data-type. (#7111)
468a5c93
Add function for retrieving fallback operations. (#7116)
6d271232
Facebook->Meta in README (#7141)
0eded8da
Update contribute to include bdist_wheel (#7135)
c367b66f
[Pallas] Support tgmm (#7137)
50d81b3e
Revert "`upsample_bilinear`: fix output data-type." (#7142)
f1141980
Add Python 3.11 version for release build (#7143)
bd70d3f0
add test for onehot to make sure no fallback (#7127)
ad73e070
Disable kokoro and Update without key (#7106)
0ca090c3
Remove old checkout files from GitHub workspace (#7146)
ffbbd438
[Pallas] Make repeat_with_fixed_output_size not OOM on VMEM (#7145)
ce1205e1
[Pallas] Introduce gmm_backward (#7151)
c96c95a4
Deprecate XLA_USE_BF16 (#7150)
af51f063
[Pallas] Introduce GMM(torch.autograd.Function) (#7152)
aeed61a9
Make from_dlpack handle cuda synchronization implicitly for input ten…
daada224
add PT_XLA_DEBUG_LEVEL (#7149)
8c2234ee
Add optimizer priming for dist chkpt (#6572)
8fd051f2
[Doc] Update spmd.md for doc (#7019)
cb482bca
Update configuration.yaml (#7158)
8471826a
Add a CI workflow for tests that require pytorch CUDA. (#7140)
6fadbf5d
ManfeiBai marked this pull request as ready for review 1 year ago
ManfeiBai merged 332bd402 into fori_loop_simple_case_test 1 year ago