update #2533

juliagmt-google wants to merge 97 commits into juliagmt/test from main
anijain2305 Revert "Trace enter/exit of TorchFunctionModes (#135422)" (#136590)
a31c3fe9
xuzhao9 Add nsys integration
2edf80cb
ostrowskimarcin Fix bug #2458 (#2459)
0f050151
bertmaher Restore FlexAttention and FlashV3 backward (#2473)
611bf702
mark14wu Fix hardcoded shape in low_mem_dropout benchmark (#2475)
252a3b17
bertmaher Make FA3 work in fbcode
b6b67a4f
bertmaher Skip loading triton.nvidia.cublas if not found
0611c41c
bertmaher Print TMA benchmark info to stderr
0cb1e96d
bertmaher Modernize cutlass call for fp8 blockwise
2d9ab0b1
bertmaher CSV of extra shapes for gemm benchmarks
d512e673
bertmaher Add layout options to gemm
4445aa2b
htyu Enable fp8 rowwise on AMDGPU (#2483)
f2932b74
xuzhao9 Ignore Torchbench CI on Tritonbench paths (#2481)
a8ce4b5a
jovianjaison Add _dynamo.config inline_inbuilt_nn_modules and specialize_float log…
737084ec
karthik-man Add non-persistent fp8 triton_rowwise kernel (#2484)
6b4f3393
xuzhao9 Bump transformer version (#2488)
12820bcc
FindHao Add multiple ops support for --op argument (#2490)
a1f4b2e8
FindHao Add FusedLinearCrossEntropy (#2485)
dde8528b
atalman Add user release benchmark so that we can run it on pull request (#2489)
bde24013
atalman Install time (#2493)
eae9e50b
xuzhao9 Add Tritonbench CI (#2494)
1ac701f7
jamesjwu Log compile ids to pt2_remote_cache and pt2_compile_events
85c33e5b
mlazos Trace enter/exit of TorchFunctionModes (#135422) (#137114)
79043be1
mlazos Remove ignored modes workaround (#135502) (#137115)
39d65a46
mlazos Handle torch function subclass/mode dispatch on generic tensor method…
4fd7c743
adamomainz adding new configs for servicelab
533d2588
juliagmt-google Improve release benchmark suites with a lower value of epoch (#2482)
7742ef2f
FindHao Check dyno and dcgm existence before disable them (#2496)
dcd3d319
adamomainz combining CI and servicelab configs
3a7a4fea
adamomainz fixing typo in fp8_gemm
b56e2eed
williamwen42 use 3.13 multiline traceback in get_instruction_source_311 (#137617)
f3921ca7
adamomainz differentiating between some Fbsource only targets and OSS for CI
3900904e
XuehaiPan Format `.ci/` / `.github/` / `benchmarks/` / `functorch/` / `tools/` …
f9f52f64
FindHao Add AtenOp Benchmarking (#2495)
34d4f94d
Valentine233 change GPT2ForSequenceClassification inference accuracy tolerance (#1…
680d64ea
adamomainz making CI more flexible for extra data in tritonbench
28d301a4
jovianjaison Add entire _dynamo.config as a json for logging (#137216)
7cb1c0ac
adamomainz sakipping null values in scribe message
509e94f0
ezyang Add fbscribelogger to Dynamo benchmark runner (#137867)
12e1d263
xuzhao9 Update the flash-attention submodule (#2500)
ea4433fd
aakhundov Add host-side Triton TMA support to Dynamo (#137677)
db41e776
FindHao Add ncu report analyzer (#2497)
21cc30dc
FindHao Change default gpu metric backend (#2501)
c3961913
juliagmt-google Update 2.5.0.yaml (#2498)
9e670cd2
FindHao Add --op-collection option (#2503)
58f3b1f3
jasonjk-park Fix imports
2feadb6a
FindHao Add doc for adding custom ops (#2509)
d933cedc
xuzhao9 Fix the broken gemm test
384a43d8
xuzhao9 Test backward pass in unit test.
eec86128
xuzhao9 Make sure all ci-enabled impls are in the output
00c9b9ed
anijain2305 Update AOTEagerandRecordGraphs backend (#138231)
04f0e6cc
masnesral Log is_forward field to dynamo_compile scuba table (#2511)
e89c1b37
jamesjwu Revamp PT2 Compile/chromium event logging [1/?]
8358f921
huydhn Revert D64438144: Log is_forward field to dynamo_compile scuba table
f7dc0c7c
adamomainz adding aggregates to servicelab
0a9cd8f0
adamomainz specifying logged benchmark name for tritonBench servicelab logging
e737b8fe
igorsugak replace uses of np.ndarray with npt.NDArray
06e35fc5
mlazos Disable torch function compilation during guard execution and in comp…
05620407
adamomainz fixing key error in aggregate data
a21b30e4
rec Replace __str__ with __repr__ in some places (#136316)
173774d1
wdvr Update requirements.txt (#2523)
a45e0dbf
mikaylagawarecki Fixes to prep for weights_only default flip (#2514)
fb590d99
aorenste typing compile_fx.py (#138033)
11543183
jamesjwu Add metadata to events in progress, new `dynamo` event
8fce9c12
masnesral Log is_forward field to dynamo_compile scuba table (#138505)
e57bbe23
xmfan Compiled autograd configs in TLS (#137821)
0e038319
xmfan tls access helpers (#138061)
405ba75b
adamomainz adding fp32 strict and tf32x3 benchmarks for gemm
036012ff
anijain2305 Support range_iterator as a function input (#138657)
367b6ef3
anijain2305 Support overridden __call__ on nn modules (#138619)
b5b342ba
adamomainz updating hardware and device columns
3245fde9
atalman Release 2.5.1.yaml perf test (#2525)
47ba1ed8
mikaylagawarecki Account for older numpy versions in #2514 (#2524)
4f30c497
adamomainz fixing gemm for amd
65e5f686
masnesral Add logger logging for remote fx graph cache get + put (#2512)
2614ca98
pytorch/benchmark:bisection
f6f1249c
pytorch/benchmark:utils
f8a4e518
Skylion007 Update Typeguard to TypeIs for better type inference (#133814)
bd238116
anijain2305 Use guard_manager consistently instead of check_fn (#138896)
34ea1a1c
karthik-man Fix naming for AMD in fp8 rowwise fbgemm
713f8002
xmfan Back out "tls access helpers (#138061)" and Back out "[compiled autog…
47e3138d
ezyang Switch times to us in CompilationMetrics and improvements (#138975)
4ad2712d
williamwen42 add some cpython debugging methods (#138030)
438f82b4
karthik-man Set use_cuda_graphs in fp8_gemm_rowwise
870be9b4
xing-liu Remove hammer/generative_recommenders (#2526)
4d6e0fa0
Fix type for "--iter" flag (#2528)
a0890b09
jamesjwu Add start event metadata to collected metadata for PT2 Compile Events
0c8a0f68
jamesjwu Optimize PT2 Compile Events ingestion and column formats
a66ce044
xuzhao9 Add isolate mode
cc094dfe
laithsakka Classify miss-inplaced tensors in logs.
86a366e2
desertfire Switch OSS dashboard to use aoti_compile_and_package (#139597)
4a42e064
bobrenjc93 Specialize symfloats that flow through is_integer (#139572)
3d3b7bb5
facebook-github-bot added the cla signed label
oulgen Add logging for num_triton_bundles
c64ed1e2
hanli0612 Cleanup tl.constexpr HAS_ATTN_SCALE (#2531)
06d867a2
nmacchioni tune tritonbench gemm
672ee070
nmacchioni cut configs into separate file
779c0278
ydwu4 lift free symbols in example_value when create_graph_input (#138363)
abaca229
juliagmt-google closed this 1 year ago

Reviewers: No reviews
Assignees: No one assigned