benchmark · update #2533 · Closed

juliagmt-google wants to merge 97 commits into juliagmt/test from main.

Commits (97)
Revert "Trace enter/exit of TorchFunctionModes (#135422)" (#136590)
a31c3fe9
Add nsys integration
2edf80cb
Fix bug #2458 (#2459)
0f050151
Restore FlexAttention and FlashV3 backward (#2473)
611bf702
Fix hardcoded shape in low_mem_dropout benchmark (#2475)
252a3b17
Make FA3 work in fbcode
b6b67a4f
Skip loading triton.nvidia.cublas if not found
0611c41c
Print TMA benchmark info to stderr
0cb1e96d
Modernize cutlass call for fp8 blockwise
2d9ab0b1
CSV of extra shapes for gemm benchmarks
d512e673
Add layout options to gemm
4445aa2b
Enable fp8 rowwise on AMDGPU (#2483)
f2932b74
Ignore Torchbench CI on Tritonbench paths (#2481)
a8ce4b5a
Add _dynamo.config inline_inbuilt_nn_modules and specialize_float log…
737084ec
Add non-persistent fp8 triton_rowwise kernel (#2484)
6b4f3393
Bump transformer version (#2488)
12820bcc
Add multiple ops support for --op argument (#2490)
a1f4b2e8
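For context on the `--op` change just above (#2490): the flag now accepts a comma-separated list of operators instead of a single name. The sketch below shows one way such a flag can be parsed with argparse; the helper and the operator names are illustrative, not TritonBench's actual implementation.

```python
import argparse

def parse_op_list(value: str) -> list[str]:
    # Hypothetical helper: split a comma-separated --op value into
    # individual operator names, dropping empty entries.
    return [op for op in value.split(",") if op]

parser = argparse.ArgumentParser()
parser.add_argument(
    "--op",
    type=parse_op_list,
    default=[],
    help="comma-separated list of operators to benchmark",
)

args = parser.parse_args(["--op", "gemm,flash_attention"])
print(args.op)  # ['gemm', 'flash_attention']
```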
Add FusedLinearCrossEntropy (#2485)
dde8528b
Add user release benchmark so that we can run it on pull request (#2489)
bde24013
Install time (#2493)
eae9e50b
Add Tritonbench CI (#2494)
1ac701f7
Log compile ids to pt2_remote_cache and pt2_compile_events
85c33e5b
Trace enter/exit of TorchFunctionModes (#135422) (#137114)
79043be1
Remove ignored modes workaround (#135502) (#137115)
39d65a46
Handle torch function subclass/mode dispatch on generic tensor method…
4fd7c743
adding new configs for servicelab
533d2588
Improve release benchmark suites with a lower value of epoch (#2482)
7742ef2f
Check dyno and dcgm existence before disable them (#2496)
dcd3d319
combining CI and servicelab configs
3a7a4fea
fixing typo in fp8_gemm
b56e2eed
use 3.13 multiline traceback in get_instruction_source_311 (#137617)
f3921ca7
differentiating between some Fbsource only targets and OSS for CI
3900904e
Format `.ci/` / `.github/` / `benchmarks/` / `functorch/` / `tools/` …
f9f52f64
Add AtenOp Benchmarking (#2495)
34d4f94d
change GPT2ForSequenceClassification inference accuracy tolerance (#1…
680d64ea
making CI more flexible for extra data in tritonbench
28d301a4
Add entire _dynamo.config as a json for logging (#137216)
7cb1c0ac
skipping null values in scribe message
509e94f0
Add fbscribelogger to Dynamo benchmark runner (#137867)
12e1d263
Update the flash-attention submodule (#2500)
ea4433fd
Add host-side Triton TMA support to Dynamo (#137677)
db41e776
Add ncu report analyzer (#2497)
21cc30dc
Change default gpu metric backend (#2501)
c3961913
Update 2.5.0.yaml (#2498)
9e670cd2
Add --op-collection option (#2503)
58f3b1f3
Fix imports
2feadb6a
Add doc for adding custom ops (#2509)
d933cedc
Fix the broken gemm test
384a43d8
Test backward pass in unit test.
eec86128
Make sure all ci-enabled impls are in the output
00c9b9ed
Update AOTEagerandRecordGraphs backend (#138231)
04f0e6cc
Log is_forward field to dynamo_compile scuba table (#2511)
e89c1b37
Revamp PT2 Compile/chromium event logging [1/?]
8358f921
Revert D64438144: Log is_forward field to dynamo_compile scuba table
f7dc0c7c
adding aggregates to servicelab
0a9cd8f0
specifying logged benchmark name for tritonBench servicelab logging
e737b8fe
replace uses of np.ndarray with npt.NDArray
06e35fc5
Disable torch function compilation during guard execution and in comp…
05620407
fixing key error in aggregate data
a21b30e4
Replace __str__ with __repr__ in some places (#136316)
173774d1
Update requirements.txt (#2523)
a45e0dbf
Fixes to prep for weights_only default flip (#2514)
fb590d99
typing compile_fx.py (#138033)
11543183
Add metadata to events in progress, new `dynamo` event
8fce9c12
Log is_forward field to dynamo_compile scuba table (#138505)
e57bbe23
Compiled autograd configs in TLS (#137821)
0e038319
tls access helpers (#138061)
405ba75b
adding fp32 strict and tf32x3 benchmarks for gemm
036012ff
Support range_iterator as a function input (#138657)
367b6ef3
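The Dynamo commit just above (#138657) lets a Python range iterator be passed straight into a compiled function as an input. A minimal sketch of the pattern it enables, assuming a PyTorch build that includes the change; the function body and shapes are made up for illustration:

```python
import torch

def consume(it, x):
    # `it` is a range_iterator handed to the compiled function as an input.
    for i in it:
        x = x + i
    return x

compiled = torch.compile(consume)
out = compiled(iter(range(4)), torch.zeros(2))
print(out)  # tensor([6., 6.]) because 0 + 1 + 2 + 3 == 6
```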
Support overridden __call__ on nn modules (#138619)
b5b342ba
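Likewise, #138619 above teaches Dynamo to trace nn.Modules whose `__call__` is overridden rather than falling back. A toy example of the pattern, again assuming a PyTorch build with the change; the `Doubled` module is hypothetical:

```python
import torch
from torch import nn

class Doubled(nn.Module):
    def __init__(self):
        super().__init__()
        self.lin = nn.Linear(4, 4)

    def __call__(self, x):
        # Overridden __call__: extra logic wrapped around normal dispatch.
        return super().__call__(x) * 2

    def forward(self, x):
        return self.lin(x)

def run(m, x):
    return m(x)  # goes through the overridden __call__

out = torch.compile(run)(Doubled(), torch.randn(2, 4))
print(out.shape)  # torch.Size([2, 4])
```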
updating hardware and device columns
3245fde9
Release 2.5.1.yaml perf test (#2525)
47ba1ed8
Account for older numpy versions in #2514 (#2524)
4f30c497
fixing gemm for amd
65e5f686
Add logger logging for remote fx graph cache get + put (#2512)
2614ca98
pytorch/benchmark:bisection
f6f1249c
pytorch/benchmark:utils
f8a4e518
Update Typeguard to TypeIs for better type inference (#133814)
bd238116
Use guard_manager consistently instead of check_fn (#138896)
34ea1a1c
Fix naming for AMD in fp8 rowwise fbgemm
713f8002
Back out "tls access helpers (#138061)" and Back out "[compiled autog…
47e3138d
Switch times to us in CompilationMetrics and improvements (#138975)
4ad2712d
add some cpython debugging methods (#138030)
438f82b4
Set use_cuda_graphs in fp8_gemm_rowwise
870be9b4
Remove hammer/generative_recommenders (#2526)
4d6e0fa0
Fix type for "--iter" flag (#2528)
a0890b09
Add start event metadata to collected metadata for PT2 Compile Events
0c8a0f68
Optimize PT2 Compile Events ingestion and column formats
a66ce044
Add isolate mode
cc094dfe
Classify mis-inplaced tensors in logs.
86a366e2
Switch OSS dashboard to use aoti_compile_and_package (#139597)
4a42e064
Specialize symfloats that flow through is_integer (#139572)
3d3b7bb5
facebook-github-bot added the cla signed label
Add logging for num_triton_bundles
c64ed1e2
Cleanup tl.constexpr HAS_ATTN_SCALE (#2531)
06d867a2
tune tritonbench gemm
672ee070
cut configs into separate file
779c0278
lift free symbols in example_value when create_graph_input (#138363)
abaca229
juliagmt-google closed this 1 year ago
Reviewers: No reviews
Assignees: No one assigned
Labels: cla signed
Milestone: No milestone