vllm
[Kernel] Triton-based Top-k and Top-p sampler kernels
#33538
Merged

njhill merged 118 commits into vllm-project:main from cakeng:triton-topk-topp
cakeng
cakeng Attempt 1
a5fc250e
cakeng Top k works?
c95041b3
cakeng Top k works?
fe60b223
cakeng Tenary search
74a18b5a
cakeng Quadruple Search
7502c064
cakeng Quadruple Search
360e2343
cakeng Added outliers
11bd61fc
cakeng Added gather
a922b45e
cakeng Added gather
6f39f209
cakeng 0.00115 for topk
30033c22
cakeng 0.00115 for topk
29876175
cakeng topk working, adding topp:
ba5b98b5
cakeng Wrong results
46bcc7df
cakeng Wrong results
5de5ece6
cakeng Fixed?
cbcf7f52
cakeng Fixed?
643c21d4
cakeng Maybe?
2737c2d7
cakeng Duplicate logit issues.
f24d2e17
cakeng Duplicate logit issues.
a58ca6cf
cakeng Top-p duplicate handler implemented
b87c0954
cakeng Top-p fixed
6e3ca0a3
cakeng Need to implement topp-only, topk and topk-topp works.
034e8024
cakeng Correctness tested for top-p. Duplication handling for top-k remaining.
11145820
cakeng Deeseep tests
56a615f0
cakeng Added env var VLLM_USE_TRITON_SAMPLER and automated test
6bea89cd
cakeng Merge remote-tracking branch 'origin/main' into topk_topp
8309b68e
cakeng Linter
5575c676
cakeng Tests
3342235a
cakeng Added Triton autotune
9bb0fbbc
cakeng Reduce diff and do fallback when batch size small.
340b6b46
cakeng Merge remote-tracking branch 'origin/main' into topk_topp
54df27fa
cakeng Test script fix
cf768c22
cakeng Added graph generation
9b3cf75a
cakeng Removed fallback
4235295a
cakeng Merge branch 'vllm-project:main' into topk_topp
1e3ed757
cakeng Added Gemini's suggestions, removed triton autotune.
344c3e4a
cakeng Merge branch 'topk_topp' of https://github.com/cakeng/vllm into topk_…
ba89c384
cakeng Fixed warps and stages
da1b1e6f
cakeng Fixed scratchpads
289c2ba8
cakeng Fixed scratchpads
5b0b1e6e
cakeng Merge branch 'main' into topk_topp
865b5230
cakeng Merge branch 'main' into topk_topp
350cbc8a
ddsoup0401 Init Sunga's correct triton top_k top_p implementation
5e6156cf
ddsoup0401 initial commit
7401ead1
ddsoup0401 init commit
d8fac6a2
ddsoup0401 not working.........
1d349d31
ddsoup0401 working on it....
b9a0c053
ddsoup0401 working........python compare.py
71c59786
ddsoup0401 ...
115a98b3
ddsoup0401 ...
f9b08f22
ddsoup0401 slow but working
b8728dbc
ddsoup0401 very slow
953025e0
ddsoup0401 pushed?
d1ca674f
cakeng Top-k working
5697d83e
cakeng Errors on top p
a2f6ae61
cakeng Everything correct but slow
2893ed54
cakeng Everything correct but slow
6e3c8744
cakeng Fast and working correctly
d0f491ee
cakeng Fast and working correctly
6743e12d
cakeng Errors
60b6515a
cakeng Filtered logits are wrongs
71cbb9ef
cakeng Filtered logits are wrongs
8b0771ce
cakeng Floating point associativity errors remain
20806a27
cakeng Merge main
f8cc4535
cakeng Remove tester
89443c01
cakeng Bugfix
204c221f
cakeng Test file removed.
e262fcb5
cakeng Typos
5e6dc79f
cakeng Typos
091b5188
cakeng Typos
5fc986e2
cakeng Typos
02d446bb
cakeng Typos
d0f02f6c
cakeng Bugfixes
152bc320
cakeng Deduplication
db9859f5
cakeng Duplication search bugfix
b936c94d
cakeng Bugfixes
3784e603
cakeng PyTorch sort permutes the order of duplicate values when sorting. Whe…
b0b6253c
cakeng Original pytorch implemntation apply softmax after sorting, which pro…
cd98ab90
cakeng Helper scripts
b72e2076
cakeng Helper scripts removed
b1152c15
cakeng Change hyperparameters
d2d56a12
cakeng Merge main
6421e1e9
njhill [Perf] Triton-based top-p/top-k masking
7643eabd
njhill fix doc
5a241a69
njhill fix method name, only use triton when supported
b017713d
njhill Merge remote-tracking branch 'refs/remotes/origin/main' into triton-t…
bd5d2413
njhill fix precision
e067cbfd
njhill Merge remote-tracking branch 'refs/remotes/origin/main' into triton-t…
a02aee88
cakeng Merge commit 'refs/pull/32558/head' of https://github.com/vllm-projec…
fbeb15f7
cakeng Copied topk + topp impl
463afa65
cakeng Copied topk + topp impl
9a5f30d7
cakeng Topp wrong
65874cce
cakeng Topp working, topp only
a671a098
cakeng Both Topk Topp working
cf6ab55a
cakeng Restored tests
150ccc6d
cakeng Bugfix
ae08705a
cakeng Loosened hyperparameters
49c3c39b
cakeng Linter
06565dfd
cakeng Restore
acd99d71
cakeng Merge branch 'main' into triton-topk-topp
ca3fff65
cakeng Merge branch 'topk_topp' into triton-topk-topp
b9d2275f
cakeng requested a review from 22quinn 62 days ago
cakeng requested a review from houseroad 62 days ago
cakeng requested a review from njhill 62 days ago
mergify added performance
mergify added v1
gemini-code-assist commented on 2026-02-02
cakeng Update vllm/v1/sample/ops/topk_topp_triton.py
37f322a6
cakeng Bugfix
cb731c57
njhill
cakeng Refactor comments for clarity in topk_topp_triton.py
0c61b95c
mergify
cakeng Pre-commit fix
503f0b0f
cakeng Update arxiv
576f90eb
njhill adjust prob distribution in benchmark, adjust threshold
c18fe71e
njhill Merge remote-tracking branch 'refs/remotes/origin/main' into triton-t…
dba83d52
njhill some simplification/cleanup
c246c3a7
mergify
njhill fix precommit
4360e923
njhill approved these changes on 2026-02-13
njhill added ready
njhill Merge branch 'main' into triton-topk-topp
612f38ff
njhill enabled auto-merge (squash) 51 days ago
njhill Merge remote-tracking branch 'refs/remotes/origin/main' into triton-t…
47ef82d7
njhill fix -inf edge cases and possible infinite loop
b917a495
njhill Merge remote-tracking branch 'origin/main' into triton-topk-topp
ef5d06e2
njhill add async yield in cancellation test
9dadec16
njhill Merge remote-tracking branch 'refs/remotes/origin/main' into triton-t…
8e027560
njhill requested a review from DarkLight1337 47 days ago
njhill requested a review from robertgshaw2-redhat 47 days ago
njhill requested a review from aarnphm 47 days ago
njhill requested a review from NickLucche 47 days ago
njhill use temperature=0 in cancellation test
0e46d904
njhill Merge remote-tracking branch 'origin/main' into triton-topk-topp
f7873c5f
njhill merged c656ba3b into main 46 days ago
grimulkan
