[Perf] add packed recurrent fast path for decode #36596
fla: add packed recurrent decode fast path
3036a745
tests: fix packed recurrent decode reference call
2e7b9407
style: ruff format
9bdba19d
Merge branch 'main' into perf/gdn-packed
cafc0329
gdn: address review feedback
297a3f8b
gdn: move decode path routing into forward core
6ba4d35d
refactor: inline baseline logic in forward core
1d4dafa7
ZJY0516
approved these changes
on 2026-03-12
Merge branch 'main' into perf/gdn-packed
d95cfdf3
ywang96
approved these changes
on 2026-03-12
Merge branch 'main' into perf/gdn-packed
e17186aa
mgoin
approved these changes
on 2026-03-12
vllm-bot
merged
9e19f833
into main 59 days ago