Fix an issue with piping attn_logits_soft_cap through in vLLM. #8600
Commits:
- 2b980606  Pipes attn_logits_soft_cap through multi_queries_paged_attention
- 8106ad2d  Implements attn_logits_soft_cap and pass it through multi_queries_pag…
- 8802322e  Implements attn_logits_soft_cap and pass it through multi_queries_pag…
- 9e57ad42  Implements attn_logits_soft_cap and pass it through multi_queries_pag…
- 18358764  Implements attn_logits_soft_cap and pass it through multi_queries_pag…
- 68cd431c  Implements attn_logits_soft_cap and pass it through multi_queries_pag…
- 491dbdb1  Implements attn_logits_soft_cap and pass it through multi_queries_pag…
- b8660fed  Implements attn_logits_soft_cap and pass it through multi_queries_pag…
- 351de895  Implements attn_logits_soft_cap and pass it through multi_queries_pag…
- 2ce9e2fa  Implements attn_logits_soft_cap and pass it through multi_queries_pag…
- 19cf3a0f  Implements attn_logits_soft_cap and pass it through multi_queries_pag…
- 172f9cdd  Implements attn_logits_soft_cap and pass it through multi_queries_pag…
- 633792cf  Fix the signature of paged_attention by marking attn_logits_soft_cap …
- 28df218e  Merge branch 'pytorch:master' into master
fenghuizhang marked this pull request as ready for review 327 days ago.
lsy323 approved these changes on 2025-01-22.
lsy323 merged commit 5b877beb into master 327 days ago.