Fix an issue with piping attn_logits_soft_cap through in vLLM. #8600
Commits:
- 2b980606  Pipes attn_logits_soft_cap through multi_queries_paged_attention
- 8106ad2d  Implements attn_logits_soft_cap and pass it through multi_queries_pag…
- 8802322e  Implements attn_logits_soft_cap and pass it through multi_queries_pag…
- 9e57ad42  Implements attn_logits_soft_cap and pass it through multi_queries_pag…
- 18358764  Implements attn_logits_soft_cap and pass it through multi_queries_pag…
- 68cd431c  Implements attn_logits_soft_cap and pass it through multi_queries_pag…
- 491dbdb1  Implements attn_logits_soft_cap and pass it through multi_queries_pag…
- b8660fed  Implements attn_logits_soft_cap and pass it through multi_queries_pag…
- 351de895  Implements attn_logits_soft_cap and pass it through multi_queries_pag…
- 2ce9e2fa  Implements attn_logits_soft_cap and pass it through multi_queries_pag…
- 19cf3a0f  Implements attn_logits_soft_cap and pass it through multi_queries_pag…
- 172f9cdd  Implements attn_logits_soft_cap and pass it through multi_queries_pag…
- 633792cf  Fix the signature of paged_attention by marking attn_logits_soft_cap …
- 28df218e  Merge branch 'pytorch:master' into master
fenghuizhang marked this pull request as ready for review 327 days ago.
lsy323 approved these changes on 2025-01-22.
lsy323 merged commit 5b877beb into master 327 days ago.