vllm
1e2ce5d1
- offload prompt_embeds decode in render_prompts_async to avoid blocking (#43792)
Commit
1 day ago
offload prompt_embeds decode in render_prompts_async to avoid blocking (#43792)

Signed-off-by: Gagan Dhakrey <gagandhakrey@gmail.com>
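
The general pattern the commit title describes, moving a blocking decode step off the event loop, can be sketched as follows. This is a minimal illustration and not the code from this commit: `decode_prompt_embeds` and `render_prompts_offloaded` are hypothetical names, and base64 decoding stands in for whatever decoding vLLM actually performs on `prompt_embeds`.

```python
import asyncio
import base64


def decode_prompt_embeds(payload: str) -> bytes:
    # Hypothetical stand-in for a blocking decode step (assumption:
    # the real decode is synchronous and slow enough to stall the loop).
    return base64.b64decode(payload)


async def render_prompts_offloaded(payloads: list[str]) -> list[bytes]:
    # Hypothetical async caller. Each decode runs in the default thread
    # pool, so the event loop is free to schedule other coroutines
    # while the decodes are in progress.
    tasks = [asyncio.to_thread(decode_prompt_embeds, p) for p in payloads]
    # gather() returns results in the same order as the inputs.
    return list(await asyncio.gather(*tasks))


if __name__ == "__main__":
    encoded = [base64.b64encode(b"abc").decode(), base64.b64encode(b"xyz").decode()]
    print(asyncio.run(render_prompts_offloaded(encoded)))
```

`asyncio.to_thread` requires Python 3.9 or later; on older versions, `loop.run_in_executor(None, ...)` serves the same purpose.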
References
#43792 - offload prompt_embeds decode in render_prompts_async to avoid blocking
Author
gagandhakrey
Parents
559d6710