vllm
1e2ce5d1 - offload prompt_embeds decode in render_prompts_async to avoid blocking (#43792)

Commit
1 day ago
offload prompt_embeds decode in render_prompts_async to avoid blocking (#43792)

Signed-off-by: Gagan Dhakrey <gagandhakrey@gmail.com>
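
The diff is not shown on this page, so the following is only a rough sketch of the general technique the commit title names: moving a blocking decode step off the asyncio event loop into a worker thread. Everything in it is an assumption made for illustration. `decode_prompt_embeds`, `render_prompt`, and `render_prompts` are hypothetical names, and the base64/float32 payload format is invented; none of it is taken from vLLM's actual code.

```python
import asyncio
import base64
import struct


def decode_prompt_embeds(payload: str) -> list[float]:
    """Hypothetical stand-in for a blocking, CPU-bound decode step.

    Assumes (for illustration only) that the payload is base64-encoded
    little-endian float32 values. This is not vLLM's real decoder.
    """
    raw = base64.b64decode(payload)
    count = len(raw) // 4
    return list(struct.unpack(f"<{count}f", raw[: count * 4]))


async def render_prompt(payload: str) -> list[float]:
    # Run the blocking decode in a worker thread so the event loop
    # remains free to service other requests in the meantime.
    return await asyncio.to_thread(decode_prompt_embeds, payload)


async def render_prompts(payloads: list[str]) -> list[list[float]]:
    # Decode all payloads concurrently; results keep the input order.
    return await asyncio.gather(*(render_prompt(p) for p in payloads))
```

Calling the synchronous decoder directly inside a coroutine would stall every other task until it returned. `asyncio.to_thread` (Python 3.9+) hands the call to the default thread pool instead, which helps most when the decode releases the GIL or is dominated by I/O.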