vllm
f7e62e3d - [Bugfix] Fix mismatch between global and local attention heads in tensor-parallel mode for param2moe model (#39707)

Commit
20 days ago
[Bugfix] Fix mismatch between global and local attention heads in tensor-parallel mode for param2moe model (#39707)

Signed-off-by: bhargav-patel-29 <bhargav.patel@tihiitb.org>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>