Enabled Qwen2-MoE Tensor Parallelism (TP) inference #6551
Enabled Qwen2-MoE Tensor Parallelism (TP) inference
08f728d3
Merge branch 'master' into qwen2-moe
7cff123c
Changed linear filter of qwen2-moe from _replace_module() to _replace…
97f22ff2
Added Qwen2-MoE to the model list of auto_tp
deebfa0d
loadams
approved these changes
on 2024-10-08
Merge branch 'master' into qwen2-moe
932d4b2a
loadams
merged
474a3288
into master 1 year ago
Assignees
No one assigned