[MISC] Fix Tensor Parallelism for Quantized Mamba Models with n_groups=1 #33257
fix tp>1 for quantized mamba models
88743733
cursor commented on 2026-01-28
fix
b3878efe
Unify MambaMixer2 TP sharding to use custom weight loader
d5d6d0b8
vadiklyutiy deleted the vadim/fix-falcon-fp8-tp branch 78 days ago