fix(granitemoe*): Only create block_sparse_moe if num_local_experts > 0 (#42036)
* fix(granitemoehybrid): Only set self.block_sparse_moe if num_local_experts > 0
Branch: GraniteMoeAsDenseFix
Signed-off-by: Gabe Goodhart <ghart@us.ibm.com>
* fix(granitemoehybrid): Regenerate modeling_granitemoehybrid.py
Branch: GraniteMoeAsDenseFix
Signed-off-by: Gabe Goodhart <ghart@us.ibm.com>
* style: Fix import order
Branch: GraniteMoeAsDenseFix
Signed-off-by: Gabe Goodhart <ghart@us.ibm.com>
* make fix-copies
---------
Signed-off-by: Gabe Goodhart <ghart@us.ibm.com>
Co-authored-by: Matt <rocketknight1@gmail.com>
Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>