DeepSpeed
1b08325d
- [TiledMLP] moe support (#7622)
Commit
80 days ago
[TiledMLP] moe support (#7622)

MoE routers seem to drop the `bs` dimension in `x`, so the `[bs, seqlen, hidden_size]` shape can no longer be assumed. Support that use case.

Signed-off-by: Stas Bekman <stas@stason.org>
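
To illustrate the shape issue the commit message describes, here is a minimal sketch of tiling that works for both a `[bs, seqlen, hidden_size]` input and an input with the `bs` dimension dropped. It is not the code from this commit: `plan_tiles` and its parameters are hypothetical names, and the sketch assumes that tiling can operate over a flattened token dimension (every dimension except the last).

```python
import math


def plan_tiles(shape, num_shards):
    """Return (start, end) ranges over the flattened token dimension.

    Hypothetical helper, not part of DeepSpeed. Accepts either
    [bs, seqlen, hidden_size] or [tokens, hidden_size]; the last
    dimension is treated as hidden_size and is never split.
    """
    if len(shape) not in (2, 3):
        raise ValueError(f"expected a 2D or 3D shape, got {shape}")
    if num_shards < 1:
        raise ValueError("num_shards must be at least 1")

    # Collapse every leading dimension into one token count, so the
    # presence or absence of `bs` no longer matters.
    num_tokens = math.prod(shape[:-1])
    if num_tokens == 0:
        return []

    # Ceiling division: the final tile may be shorter than the others.
    tile = math.ceil(num_tokens / num_shards)
    return [(start, min(start + tile, num_tokens))
            for start in range(0, num_tokens, tile)]


# 3D input: bs=2, seqlen=8 gives 16 tokens split into 4 tiles.
tiles_3d = plan_tiles((2, 8, 16), num_shards=4)
# 2D input, as a MoE router might pass it: 10 tokens split into 4 tiles.
tiles_2d = plan_tiles((10, 16), num_shards=4)
```

Each range would then index rows of `x` reshaped to `[num_tokens, hidden_size]`, so the same tiling loop can serve both callers.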
References
#7622 - [TiledMLP] moe support
Author
stas00
Parents
1ae1cdd8