DeepSpeed
Tensor parallelism for Mixture of Experts
#2074
Merged
Commits (55)
add tensor parallelism support for non-expert groups
siddharth9820
committed
3 years ago
non-expert tensor parallelism - drop tokens before a2a
siddharth9820
committed
3 years ago
support tensor parallelism for non-experts
siddharth9820
committed
3 years ago
fix formatting
siddharth9820
committed
3 years ago
Merge branch 'master' into moe-tensor-parallelism
siddharth9820
committed
3 years ago
Merge branch 'master' into moe-tensor-parallelism
siddharth9820
committed
3 years ago
Merge branch 'master' into moe-tensor-parallelism
siddharth9820
committed
3 years ago
migrate code for dropping tokens from megatron
siddharth9820
committed
3 years ago
change gather function name
siddharth9820
committed
3 years ago
fall back to previous error message
siddharth9820
committed
3 years ago
Merge branch 'master' into moe-tensor-parallelism
siddharth9820
committed
3 years ago
Merge branch 'master' into moe-tensor-parallelism
siddharth9820
committed
3 years ago
formatting changes
siddharth9820
committed
3 years ago
change function names
siddharth9820
committed
3 years ago
Merge branch 'master' into moe-tensor-parallelism
siddharth9820
committed
3 years ago
Merge branch 'master' into moe-tensor-parallelism
siddharth9820
committed
3 years ago
Merge branch 'master' into moe-tensor-parallelism
siddharth9820
committed
3 years ago
Merge branch 'master' into moe-tensor-parallelism
tjruwase
committed
3 years ago
Merge branch 'master' into moe-tensor-parallelism
siddharth9820
committed
3 years ago
fix number of local experts
siddharth9820
committed
3 years ago
Merge branch 'moe-tensor-parallelism' of github.com:microsoft/DeepSpeed into moe-tensor-parallelism
siddharth9820
committed
3 years ago
fix documentation
siddharth9820
committed
3 years ago
correct log statement
siddharth9820
committed
3 years ago
Merge branch 'master' into moe-tensor-parallelism
siddharth9820
committed
3 years ago
roll back ep-size setting code
siddharth9820
committed
3 years ago
Merge branch 'moe-tensor-parallelism' of github.com:microsoft/DeepSpeed into moe-tensor-parallelism
siddharth9820
committed
3 years ago
add detailed comments
siddharth9820
committed
3 years ago
restore function in groupy.py
siddharth9820
committed
3 years ago
better comments
siddharth9820
committed
3 years ago
Merge branch 'master' into moe-tensor-parallelism
siddharth9820
committed
3 years ago
Merge branch 'master' into moe-tensor-parallelism
siddharth9820
committed
3 years ago
remove code that changes ep_size and convert it to asserts
siddharth9820
committed
3 years ago
correct groups
siddharth9820
committed
3 years ago
Merge branch 'master' into moe-tensor-parallelism
siddharth9820
committed
3 years ago
correction
siddharth9820
committed
3 years ago
add copyright
siddharth9820
committed
3 years ago
correction
siddharth9820
committed
3 years ago
Merge branch 'master' into moe-tensor-parallelism
siddharth9820
committed
3 years ago
Merge branch 'master' into moe-tensor-parallelism
siddharth9820
committed
3 years ago
Merge branch 'master' into moe-tensor-parallelism
siddharth9820
committed
3 years ago
Merge branch 'master' into moe-tensor-parallelism
awan-10
committed
3 years ago
Merge branch 'master' into moe-tensor-parallelism
siddharth9820
committed
3 years ago
formatting changes
siddharth9820
committed
3 years ago
Merge branch 'master' into moe-tensor-parallelism
siddharth9820
committed
3 years ago
Merge branch 'master' into moe-tensor-parallelism
siddharth9820
committed
3 years ago
Merge branch 'master' into moe-tensor-parallelism
tjruwase
committed
3 years ago
Merge branch 'master' into moe-tensor-parallelism
siddharth9820
committed
3 years ago
Merge branch 'master' into moe-tensor-parallelism
tjruwase
committed
3 years ago
Merge branch 'master' into moe-tensor-parallelism
siddharth9820
committed
3 years ago
add unit tests
siddharth9820
committed
3 years ago
small change
siddharth9820
committed
3 years ago
Merge branch 'master' into moe-tensor-parallelism
siddharth9820
committed
3 years ago
remove amp from tests
siddharth9820
committed
3 years ago
Merge branch 'moe-tensor-parallelism' of github.com:microsoft/DeepSpeed into moe-tensor-parallelism
siddharth9820
committed
3 years ago
Merge branch 'master' into moe-tensor-parallelism
tjruwase
committed
3 years ago