DeepSpeed 72608904 - reduce cpu host overhead when using moe (#5578)
Commit
1 year ago
reduce cpu host overhead when using moe (#5578)

The `.to('cpu')` operation is not necessary for exp_counts, and it causes a device-to-host synchronization, which hurts performance.

Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
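To illustrate the point of the commit message, here is a minimal PyTorch sketch, not the actual DeepSpeed code. The helper name `count_tokens_per_expert`, its `keep_on_device` flag, and the sample routing indices are all hypothetical; only the name `exp_counts` comes from the commit message. It assumes per-expert token counts are computed from a one-hot routing mask, and contrasts leaving the result on the device with copying it to the host.

```python
import torch
import torch.nn.functional as F


def count_tokens_per_expert(expert_indices: torch.Tensor,
                            num_experts: int,
                            keep_on_device: bool = True) -> torch.Tensor:
    """Hypothetical helper: count how many tokens were routed to each expert."""
    # One-hot mask of shape [num_tokens, num_experts]; summing over tokens
    # gives one count per expert.
    mask = F.one_hot(expert_indices, num_classes=num_experts)
    exp_counts = torch.sum(mask, dim=0).detach()
    if keep_on_device:
        # The counts stay on the same device as the input, so no
        # device-to-host copy is issued here.
        return exp_counts
    # Copying to the CPU makes the host wait for the queued device work
    # that produces exp_counts before the copy can complete.
    return exp_counts.to('cpu')


# Falls back to the CPU when no GPU is present, so the sketch still runs.
device = "cuda" if torch.cuda.is_available() else "cpu"
expert_indices = torch.tensor([0, 2, 2, 1, 0, 2], device=device)
counts = count_tokens_per_expert(expert_indices, num_experts=4)
```

With `keep_on_device=True` the host can keep queueing work while the counts are computed; any later consumer that really needs host values can copy them at a point where waiting is acceptable.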
References
#5578 - reduce cpu host overhead when using moe
Author
ranzhejiang
Parents
8b191d7c