DeepSpeed
9a8f6a1d - Account for expert parameters when calculating the total number of parameters in the model (#3720)

Commit
2 years ago
Account for expert parameters when calculating the total number of parameters in the model (#3720)

Co-authored-by: Alex Dubrovsky <dubro@amazon.com>