Megatron-DeepSpeed
Compute model param count once
#204
Open

jaketae wants to merge 6 commits into main from rm-duplicate-param-count
jaketae refactor: compute model param count once
544108dc
jaketae marked this pull request as ready for review 4 years ago
jaketae requested a review from stas00 4 years ago
stas00 commented on 2021-11-24
stas00 requested a review from TevenLeScao 4 years ago
stas00
jaketae Update megatron/training.py
816b8670
jaketae commented on 2021-11-25
stas00 Update megatron/training.py
c2d63903
TevenLeScao
stas00
stas00
jaketae
stas00
jaketae
stas00
jaketae fix: use deepspeed param count method
f4c7c67e
jaketae
jaketae refactor: replace filter w/ list comp, generator to list
a7b10b7c
jaketae refactor: use set for constant time lookup
ac3e138b
stas00

