Megatron-DeepSpeed
[debug] ModelInspector
#155
Open
jaketae wants to merge 25 commits into main from log-grad-norm
feature: add debug grad norm cls
222bb0b3
jaketae marked this pull request as draft 4 years ago
jaketae assigned jaketae 4 years ago
jaketae changed the title from "Gradient norm logger tool" to "Gradient norm logger" 4 years ago
fix: use named_modules, add min max logging
eabb8c5a
fix: grad_outputs instead of inputs, iterate over tuple
4f42e591
fix: check if `grad_output` is `None`
dd3bef26
fix: clone input for backward hook
daa2b630
jaketae marked this pull request as ready for review 4 years ago
jaketae requested a review from stas00 4 years ago
make it work with any torch version
f301ffd9
docs: add comment on why `.clone()` was added
1cce6594
chore: clean up print format
934c0b30
experimental forward tb logger
e4445dab
Merge remote-tracking branch 'origin/main' into log-grad-norm
ab0f78ed
rework to integrate real iterations
6b11ad70
stas00 removed review request from stas00 4 years ago
reworking the state dumper
5edefea4
update
11ef669e
Merge remote-tracking branch 'origin/main' into log-grad-norm
f1c9ff89
Merge remote-tracking branch 'origin/main' into log-grad-norm
98e153f3
Merge remote-tracking branch 'origin/main' into log-grad-norm
cc6d66a8
Merge remote-tracking branch 'origin/main' into log-grad-norm
114c473f
Merge remote-tracking branch 'origin/main' into log-grad-norm
2c9c6655
switch to pp_rank as a global unique identifier
0aeab4f8
skip params on bwd; add argmin/argmax
349c45c7
rename to an easier to remember ModuleInspector
5aa8c248
rename to an easier to remember ModelInspector
f135a172
stas00 changed the title from "Gradient norm logger" to "[debug] ModelInspector" 4 years ago
Merge remote-tracking branch 'origin/main' into log-grad-norm
3dd99121
Merge remote-tracking branch 'origin/main' into log-grad-norm
20c9bfb3
Merge remote-tracking branch 'origin/main' into log-grad-norm
f689231c
Reviewers: No reviews
Assignees: jaketae
Labels: None yet
Milestone: No milestone