bigscience-workshop/Megatron-DeepSpeed
Commits
rm-duplicate-param-count
refactor: use set for constant time lookup
jaketae
committed
4 years ago
ac3e138b
refactor: replace filter w/ list comp, generator to list
jaketae
committed
4 years ago
a7b10b7c
fix: use deepspeed param count method
jaketae
committed
4 years ago
f4c7c67e
Update megatron/training.py
stas00
committed
4 years ago
Verified
c2d63903
Update megatron/training.py
jaketae
committed
4 years ago
Verified
816b8670
refactor: compute model param count once
Jake Tae
committed
4 years ago
544108dc
param size printing revamp (#202)
stas00
committed
4 years ago
Verified
fd1e1da9
[WIP] dealing with multi-process noise (#193)
stas00
committed
4 years ago
Verified
5d3150c7
elastic launcher compatible init_process_group (#201)
stas00
committed
4 years ago
Verified
df5a5c4e
add missing space (#200)
SaulLu
committed
4 years ago
Verified
3487f315
[CI] fix which tests get run (#199)
stas00
committed
4 years ago
Verified
e9558293
Delete unnecessary brackets (#197)
SaulLu
committed
4 years ago
Verified
aa7fa931
Add eval-only arg (#188)
SaulLu
committed
4 years ago
Verified
05851eb5
removed regular package for megatron model (#192)
stas00
committed
4 years ago
Verified
4989bcde
add layernorm in Embedding (#191)
stas00
committed
4 years ago
Verified
b1359ee0
Support skip iteration flag (#177)
jaketae
committed
4 years ago
Verified
106a9a6f
Full seqlen eval for CL+PP (#187)
conglongli
committed
4 years ago
Verified
2105ff38
[BNB] integrate `StableEmbeding` into `VocabParallelEmbedding` logic (#182)
stas00
committed
4 years ago
Verified
a34ca7f2
[PrefixLM] Figuring out why prefix lm is doing poorly on short context (#169)
thomasw21
committed
4 years ago
Verified
6d146b5f
[CI] improvements (#185)
stas00
committed
4 years ago
Verified
590f3e27
Alternative fix to TP > 1 (#178)
thomasw21
committed
4 years ago
Verified
2d9744f2
Fixed TP > 1 issue with new validation scheme
TevenLeScao
committed
4 years ago
b982e040
simplifying tests
TevenLeScao
committed
4 years ago
b1fc4927
Fixed merge oversight in tensorboard logs
TevenLeScao
committed
4 years ago
18b704e1
Adding language specific validation sets for Multilingual model training (#97)
hadyelsahar
committed
4 years ago
Verified
846c0879
Fix prefix lm offsets (#167)
thomasw21
committed
4 years ago
Verified
5e1f2101
Update main.yml (#172)
stas00
committed
4 years ago
Verified
5df29b5d
[CI] fix ci / update packages (#170)
stas00
committed
4 years ago
Verified
b39dd1cf
[checkpoint] only one latest file (#164)
stas00
committed
4 years ago
Verified
45dcd528
Fix curriculum learning doc (#162)
conglongli
committed
4 years ago
Verified
07533d03