bigscience-workshop/Megatron-DeepSpeed · Commits
Branches:
LS/alibi
LS/doc
Lucile/add-eval-only-arg
Lucile/delete_unnecessary_brackets
Lucile/useless-parenthesis
add-valid-data
bitfit
bloom-ds-inference-repos2
bloom-inference-meta
bnb-resume-2x
bseval_harness
cc-concurrency
chpt-conversion-fix
ckptavg
cluster_benchmark
consumed_samples_per_valid_dataset
cyclic_valid_dataloaders
debug_with_new_dataset
dependabot/pip/black-24.3.0
ds_ckpt_reshape-with-layer-norm-auto-sync
ds-version-check
fix-sample-ids
fp32-checkpoint-extraction
gpu-direct
hadyelsahar/main
launch-debug
license
log-grad-norm
lumi_eval
lumi_mtf
main
master
megatron-2.4-ds-pipe
mtf_p3
mtf-multival
new-dataset
no-shuffling-option
nozero_reshape
olruwase/ds_ckpt_reshape
olruwase/sync_layer_norms
prefixbseval
preprocess_from_HF_dataset
rm-duplicate-param-count
samson/spm
scratchpad
self_attention_stable_corby
skip-broken-tests
sync4
t0loading
test-conversion
thomas/add_shared_t5
thomas/evaluate_gpt_on_prefix_lm_loss
thomas/evaluate_gpt_speed_if_we_pass_attention_mask
thomas/fix_installation
thomas/fix_layer_norm
thomas/improve_test_to_test_custom_kernel
thomas/mlm_train_script
thomas/opt
thomas/test_different_layer_norm
tp-ln-debug
tr1-13B
tr8-104B
train-no-eval-restart
training_flos_rebase
training_flos
universal_ckpt_info
universal_to_fp32_checkpoint
val_args
Commits:
Better · thomasw21 committed 3 years ago · 07ccb3db
Backward compatibility on torch · thomasw21 committed 3 years ago · f0d6d179
Woops · thomasw21 committed 3 years ago · 0cf35ee3
Add embed layer norm · thomasw21 committed 3 years ago · 2389bfdf
Woops · thomasw21 committed 3 years ago · 8ffb278f
Woops · thomasw21 committed 3 years ago · 39d4b8f9
Woops · thomasw21 committed 3 years ago · 311e5317
I think bias starts to diverge first · thomasw21 committed 3 years ago · 7145f6df
Try to figure out how the divergence happens · thomasw21 committed 3 years ago · a5e32958
Remove assert · thomasw21 committed 3 years ago · 6b19339c
Woops · thomasw21 committed 3 years ago · a0c09132
Make test to work with both bf16 and fp16 to see who fails · thomasw21 committed 3 years ago · 5fbe1072
Woops · thomasw21 committed 3 years ago · a4172bf9
Launch bf16 · thomasw21 committed 3 years ago · 7f2441ed
Have high LR to see weights actually change · thomasw21 committed 3 years ago · c20c8ba4
Huh · thomasw21 committed 3 years ago · 42d6b4e3
Still trying to reproduce · thomasw21 committed 3 years ago · 02365d14
Test with alibi · thomasw21 committed 3 years ago · ce02dd16
Woops · thomasw21 committed 3 years ago · f152e487
Woops · thomasw21 committed 3 years ago · 1f2f8007
Woops · thomasw21 committed 3 years ago · 7fcff06b
WIP · thomasw21 committed 3 years ago · 29372806
Woops · thomasw21 committed 3 years ago · 1cdcd7de
Wip · thomasw21 committed 3 years ago · 240f673e
WIP · thomasw21 committed 3 years ago · 8d7a6038
[tensorboard] add rename and remove event tools (#269) · stas00 committed 3 years ago · Verified · affff3d2
[kill switch] fix test (#268) · stas00 committed 3 years ago · Verified · 26feccc1
disable samples-per-dataset, steps-per-dataset, tokens-per-dataset (#267) · stas00 committed 3 years ago · Verified · de3a0277
[kill switch] correct sys.exit (#266) · stas00 committed 3 years ago · Verified · 1893811e
Sorry, last change was meant to a PR. This reverts commit d0fcf4170def7205426117016d4622c745f33883. · TevenLeScao committed 3 years ago · 497aa1bf