Megatron-DeepSpeed
Add UL2 data sampling and pretraining — PR #358 (Open)
Commits (122)
All commits below were authored by janEbert, committed 3 years ago.

- Fix `PretrainedFromHF` tokenizer with T5 training
- Allow passing existing causal attention masks
- Refactor masked LM sampling style selection
- Add more masked LM sampling styles
- Allow Prefix-LM style masked LM
- Add UL2 pretraining for T5 model
- Refactor span merging
- Support UL2 for decoder-only models
- Unconditionally use safe maximum sequence length
- Add custom exceptions
- Error out on too-long sequences
- Remove additional sequence truncation
- Prefer array-from-list creation
- Remove redundant imports
- Fix not inserting prefixes
- Do not insert `extra_id` tokens for PrefixLM task
- Document `max_seq_length_dec` argument
- Skip redundant computations
- Fix PrefixLM mean location
- Pad decoder-only inputs to same length
- Fix decoder-only attention mask shape
- Document index set selection for PrefixLM masking
- Fix `max_ngrams` for normal sampling style
- Do not limit `max_predictions_per_seq`
- Calculate and use number of filtered tokens
- Document normal sampling style
- Fix PrefixLM possible spans calculation
- Use binary search for PrefixLM first tail index
- Calculate n-gram indices lazily
- Fix code style
- Prefer list comprehensions
- Allow recognizing when UL2 is used
- Support UL2 tokens for all tokenizers
- Support `<extra_id>` tokens for GPT tokenizer
- Fix tokenizer vocab access
- Revert inheriting from `T5Dataset`
- Fix GPT tokenizer special token handling
- Do inherit from `torch.utils.data.Dataset`
- Add whitespace
- Allow selectively disabling denoiser token
- Allow not replacing masks with sentinel tokens
- Support not adding mask tokens in span corruption
- Fix expected number of added tokens
- Fix non-masked data
- Fix unclear wording
- Adjust code style
- Fix covered index skipping
- Prepend objective token before truncating
- Automatically truncate sequences for decoder-only
- Fix covered span skipping fix
- Make `build_index_mappings` public
- Refactor getting sample
- Add sample packing to T5 dataset
- Add sample packing to UL2 dataset
- Fix typo and comment placement
- Fix not supplying `--pack-samples` argument
- Add support for UL2R-style implementation
- Fix T5 dataset packing
- Refactor `get_sample` to return a list
- Fix T5 sample packing
- Fix UL2 sample packing
- Refactor samples dict creation
- Fix desired seq length
- Fix padding removal
- Allow repeating UL2 prompt token when packing
- Allow packing different denoisers together
- Refactor sample packing functions
- Repeat prompt by default when packing UL2
- Support pipelining for decoder-only model
- Fix GPT tokenizer vocab size query
- Handle possibly empty list
- Fix no newline at EOF
- Allow full prefix Prefix-LM attention sampling
- Support PrefixLM models
- Allow setting number of few-shot examples
- Update task/dataset name
- Do not remove last token
- Fix PrefixLM contexts
- Fix module refactor
- Fix possible `TypeError`
- Optionally add prefix tokens
- Automatically add UL2 tokens
- Fix context lengths batch chunking
- Allow different models to be loaded
- Fix context batch size padding
- Add xPos embeddings
- Add optional UL2 normal distribution scaling
- Allow evaluating encoder-decoder models
- Fix not passing `scale_normal_std`
- Add T5-style GLU layers
- Rename xPos embedding class
- Integrate xPos embedding
- Handle xPos embedding
- Do not use bias for 2nd MLP layer if using T5 GLU
- Fix T5 GLU constructor arguments
- Refactor samples dict creation
- Move callees under caller
- Handle empty context
- Handle more possible model types
- Fix fully truncated contexts with prefix tokens
+ more commits ...