Single commit
adding template
update model
model update
update conf for debug model
update conversion
update conversion script
update conversion script
fix missing keys check
add tests for the tokenizer on the local machine
Change variable name
add tests on xnli dataset
add more description
add descriptions + clearer code
clearer code
adding new tests + skipping a few tests because of env problems
change comment
add dtype on the configuration
add test embeddings
add hardcoded test
fix dtype issue
adding torch.float16 to config
adding more metrics (min, max, mean)
add sum
now the test passes with an almost-equal comparison
add files for conversion - test passes on CPU and GPU
add final changes
cleaning code
add new args in the docstring
fix one liner function
remove macros
remove forward attention
clean up init function
add comments on the issue
rm scale mask softmax
do make style
fix dtype in init
fixing for loop on att probs
fix style with black
fix style + doc error
fix and debug CI errors (docs + style)
some updates
- change new operations
- finally add scaled softmax
- added new args in the config
make use cache work
add changes
- save sharded models
- final changes on the modeling script
add changes
- comment on alibi
- add TODO on seq length
test commit
- added a text to test the commit
Co-authored-by: thomasw21 <24695242+thomasw21@users.noreply.github.com>
final changes
- attention mask change
- generation works on BS176b
Co-authored-by: thomasw21 <24695242+thomasw21@users.noreply.github.com>
changes - model + conversion
move to correct dir