[`MPT`] Add MosaicML's `MPT` model to transformers #24629
draft add new model like
98e49f6c
Merge branch 'main' of https://github.com/huggingface/transformers in…
6f621734
some cleaning of the config
f1ec5c12
nits
02c95d2d
Merge branch 'main' of https://github.com/huggingface/transformers in…
9130b093
add nested configs
5012bdde
nits
8966585d
update
28f9a289
update
66a84cf4
added layer norms + triton kernels
74c9e9a0
consider only LPLayerNorm for now.
8629ec8e
update
3f407bb1
all keys match.
83ed0a84
Update
2fa38b7a
Merge branch 'add-mpt' of https://github.com/ArthurZucker/transformer…
a7dbe54d
fixing nits here and there
e4929ede
working forward pass.
c406e541
removed einops dependency
2b26633e
nits
4ca77bda
Merge branch 'main' into add-mpt
33e68150
format
e8885908
add alibi
ac33351a
byebye head mask
15acc436
refactor attention
a4529049
Merge branch 'add-mpt' of https://github.com/ArthurZucker/transformer…
d0f558e3
nits.
63b513d8
format
ff7c1264
fix nits.
bcd989cd
nuke and updates
cf40fc04
nuke tokenizer test
5579ae27
don't reshape query with kv heads
85c7fdff
added a bit of documentation.
916e1f2a
remove unneeded things
4cc6748e
nuke more stuff
b484c7e7
Merge branch 'add-mpt' of https://github.com/ArthurZucker/transformer…
50f37293
nit
72136e06
logits match - same generations
568586d6
rm unneeded methods
72a610c8
1 remaining failing CI test
a0fd1c41
nit
32889035
fix nits
bae13683
fix docs
9583f4b1
fix docs
7b1f5f3c
rm tokenizer
4a464629
Merge remote-tracking branch 'upstream/main' into HEAD
a4eac7a8
fixup
8156818e
fixup
c9ed4673
fixup and fix tests
27fc9990
Merge branch 'main' into add-mpt
dab330cd
fixed configuration object.
08cdae10
use correct activation
2df51fde
few minor fixes
0206c7b0
clarify docs a bit
ed8975a9
logits match to 1e-12
35a0e195
skip and unskip a test
34dc6767
added some slow tests.
609ce142
fix readme
c3ee5452
add more details
cf107704
Update docs/source/en/model_doc/mpt.md
737bbd39
Apply suggestions from code review
93b47b05
fix configuration issues
58a6aa84
more fixes in config
ae29aaf7
added more models
732c4564
Apply suggestions from code review
a432f2d2
remove unneeded position ids
31647d07
fix some comments
296462d0
Apply suggestions from code review
f124dc02
revert suggestion
32d49dda
mpt alibi + added batched generation
d856abbf
Update src/transformers/models/mpt/__init__.py
b73b4771
remove init config
79d895ff
Update src/transformers/models/mpt/configuration_mpt.py
ee7e9c5f
fix nit
0893a904
Merge branch 'add-mpt' of https://github.com/ArthurZucker/transformer…
71f0de0e
add another slow test
a42e08d2
Merge remote-tracking branch 'upstream/main' into HEAD
b698d14e
younesbelkada marked this pull request as ready for review 2 years ago
sgugger approved these changes on 2023-07-20
Apply suggestions from code review
21d7819f
fits in one line
486b7f5e
some refactor because make fixup doesn't pass
d1c2a61b
add ft notebook
575987ad
Merge branch 'main' of https://github.com/huggingface/transformers in…
cb714afe
update md
1f8913aa
correct doc path
31de6917