transformers
[`MPT`] Add MosaicML's `MPT` model to transformers
#24629
Merged

ArthurZucker merged 83 commits into huggingface:main from ArthurZucker:add-mpt
ArthurZucker draft add new model like
98e49f6c
ArthurZucker Merge branch 'main' of https://github.com/huggingface/transformers in…
6f621734
ArthurZucker some cleaning of the config
f1ec5c12
ArthurZucker nits
02c95d2d
ArthurZucker Merge branch 'main' of https://github.com/huggingface/transformers in…
9130b093
ArthurZucker add nested configs
5012bdde
ArthurZucker nits
8966585d
ArthurZucker update
28f9a289
ArthurZucker update
66a84cf4
younesbelkada added layer norms + triton kernels
74c9e9a0
younesbelkada consider only LPLayerNorm for now.
8629ec8e
ArthurZucker update
3f407bb1
younesbelkada all keys match.
83ed0a84
ArthurZucker Update
2fa38b7a
ArthurZucker Merge branch 'add-mpt' of https://github.com/ArthurZucker/transformer…
a7dbe54d
younesbelkada fixing nits here and there
e4929ede
younesbelkada working forward pass.
c406e541
younesbelkada removed einops dependency
2b26633e
ArthurZucker commented on 2023-07-10
younesbelkada nits
4ca77bda
younesbelkada Merge branch 'main' into add-mpt
33e68150
younesbelkada format
e8885908
younesbelkada add alibi
ac33351a
younesbelkada byebye head mask
15acc436
ArthurZucker refactor attention
a4529049
ArthurZucker Merge branch 'add-mpt' of https://github.com/ArthurZucker/transformer…
d0f558e3
younesbelkada nits.
63b513d8
younesbelkada format
ff7c1264
ArthurZucker commented on 2023-07-12
younesbelkada fix nits.
bcd989cd
ArthurZucker nuke and updates
cf40fc04
ArthurZucker nuke tokenizer test
5579ae27
ArthurZucker don't reshape query with kv heads
85c7fdff
younesbelkada added a bit of documentation.
916e1f2a
younesbelkada remove unneeded things
4cc6748e
ArthurZucker nuke more stuff
b484c7e7
ArthurZucker Merge branch 'add-mpt' of https://github.com/ArthurZucker/transformer…
50f37293
younesbelkada nit
72136e06
younesbelkada logits match - same generations
568586d6
younesbelkada rm unneeded methods
72a610c8
younesbelkada 1 remaining failing CI test
a0fd1c41
younesbelkada nit
32889035
younesbelkada fix nits
bae13683
younesbelkada fix docs
9583f4b1
younesbelkada fix docs
7b1f5f3c
younesbelkada rm tokenizer
4a464629
younesbelkada Merge remote-tracking branch 'upstream/main' into HEAD
a4eac7a8
younesbelkada fixup
8156818e
younesbelkada fixup
c9ed4673
HuggingFaceDocBuilderDev
younesbelkada fixup and fix tests
27fc9990
younesbelkada Merge branch 'main' into add-mpt
dab330cd
younesbelkada fixed configuration object.
08cdae10
younesbelkada use correct activation
2df51fde
younesbelkada few minor fixes
0206c7b0
younesbelkada clarify docs a bit
ed8975a9
younesbelkada logits match to 1e-12
35a0e195
younesbelkada skip and unskip a test
34dc6767
younesbelkada added some slow tests.
609ce142
ArthurZucker commented on 2023-07-20
younesbelkada fix readme
c3ee5452
younesbelkada add more details
cf107704
younesbelkada Update docs/source/en/model_doc/mpt.md
737bbd39
younesbelkada Apply suggestions from code review
93b47b05
younesbelkada fix configuration issues
58a6aa84
younesbelkada more fixes in config
ae29aaf7
younesbelkada added more models
732c4564
younesbelkada Apply suggestions from code review
a432f2d2
younesbelkada remove unneeded position ids
31647d07
younesbelkada fix some comments
296462d0
younesbelkada Apply suggestions from code review
f124dc02
younesbelkada revert suggestion
32d49dda
younesbelkada mpt alibi + added batched generation
d856abbf
younesbelkada Update src/transformers/models/mpt/__init__.py
b73b4771
huggingface deleted a comment from younesbelkada on 2023-07-20
ArthurZucker commented on 2023-07-20
younesbelkada remove init config
79d895ff
younesbelkada Update src/transformers/models/mpt/configuration_mpt.py
ee7e9c5f
younesbelkada fix nit
0893a904
younesbelkada Merge branch 'add-mpt' of https://github.com/ArthurZucker/transformer…
71f0de0e
younesbelkada add another slow test
a42e08d2
younesbelkada Merge remote-tracking branch 'upstream/main' into HEAD
b698d14e
younesbelkada marked this pull request as ready for review 2 years ago
younesbelkada requested a review from sgugger 2 years ago
younesbelkada
Narsil
sgugger approved these changes on 2023-07-20
younesbelkada Apply suggestions from code review
21d7819f
younesbelkada fits in one line
486b7f5e
younesbelkada some refactor because make fixup doesn't pass
d1c2a61b
younesbelkada add ft notebook
575987ad
ArthurZucker Merge branch 'main' of https://github.com/huggingface/transformers in…
cb714afe
ArthurZucker update md
1f8913aa
ArthurZucker correct doc path
31de6917
ArthurZucker merged dcb183f4 into main 2 years ago