transformers
Add `LongT5` model
#16792
Merged

stancld
HuggingFaceDocBuilderDev
stancld changed the title from "[WIP] Add `LongT5` model" to "Add `LongT5` model" 3 years ago
stancld
stancld marked this pull request as ready for review 3 years ago
stancld
stancld commented on 2022-04-19
patrickvonplaten
PhungVanDuy
PhungVanDuy
stancld
patil-suraj
patil-suraj commented on 2022-04-21
marksverdhei
stancld
PhungVanDuy
patrickvonplaten
patrickvonplaten commented on 2022-04-22
stancld
sgugger
sgugger approved these changes on 2022-04-22
PhungVanDuy
stancld force pushed from b025187a to 785fb064 3 years ago
stancld
PhungVanDuy
stancld
ibulu
stancld
stancld commented on 2022-04-25
patrickvonplaten
calderma
patrickvonplaten
patil-suraj
patil-suraj commented on 2022-05-17
stancld
patil-suraj
patil-suraj
stancld Initial commit
e4078c29
stancld Make some fixes
52ef7a96
stancld Make PT model full forward pass
87840bdb
stancld Drop TF & Flax implementation, fix copies etc
c5346957
stancld Add Flax model and update some corresponding stuff
478505d4
stancld Drop some TF things
978a5164
stancld Update config and flax local attn
37c44945
stancld Add encoder_attention_type to config
79669f08
stancld .
96bfb6b0
stancld Update docs
7e38092a
stancld Do some cleansing
93378287
stancld Fix some issues -> make style; add some docs
d0d4043c
stancld Fix position_bias + mask addition + Update tests
23f115b2
stancld Fix repo consistency
a407e846
stancld Fix model consistency by removing flax operation over attn_mask
a8c7940d
stancld [WIP] Add PT TGlobal LongT5
48b85cf2
stancld .
592590cd
stancld [WIP] Add flax tglobal model
7b8332c7
stancld [WIP] Update flax model to use the right attention type in the encoder
7c1f3786
stancld Fix flax tglobal model forward pass
dec32c6f
stancld Make the use of global_relative_attention_bias
9dc07a1a
stancld Add test suites for TGlobal model
fddb3268
stancld Fix minor bugs, clean code
7488044e
stancld Fix pt-flax equivalence though not convinced with correctness
e991707f
stancld Fix LocalAttn implementation to match the original impl. + update REA…
efd24451
stancld Few updates
619595ba
stancld Update: [Flax] improve large model init and loading #16148
47dc3906
stancld Add ckpt conversion script accoring to #16853 + handle torch device p…
6ba02815
stancld Minor updates to conversion script.
c430df4d
stancld Typo: AutoModelForSeq2SeqLM -> FlaxAutoModelForSeq2SeqLM
4598a28e
PhungVanDuy gpu support + dtype fix
93d3982f
stancld Apply some suggestions from code review
82a99c28
stancld * Remove (de)parallelize stuff
8b98746a
stancld Remove caching logic for local & tglobal attention
877c51ca
stancld Apply another batch of suggestions from code review
7cd41051
stancld Fix converting script + revert config file change
d19c593f
stancld Revert "Remove caching logic for local & tglobal attention"
eff2511d
stancld Stash caching logic in Flax model
d41e10ae
stancld Move test files to the proper place
e0b3e7b5
patil-suraj fix _make_global_fixed_block_ids and masked neg value
f33298b3
patil-suraj update flax model
ee2e08ea
patil-suraj style and quality
e29b7b6f
patil-suraj force pushed to e29b7b6f 3 years ago
patil-suraj fix imports
b2f6c809
patil-suraj remove load_tf_weights_in_longt5 from init and fix copies
05b15968
patil-suraj add slow test for TGlobal model
e9696dd3
patil-suraj typo fix
ca92e712
patil-suraj
patil-suraj commented on 2022-05-30
patrickvonplaten
patrickvonplaten
patrickvonplaten commented on 2022-05-30
stancld Merge branch 'main' into new_model/LongT5
085da427
stancld Drop obsolete is_parallelizable and one warning
70276d96
stancld Update __init__ files to fix repo-consistency
6a903e32
patrickvonplaten
patil-suraj
patil-suraj fix pipeline test
b7c68d09
patil-suraj
patil-suraj commented on 2022-06-03
stancld Fix some device placements
90857ce0
stancld Merge branch 'main' into new_model/LongT5
bdef4d87
patil-suraj
patil-suraj approved these changes on 2022-06-09
patil-suraj requested a review from patrickvonplaten 3 years ago
stancld Merge branch 'main' into new_model/LongT5
b2a6ae2a
stancld [wip]: Update tests -- need to generate summaries to update expected_…
9a043798
stancld Fix quality
ac8ac232
stancld Update LongT5 model card
a3717489
stancld Update (slow) summarization tests
7c812266
stancld make style
9a3b2818
patrickvonplaten rename checkpoitns
eb15125e
patrickvonplaten Merge branch 'main' of https://github.com/huggingface/transformers in…
1163d5d8
patrickvonplaten finish
832b3d8c
patrickvonplaten fix flax tests
7aac4313
stancld Merge branch 'main' into new_model/LongT5
b6b38bde
patrickvonplaten merged a72f1c9f into main 3 years ago

