transformers
Terminator strings for generate()
#28932
Merged

Rocketknight1 merged 68 commits into main from terminator_strings_for_generate
Rocketknight1
Rocketknight1 requested a review from gante 2 years ago
Rocketknight1 force pushed to 40e4abee 2 years ago
HuggingFaceDocBuilderDev
zucchini-nlp
zucchini-nlp
zucchini-nlp commented on 2024-02-12
Rocketknight1 force pushed from c26d4197 to 6a92a310 2 years ago
Rocketknight1 requested a review from amyeroberts 2 years ago
Rocketknight1 force pushed to 254fa2db 2 years ago
Rocketknight1
amyeroberts
gante
gante commented on 2024-02-14
Rocketknight1
amyeroberts
Rocketknight1
Rocketknight1 force pushed to 455259e1 2 years ago
Rocketknight1 force pushed from cb74b51d to ba8c7d19 2 years ago
Rocketknight1
Rocketknight1
amyeroberts
Rocketknight1
Rocketknight1 force pushed to 25ef2989 2 years ago
Rocketknight1
Rocketknight1 force pushed from 25ef2989 to 8c23e391 2 years ago
Rocketknight1
zucchini-nlp
Rocketknight1
Rocketknight1
gante
gante approved these changes on 2024-02-28
amyeroberts
amyeroberts commented on 2024-02-16
Rocketknight1 force pushed from 0a108176 to e9519b10 2 years ago
Rocketknight1
Rocketknight1 force pushed from 9e60f83f to be68dcac 1 year ago
amyeroberts
amyeroberts commented on 2024-03-22
Rocketknight1 stash commit (will discard all of this)
262537f0
Rocketknight1 stash commit
cfa538b0
Rocketknight1 First commit - needs a lot of testing!
127182a9
Rocketknight1 Add a test
8cd60591
Rocketknight1 Fix imports and make the tests actually test something
5fde7aeb
Rocketknight1 Tests pass!
ff02b0cd
Rocketknight1 Rearrange test
4ce1aba0
Rocketknight1 Add comments (but it's still a bit confusing)
1742b681
Rocketknight1 Stop storing the tokenizer
9fb77e33
Rocketknight1 Comment fixup
667d6d88
Rocketknight1 Fix for input_ids with a single sequence
070a76e8
Rocketknight1 Update tests to test single sequences
4c436f2c
Rocketknight1 make fixup
78b0f247
Rocketknight1 Fix incorrect use of isin()
8ee5762e
Rocketknight1 Expand tests to catch more cases
9f43a2a6
Rocketknight1 Expand tests to catch more cases
f0fa7074
Rocketknight1 make fixup
5bcf5e47
Rocketknight1 Fix length calculation and update tests
8cca9a40
Rocketknight1 Handle Ġ as a space replacement too
ec6f7265
Rocketknight1 Update src/transformers/generation/stopping_criteria.py
0e632c2f
Rocketknight1 Add optimizations from Joao's suggestion
ac1135c2
Rocketknight1 Remove TODO
27318270
Rocketknight1 Update src/transformers/generation/stopping_criteria.py
9213298f
Rocketknight1 Update tests/generation/test_stopping_criteria.py
f48522e1
Rocketknight1 make fixup
7a772b8f
Rocketknight1 Rename some variables and remove some debugging clauses for clarity
c604a2ba
Rocketknight1 Add tests for the sub-methods
7dd346af
Rocketknight1 Clarify one test slightly
641ba727
Rocketknight1 Add stop_strings to GenerationConfig
f6721a5a
Rocketknight1 generate() supports stop_string arg, asks for tokenizer if not provided
8772bcbb
Rocketknight1 make fixup
e423417a
Rocketknight1 Cleanup code and rename variables for clarity
398a799f
Rocketknight1 Update tokenizer error
e3140a68
Rocketknight1 Update tokenizer passing, handle generation on GPU
0008722d
Rocketknight1 Slightly more explanation cleanup
a29c131e
Rocketknight1 More comment cleanup
9c359ffe
Rocketknight1 Factor out the token cleanup so it's more obvious what we're doing, a…
602222dc
Rocketknight1 Careful with that cleanup!
4c7a7777
Rocketknight1 Cleanup + optimizations to _get_matching_positions
b6e01639
Rocketknight1 More minor performance tweaks
43d9e084
Rocketknight1 Implement caching and eliminate some expensive ops (startup time: 200…
60eb5769
Rocketknight1 Remove the pin_memory call
ff422118
Rocketknight1 Parallelize across all stop strings!
ae800a66
Rocketknight1 Quick fix for tensor devices
46c0a9c6
Rocketknight1 Update embeddings test for the new format
b9a066d3
Rocketknight1 Fix test imports
692523c4
Rocketknight1 Manual patching for BERT-like tokenizers
2ba7f8ed
Rocketknight1 Return a bool vector instead of a single True/False
8b95ec15
Rocketknight1 Better comment
1b46b208
Rocketknight1 Better comment
350a850e
Rocketknight1 Add tests from @zucchini-nlp
0b85c6c6
Rocketknight1 Amy's list creation nit
e8c769d2
Rocketknight1 tok_list -> token_list
14de1c3c
Rocketknight1 Push a big expanded docstring (should we put it somewhere else?)
b8961e8d
Rocketknight1 Expand docstrings
7ed55ad2
Rocketknight1 Docstring fixups
cbb9d147
Rocketknight1 Rebase
7db95c1a
Rocketknight1 make fixup
49b0f21e
Rocketknight1 Make a properly general method for figuring out token strings
c9aefe64
Rocketknight1 Fix naming throughout the functions
443cd5d6
Rocketknight1 Move cache, refactor, fix tests
e1c9c0e0
Rocketknight1 Add comment
f49ec00b
Rocketknight1 Remove finished TODO
e90aaba5
Rocketknight1 Remove finished TODO
bb27d82e
Rocketknight1 make fixup
43170197
Rocketknight1 Update src/transformers/generation/stopping_criteria.py
19df6a82
Rocketknight1 Update and shorten docstring
8b520391
Rocketknight1 force pushed to 8b520391 1 year ago
Rocketknight1 Update tests to be shorter/clearer and test specific cases
0aa201cb
amyeroberts
amyeroberts approved these changes on 2024-04-12
Rocketknight1 merged 0d84901c into main 1 year ago
Rocketknight1 deleted the terminator_strings_for_generate branch 1 year ago
Rocketknight1
