transformers
[BLOOM] Clean modeling code
#18344
Merged

thomasw21 merged 51 commits into main from thomas/bloom_clean_code
thomasw21
HuggingFaceDocBuilderDev
thomasw21
thomasw21 commented on 2022-07-28
thomasw21
thomasw21 commented on 2022-07-28
thomasw21 Cleanup some code
d29881f3
thomasw21 make style
baa5d870
thomasw21 Woops
69227aea
thomasw21 Woops
0078a6cd
thomasw21 Improve signatures
ed09d703
thomasw21 WIP
1a8b80b7
thomasw21 Try to reduce the number of reshape/copies
62adfce1
thomasw21 Woops
ec7442c0
thomasw21 I don't think we actually need the layer_num scaling trick
298e3fde
thomasw21 Woops
42e59544
thomasw21 Woops
96307a4d
thomasw21 No need for duplication
9b2c1ca0
thomasw21 Try to fix beam_search
a8cde02e
thomasw21 Fix beam search
3a095d04
thomasw21 Woops
0ffecbfa
thomasw21 Woops
d40ee96c
thomasw21 Removing layer num normalization seems to be breaking
2677a282
thomasw21 Nit
ddbe33e5
thomasw21 force pushed to ddbe33e5 3 years ago
thomasw21
thomasw21 Not sure self.layer_number normalization actually matters
77f19b37
thomasw21 make style
5ed059c1
thomasw21 Try and be backward compatible
6c3bf962
thomasw21 Woops
995d31a0
thomasw21 Try to fix beam_search
7795e238
thomasw21 Woops
4596be9e
thomasw21 Revert attempt to be backward compatible
47e1969b
thomasw21 Nits
ad1bfe96
thomasw21 Woops
02bf51d0
thomasw21 I don't like kwargs
c02f4093
thomasw21 Woops
47608f85
thomasw21
thomasw21 Improve documentation on past_key_values format
69663c51
thomasw21 make style
b4346e14
thomasw21 marked this pull request as ready for review 3 years ago
thomasw21 marked this pull request as draft 3 years ago
thomasw21 Optimize the device allocation in case of hidden_states in multiple d…
4db82d46
thomasw21 No need to manually cast the values to a specific device
323c0731
sgugger
sgugger commented on 2022-07-29
ydshieh
thomasw21 Rename with long version of variables
2d58b7e3
thomasw21 Improve type hinting
8699509c
thomasw21 Add comment that explains that some methods return views
98fdf998
thomasw21 Make style
1c93638e
thomasw21 Woops
5fcc118a
thomasw21
thomasw21 marked this pull request as ready for review 3 years ago
thomasw21 requested a review from LysandreJik 3 years ago
thomasw21 requested a review from patrickvonplaten 3 years ago
thomasw21 requested a review from younesbelkada 3 years ago
thomasw21
thomasw21 commented on 2022-07-29
thomasw21 Actually i think the attention casting only makes sense when we use t…
49aff18f
thomasw21 We don't actually need layer_number to be passed anymore
88814bf4
thomasw21 requested a review from michaelbenayoun 3 years ago
thomasw21 Merge remote-tracking branch 'origin/main' into thomas/bloom_clean_code
87861b2e
thomasw21 Fix FX test
a3d50c08
thomasw21 Revert "Fix FX test"
1e4ccaf3
thomasw21 Does this help?
790e2382
thomasw21 Try passing a tuple
b58a6f72
thomasw21 Bypass torch.baddbmm
fd117da1
thomasw21
thomasw21 commented on 2022-07-29
michaelbenayoun
michaelbenayoun approved these changes on 2022-07-29
Muennighoff
Muennighoff commented on 2022-07-29
Muennighoff
Muennighoff commented on 2022-07-29
Muennighoff
Muennighoff commented on 2022-07-29
thomasw21 Apply suggestions from code review
bd1ae60d
thomasw21 Add comment about support for torchScript v1.11
7c399e6a
LysandreJik
thomasw21
thomasw21 Add back layer_number normalization
50e1f2f9
thomasw21 Revert "Add back layer_number normalization"
8f4d6035
thomasw21
sgugger
sgugger approved these changes on 2022-08-01
ydshieh
ydshieh commented on 2022-08-01
NouamaneTazi fix ONNX support for bloom (#18456)
213ec2ca
NouamaneTazi
NouamaneTazi approved these changes on 2022-08-04
thomasw21 merged b69a62d5 into main 3 years ago
thomasw21 deleted the thomas/bloom_clean_code branch 3 years ago

