[BLOOM] Clean modeling code #18344
Cleanup some code
d29881f3
make style
baa5d870
Woops
69227aea
Woops
0078a6cd
Improve signatures
ed09d703
WIP
1a8b80b7
Try to reduce the number of reshape/copies
62adfce1
Woops
ec7442c0
I don't think we actually need the layer_num scaling trick
298e3fde
Woops
42e59544
Woops
96307a4d
No need for duplication
9b2c1ca0
Try to fix beam_search
a8cde02e
Fix beam search
3a095d04
Woops
0ffecbfa
Woops
d40ee96c
Removing layer num normalization seems to be breaking
2677a282
Nit
ddbe33e5
thomasw21 force pushed to ddbe33e5 3 years ago
Not sure self.layer_number normalization actually matters
77f19b37
make style
5ed059c1
Try and be backward compatible
6c3bf962
Woops
995d31a0
Try to fix beam_search
7795e238
Woops
4596be9e
Revert attempt to be backward compatible
47e1969b
Nits
ad1bfe96
Woops
02bf51d0
I don't like kwargs
c02f4093
Woops
47608f85
Improve documentation on past_key_values format
69663c51
make style
b4346e14
thomasw21 marked this pull request as ready for review 3 years ago
thomasw21 marked this pull request as draft 3 years ago
Optimize the device allocation in case of hidden_states in multiple d…
4db82d46
No need to manually cast the values to a specific device
323c0731
Rename with long version of variables
2d58b7e3
Improve type hinting
8699509c
Add comment that explains that some methods return views
98fdf998
Make style
1c93638e
Woops
5fcc118a
thomasw21 marked this pull request as ready for review 3 years ago
Actually i think the attention casting only makes sense when we use t…
49aff18f
We don't actually need layer_number to be passed anymore
88814bf4
Merge remote-tracking branch 'origin/main' into thomas/bloom_clean_code
87861b2e
Fix FX test
a3d50c08
Revert "Fix FX test"
1e4ccaf3
Does this help?
790e2382
Try passing a tuple
b58a6f72
Bypass torch.baddbmm
fd117da1
Apply suggestions from code review
bd1ae60d
Add comment about support for torchScript v1.11
7c399e6a
Add back layer_number normalization
50e1f2f9
Revert "Add back layer_number normalization"
8f4d6035
sgugger approved these changes on 2022-08-01
fix ONNX support for bloom (#18456)
213ec2ca
thomasw21 merged b69a62d5 into main 3 years ago
thomasw21 deleted the thomas/bloom_clean_code branch 3 years ago