Megatron-DeepSpeed
Implement rotary embeddings
#7
Merged

thomasw21
thomasw21 changed the title from "WIP: Implement rotary embeddings" to "Implement rotary embeddings" 4 years ago
thomasw21 requested a review from TevenLeScao 4 years ago
thomasw21 marked this pull request as ready for review 4 years ago
TevenLeScao requested changes on 2021-07-23
thomasw21 requested a review from TevenLeScao 4 years ago
Integrate EleutherAI's version of rotary embeddings + make some small…
e8d4d1cb
Add argument parser for position embeddings
7844641f
Making max-absolute-position-embeddings optional
836d0440
Move enum outside model
c9523ead
Handle max_seq_len_cached better
215a38a4
Fix dtype issue in rotary embeddings
0bd2138a
Fix tensor size
a69c75d3
Replace hidden_dim by hidden_size_per_attention_head
fbae8b97
Change all examples to new format and improve help in argparser
6bb1a333
Revert back changes, add comparison with position embedding type when…
14815564
Revert back changes:
0528e39c
Reformat
605f5856
thomasw21 force pushed from 193180e9 to 605f5856 4 years ago
Rm run.sh~ and modify back run.sh
99be67e1
thomasw21 force pushed from 7b1b5ba2 to 99be67e1 4 years ago
TevenLeScao commented on 2021-07-27
TevenLeScao approved these changes on 2021-07-27
thomasw21 merged dc4e0cba into main 4 years ago
stas00

