ggml
Add parallel decoding in GPT2 example #572 (Merged)
ggerganov merged 12 commits into master from feature/parallel-decoding-gpt2-example
ce6139c4 Initial attempt to make gpt2 do parallel decoding
761db297 Fix crash on trying to use empty embd
38a17443 Make it work for n_parallel=1
845f39c7 Add short way of passing n_parallel argument
42db4049 Move gpt-2 batched to a separate target and cpp file
YavorGIvanov marked this pull request as ready for review 2 years ago
YavorGIvanov requested a review from ggerganov 2 years ago
YavorGIvanov removed review request from ggerganov 2 years ago
YavorGIvanov requested a review from slaren 2 years ago
YavorGIvanov requested a review from ggerganov 2 years ago
5ffcbf44 Add batched sample output to README and remove hardcoded model path a…
d91540a9 gpt-2-batched : fix n_kv heuristic
af6a1d94 Free batch at end of example
898718c0 gpt-2-batched : simplify kv cache stuff (#574)
993d226f Fix not generating n_predict tokens and fix warn
63ab3d61 minor : readme
c2058752 Add check for end token and mark the stream as finished
ggerganov approved these changes on 2023-10-12
ggerganov merged 8e828325 into master 2 years ago
Reviewers: ggerganov, slaren
Assignees: No one assigned
Labels: None yet
Milestone: No milestone