llama.cpp
PR #5941 (Merged): server: benchmark: chat/completions scenario and other llm servers comparison
Commits (15)
- server: bench: Init a bench scenario with K6 (phymbert, 1 year ago)
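The initial scenario drives the server's OpenAI-compatible chat/completions endpoint under concurrent load. A minimal sketch of such a k6 script, assuming the default llama.cpp server address; the prompt, load shape, and environment variable name are placeholders, not the PR's actual values:

```javascript
import http from 'k6/http';
import { check } from 'k6';

const server_url = __ENV.SERVER_BENCH_URL || 'http://localhost:8080/v1';

export const options = {
  vus: 8,          // concurrent simulated clients
  duration: '2m',  // total run time
};

export default function () {
  const payload = JSON.stringify({
    messages: [{ role: 'user', content: 'Write a haiku about llamas.' }],
    max_tokens: 128,
  });
  const res = http.post(`${server_url}/chat/completions`, payload, {
    headers: { 'Content-Type': 'application/json' },
  });
  check(res, { 'completion returned': (r) => r.status === 200 });
}
```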
- server: bench: EOL EOF (phymbert, 1 year ago)
- server: bench: PR feedback and improved k6 script configuration (phymbert, 1 year ago)
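k6 keeps the load shape in an exported `options` object, which can be driven by environment variables so one script covers several configurations. A sketch of such a setup under assumed variable names:

```javascript
export const options = {
  thresholds: {
    // fail the run if more than 1% of requests error out
    http_req_failed: ['rate<0.01'],
  },
  scenarios: {
    chat_completions: {
      executor: 'shared-iterations',
      vus: parseInt(__ENV.SERVER_BENCH_VUS || '8'),
      iterations: parseInt(__ENV.SERVER_BENCH_ITERATIONS || '100'),
      maxDuration: __ENV.SERVER_BENCH_MAX_DURATION || '5m',
    },
  },
};

export default function () {
  // request logic as in the initial scenario
}
```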
- server: bench: remove llamacpp_completions_tokens_seconds as it includes prompt processing time and is misleading (phymbert, 1 year ago)
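A tokens-per-second figure computed over the whole request folds prompt processing into the generation rate. One way to keep the numbers honest is to record prompt and completion token counts as separate metrics; a sketch assuming the response carries an OpenAI-style `usage` object, with illustrative metric names:

```javascript
import http from 'k6/http';
import { Trend } from 'k6/metrics';

// Separate distributions for prompt and generated tokens, instead of one
// combined tokens/second number. Metric names are illustrative.
const promptTokens = new Trend('llamacpp_prompt_tokens');
const completionTokens = new Trend('llamacpp_completion_tokens');

export default function () {
  const res = http.post('http://localhost:8080/v1/chat/completions',
    JSON.stringify({ messages: [{ role: 'user', content: 'Hi' }], max_tokens: 32 }),
    { headers: { 'Content-Type': 'application/json' } });
  const usage = res.json('usage'); // assumes an OpenAI-style usage object
  if (usage) {
    promptTokens.add(usage.prompt_tokens);
    completionTokens.add(usage.completion_tokens);
  }
}
```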
- server: bench: fix doc (phymbert, 1 year ago)
- server: bench: change gauge custom metrics to trend (phymbert, 1 year ago)
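The distinction matters for per-request measurements. A k6 Gauge keeps only the most recent sample, so with many VUs the end-of-run summary shows whichever request happened to finish last; a Trend aggregates every sample into min/avg/med/p(90)/p(95)/max. A sketch of the swap, with an illustrative metric name:

```javascript
import { Trend } from 'k6/metrics';

// Before (illustrative):
//   const promptSeconds = new Gauge('llamacpp_prompt_processing_seconds');
// A Gauge would report only the last sample recorded across all VUs.
const promptSeconds = new Trend('llamacpp_prompt_processing_seconds');

export default function () {
  promptSeconds.add(0.42); // a real script would add the measured per-request value
}
```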
- server: bench: change gauge custom metrics to trend (phymbert, 1 year ago)
- server: bench: doc: add an option to debug HTTP requests (phymbert, 1 year ago)
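k6 has a built-in switch for this: running `k6 run --http-debug script.js` prints request and response metadata for every HTTP call, and `--http-debug=full` also dumps the bodies, which is useful for checking exactly what the benchmark sends to the server.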
- server: bench: filter out dataset sequences that are too short or too long (phymbert, 1 year ago)
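Very short prompts are unrepresentative and very long ones risk overflowing the context, so the dataset is pruned up front. A sketch of such a filter; the bounds, field names, whitespace-split token proxy, and file name (a ShareGPT-style JSON) are all assumptions:

```javascript
import { SharedArray } from 'k6/data';

const minLen = 4;
const maxLen = 1024;

const dataset = new SharedArray('conversations', function () {
  return JSON.parse(open('./dataset.json')).filter((c) => {
    // crude proxy for token count: whitespace-separated words
    const n = c.conversations[0].value.split(/\s+/).length;
    return n >= minLen && n <= maxLen;
  });
});

export default function () {
  // pick an entry from `dataset` and send the request
}
```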
- server: bench: allow filtering out conversations in the dataset via an environment variable (phymbert, 1 year ago)
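k6 exposes variables passed with `-e` through `__ENV`, so an exclusion pattern can be supplied at run time without editing the script. A sketch with an assumed variable name and dataset layout:

```javascript
import { SharedArray } from 'k6/data';

// Run as: k6 run -e SERVER_BENCH_EXCLUDE='translate' script.js
// The variable name and dataset layout are assumptions.
const exclude = __ENV.SERVER_BENCH_EXCLUDE
  ? new RegExp(__ENV.SERVER_BENCH_EXCLUDE)
  : null;

const dataset = new SharedArray('conversations', function () {
  return JSON.parse(open('./dataset.json'))
    .filter((c) => !exclude || !exclude.test(c.conversations[0].value));
});

export default function () {
  // pick an entry from `dataset` and send the request
}
```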
- server: bench: fix assistant message sent instead of user message (phymbert, 1 year ago)
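In ShareGPT-style data the first turn is not always the human one, so naively taking `conversations[0]` can send an assistant reply as the prompt. A sketch of the corrected selection; the field names follow the ShareGPT layout and are assumptions here:

```javascript
// Taking conversations[0] blindly can return an assistant ("gpt") turn;
// pick the first human turn explicitly and send it as the user message.
function firstUserMessage(entry) {
  const turn = entry.conversations.find((m) => m.from === 'human');
  return turn ? turn.value : null;
}

// body for /v1/chat/completions:
// { messages: [{ role: 'user', content: firstUserMessage(entry) }] }
```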
- server: bench: fix assistant message sent instead of user message (phymbert, 1 year ago)
- Merge branch 'master' into hp/server/bench/init (ggerganov, 1 year ago)
- server : add defrag thold parameter (ggerganov, 1 year ago)
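This commit, pulled in from master, concerns the server's KV cache defragmentation threshold rather than the benchmark itself; it is exposed as a server command-line flag along the lines of `--defrag-thold`, which triggers defragmentation once cache fragmentation exceeds the given fraction (the exact flag spelling here is from memory, not from this PR).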
- server: bench: select prompts based on the current iteration id rather than randomly, to make the bench more reproducible (phymbert, 1 year ago)
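Indexing the dataset by the iteration number instead of `Math.random()` makes two runs issue the same sequence of prompts. A sketch using k6's execution API; the dataset path is an assumption:

```javascript
import exec from 'k6/execution';
import { SharedArray } from 'k6/data';

const dataset = new SharedArray('conversations', function () {
  return JSON.parse(open('./dataset.json'));
});

export default function () {
  // iterationInTest is unique across all VUs, so every run walks the
  // dataset in the same order instead of sampling at random.
  const entry = dataset[exec.scenario.iterationInTest % dataset.length];
  // build and send the request for `entry`
}
```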