llama.cpp
server: benchmark: chat/completions scenario and other llm servers comparison
#5941
Merged

Commits
  • server: bench: Init a bench scenario with K6
    phymbert committed 1 year ago
  • server: bench: EOL EOF
    phymbert committed 1 year ago
  • server: bench: PR feedback and improved k6 script configuration
    phymbert committed 1 year ago
  • server: bench: remove llamacpp_completions_tokens_seconds as it include prompt processing time and it's misleading
    phymbert committed 1 year ago
  • server: bench: fix doc
    phymbert committed 1 year ago
  • server: bench: change gauge custom metrics to trend
    phymbert committed 1 year ago
  • server: bench: change gauge custom metrics to trend
    phymbert committed 1 year ago
  • server: bench: doc add an option to debug http request
    phymbert committed 1 year ago
  • server: bench: filter dataset too short and too long sequences
    phymbert committed 1 year ago
  • server: bench: allow to filter out conversation in the dataset based on env variable
    phymbert committed 1 year ago
  • server: bench: fix assistant message sent instead of user message
    phymbert committed 1 year ago
  • server: bench: fix assistant message sent instead of user message
    phymbert committed 1 year ago
  • Merge branch 'master' into hp/server/bench/init
    ggerganov committed 1 year ago
  • server : add defrag thold parameter
    ggerganov committed 1 year ago
  • server: bench: select prompts based on the current iteration id not randomly to make the bench more reproducible
    phymbert committed 1 year ago
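
Two of the commits above describe concrete benchmark logic: filtering out dataset conversations whose prompts are too short or too long, and selecting prompts by the current iteration id rather than randomly so the benchmark is reproducible. A minimal plain-JavaScript sketch of both ideas (runnable without the k6 runtime; the dataset shape, field names, and the MIN/MAX thresholds are illustrative assumptions, not the actual script's values):

```javascript
// Sketch of the dataset filtering + deterministic prompt selection ideas
// from the bench commits. All concrete values here are assumptions.

const MIN_PROMPT_LEN = 10;   // assumed lower bound on prompt length
const MAX_PROMPT_LEN = 1024; // assumed upper bound

// Hypothetical ShareGPT-style dataset: each entry's first turn is the user prompt.
const dataset = [
  { conversations: [{ from: 'human', value: 'Hi' }] }, // too short, filtered out
  { conversations: [{ from: 'human', value: 'Explain how attention works in transformers.' }] },
  { conversations: [{ from: 'human', value: 'Summarize the llama.cpp server API.' }] },
];

// Keep only conversations that start with a user turn of acceptable length
// (the "fix assistant message sent instead of user message" commits suggest
// the first turn must be checked to be a user message).
const filtered = dataset.filter((entry) => {
  const first = entry.conversations[0];
  return (
    first.from === 'human' &&
    first.value.length >= MIN_PROMPT_LEN &&
    first.value.length <= MAX_PROMPT_LEN
  );
});

// Deterministic selection: index the filtered dataset by the iteration id,
// so a given iteration always receives the same prompt across runs.
function selectPrompt(iterationId) {
  return filtered[iterationId % filtered.length].conversations[0].value;
}

console.log(selectPrompt(0)); // first valid prompt
console.log(selectPrompt(2)); // wraps around: same prompt as iteration 0
```

In a real k6 script, `iterationId` would come from k6's execution context and the selected prompt would be posted to the server's `/v1/chat/completions` endpoint, with per-token latencies recorded as Trend metrics (the gauge-to-trend change in the commits) so k6 can report percentiles rather than last-written values.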