vllm #9837 (Merged)
[Misc][OpenAI] Deprecate max_tokens in favor of new max_completion_tokens field for chat completion endpoint

Files changed:
  • benchmarks/backend_request_func.py
  • docs/source/serving/run_on_sky.rst
  • examples/offline_inference_openai.md
  • examples/openai_api_client_for_multimodal.py
  • examples/openai_example_batch.jsonl
  • requirements-common.txt
  • tests/entrypoints/openai/test_audio.py
  • tests/entrypoints/openai/test_chat.py
  • tests/entrypoints/openai/test_vision.py
  • tests/tool_use/test_chat_completions.py
  • tests/tool_use/test_parallel_tool_calls.py
  • tests/tool_use/test_tool_calls.py
  • vllm/entrypoints/openai/protocol.py
  • vllm/entrypoints/openai/serving_engine.py
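
Per the title, this change deprecates max_tokens on the chat completions endpoint in favor of max_completion_tokens, mirroring the upstream OpenAI API, which made the same deprecation on its chat completions endpoint. A minimal client-side sketch of the new field follows; the server URL, API key, and model name are assumptions for illustration, not part of this PR:

```python
# Sketch of a chat completion request using the new field, assuming a local
# vLLM OpenAI-compatible server at the default address. The model name below
# is a placeholder, not something this PR prescribes.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/v1",  # assumed local vLLM server
    api_key="EMPTY",                      # vLLM accepts any key by default
)

completion = client.chat.completions.create(
    model="meta-llama/Llama-3.1-8B-Instruct",  # placeholder model name
    messages=[{"role": "user", "content": "Summarize vLLM in one sentence."}],
    # New, preferred field: caps only the generated completion tokens.
    max_completion_tokens=64,
    # Deprecated equivalent on this endpoint: max_tokens=64
)
print(completion.choices[0].message.content)
```

Since this is a deprecation rather than a removal, requests that still pass max_tokens should presumably keep working during the transition window; new code should use max_completion_tokens.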
