llama.cpp
server : allow using LoRA adapters per-request
#10994
Merged


ngxson merged 12 commits into ggml-org:master from ngxson:xsn/lora_per_request
Commits:
- slot.can_batch_with (2ba6efc5)
- lora per request (9d84127f)

github-actions added the examples, python, and server labels

- test: force disable cache prompt (9947b077)
- move can_batch_with check (b9b2b637)
- fix condition (076346db)
- Merge branch 'master' into xsn/lora_per_request (d67fefb9)
- add slow test with llama 8b (367f0ab1)
- update docs (bf7df957)
- move lora change task to queue (1dbd16ab)
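The `slot.can_batch_with` commit points at the core constraint behind this PR: requests with different LoRA configurations cannot be decoded in the same batch, because the adapters modify the weights for the whole batch. A minimal sketch of that rule (the real implementation is C++ inside the server; the class and field names here are illustrative assumptions, not the actual code):

```python
from dataclasses import dataclass


@dataclass
class Slot:
    # (adapter id, scale) pairs active for this request's slot;
    # the shape is an illustrative assumption
    lora: tuple = ()


def can_batch_with(a: Slot, b: Slot) -> bool:
    # Two slots may share a decode batch only if they apply exactly the
    # same LoRA adapters at the same scales.
    return a.lora == b.lora


base = Slot()                      # no adapters
tuned = Slot(lora=((0, 0.5),))    # adapter 0 at scale 0.5

print(can_batch_with(base, Slot()))   # same (empty) config -> True
print(can_batch_with(base, tuned))    # different config -> False
```

Requests that disagree on adapters are simply scheduled into separate batches rather than rejected.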
ngxson marked this pull request as ready for review 1 year ago
ngxson requested a review from ggerganov 1 year ago
ggerganov approved these changes on 2025-01-02
- Apply suggestions from code review (a90e0642)
- lora_base (9274a6bc)
- remove redundant check (74e460d5)
ngxson merged 0da5d860 into master 1 year ago
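Per the updated server docs in this PR, a completion request may carry a `lora` list of `{id, scale}` objects referring to adapters loaded at server startup with `--lora`. A hedged sketch of such a request body (the server URL and adapter id below are assumptions for illustration):

```python
import json

# Per-request LoRA selection: apply adapter 0 at half strength for this
# request only; other concurrent requests are unaffected.
payload = {
    "prompt": "Write a haiku about autumn.",
    "n_predict": 64,
    "lora": [{"id": 0, "scale": 0.5}],
}

body = json.dumps(payload)
print(body)

# To actually send it (requires a running llama-server):
# import urllib.request
# req = urllib.request.Request(
#     "http://localhost:8080/completion",  # assumed local server address
#     data=body.encode(),
#     headers={"Content-Type": "application/json"},
# )
# print(urllib.request.urlopen(req).read().decode())
```

Omitting the `lora` field leaves the server's startup-time adapter scales in effect, so existing clients are unchanged.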