llama.cpp
server : parallel decoding and multimodal (cont)
#3677
Merged

ggerganov merged 72 commits into master from server-rev
FSSRepo implementing parallel decoding in server example
63f99b1e
FSSRepo crash fixed
47123020
FSSRepo save dev progress
78504218
FSSRepo Merge branch 'master' of https://github.com/ggerganov/llama.cpp
b716eeb7
FSSRepo refactored sampling function
29c8cdd6
FSSRepo completion endpoint working
81484805
FSSRepo multiple client support
5b8e29de
FSSRepo grammar + no stream completion
83c2b355
FSSRepo cached prompt support
500ac712
FSSRepo chat.mjs support cached prompt + some fixes
4ba5a501
FSSRepo server ui now support multiple clients
6358ae5f
FSSRepo unused change reverted
a410a9e3
FSSRepo fixed timings per slot
b6d9e212
FSSRepo add context swap
a2c2d98c
FSSRepo add changes to README.md
eb082012
FSSRepo llava multimodal integration
9d98cdda
FSSRepo fixed tokens probs
de35b479
FSSRepo add multimodal input - alfa
9f72b446
FSSRepo refactor code + remove unused comments + improved README.md
7e64bfe0
damian0815 fix compilation errors with llvm
299f6b54
FSSRepo notify the user from server ui that multimodality is unavialable
4e5c5c45
FSSRepo Merge branch 'ggerganov:master' into master
f47fd17b
FSSRepo Merge pull request #6 from damian0815/fssrepo_mac_fixes
9035978a
FSSRepo some ci fixes
ce961a30
FSSRepo fix ci make build undefined ref errors
b727e022
FSSRepo fix long prompt than ctx proposed in #3639
fd64f04f
FSSRepo fixed premature end due stop word
2d9f11db
FSSRepo context shift fixed
d7eca255
FSSRepo fix llava implementation
4d180433
FSSRepo sync README.md changes
aa2268f4
FSSRepo Merge remote-tracking branch 'upstream/master'
fa0f22f1
FSSRepo readme change
58f8ae9b
FSSRepo update api like OpenAI
6c277eaa
FSSRepo multimodal support enabled by default
ed0c11cb
FSSRepo fix make bui;d errors
d2b1fac6
FSSRepo fix multiple clients
c02c52ef
FSSRepo fix zig build
35fd3743
FSSRepo Merge branch 'ggerganov:master' into master
84b8f2b0
FSSRepo new sampling API
7196c4e0
FSSRepo Merge branch 'master' of https://github.com/ggerganov/llama.cpp
8540568c
FSSRepo latest changes of sampling API
ab2fc002
ggerganov server : coding-style normalization
e44ed601
ggerganov server : coding-style normalization (part 2)
654e0a1f
ggerganov server : remove beam-search functionality
a8c981b7
ggerganov server : bug fix in ingest_images
3d5929e8
ggerganov server : use refs + use llama_batch_clear()
e3a2c3fe
ggerganov server : snake case
9740824b
ggerganov server : minor sync
325d1793
FSSRepo added thread safe pipeline
6b2437e3
ggerganov server : bach has to be allocated for n_parallel sequences
113dd600
ggerganov server : no need for atomic int - already using mutex
5d540e80
ggerganov server : logs + minor code style
778c070d
ggerganov added the "need feedback" label
jhen0409 server : fix multibyte handle in partial response (#3706)
17b23eb9
Green-Sky commented on 2023-10-21
FSSRepo fix image load + view image in chat
2eb4c11e
monatis approved these changes on 2023-10-22
ggerganov Merge branch 'master' into server-rev
176993c8
ggerganov make : silence stb warnings
4b4ab722
ggerganov clip : link to ggml, not to llama
715f384a
ggerganov server : fix switch fallthrough
197a0a9e
ggerganov server : fix crash in Debug on macOS (I have no idea why this fixes i…
ef18f4d5
ggerganov server : refactor ctx_sampling init + n_ctx + names
569ebf11
monatis commented on 2023-10-22
ggerganov server : bug fix for prompt caching
f67d9713
monatis Do not save/load image_data to localStorage
5359fb92
ggerganov editorconfig : new line in index.html
f305d643
ggerganov server : completion requests remember slot_id
a8063171
monatis Update readme to document multimodal in server
2679c432
monatis Merge branch 'server-rev' of https://github.com//ggerganov/llama.cpp …
a4d69d8b
ggerganov server : minor style
dd1af2ed
monatis Update readme to document multimodal in server
3d6a687f
ggerganov server : hide ctx_sampling->prev behind API (#3696)
00ae55b3
monatis commented on 2023-10-22
ggerganov server : apply fix from #3722
8fe7ca48
ggerganov server : fix slot reuse
83e14901
cebtenzzre commented on 2023-10-22
monatis commented on 2023-10-22
ggerganov server : add comment about changing slot_state to bool
c0f4d548
ggerganov merged 438c2ca8 into master 2 years ago
