waiting for merge
I have opened issue 9066, where I experienced a crash after this pull request was merged. The crash was unrelated to the MiniCPM-V 2.6 model itself. I hope you can reproduce the error.
Hello, I saw that the issue you mentioned is that llava crashes, but my update only touches the minicpmv code. I'm not sure what causes that issue, but I suspect it is not a problem with this branch.
Could you test whether the crash also occurs on the code from before this branch was merged? Of course, if the problem was indeed introduced by this PR, I will be very happy to help fix it.
@tc-mb Can we use mini cpm with context cache ? So that we upload image once and ask for multiple question referring to the same image ?
Yes, it now stores the cache.
You can run in interactive mode to ask multiple rounds of questions:
./llama-minicpmv-cli -m ../MiniCPM-V-2_6/model/ggml-model-Q4_K_M.gguf --mmproj ../MiniCPM-V-2_6/mmproj-model-f16.gguf -c 4096 --temp 0.7 --top-p 0.8 --top-k 100 --repeat-penalty 1.05 --image xx.jpg -i
Alternatively, you can modify the minicpmv-cli code (which is more of an example) to achieve the functionality you want.
Eagerly awaiting...
fname_middle = "mmproj-"
has_text_encoder = False
has_minicpmv_projector = True
minicpmv_version = 3
Is this line necessary? It overrides the minicpmv_version value set on the command line when converting MiniCPM-V 2.5, which results in a broken mmproj-model-f16.gguf.
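One way to avoid this class of bug is to apply the MiniCPM-V 2.6 default only when the user did not pass a version on the command line, instead of unconditionally overwriting it. A minimal sketch, assuming an argparse-based script with a `--minicpmv_version` flag (the flag name and default are assumptions here, not taken from the actual conversion script):

```python
import argparse

# Hypothetical flag mirroring the conversion script's CLI; default=None
# lets us distinguish "not given" from an explicit value.
parser = argparse.ArgumentParser()
parser.add_argument("--minicpmv_version", type=int, default=None)

# Simulate converting MiniCPM-V 2.5 with an explicit version on the CLI.
args = parser.parse_args(["--minicpmv_version", "2"])

# Respect the CLI value; fall back to 3 (MiniCPM-V 2.6) only when unset.
minicpmv_version = args.minicpmv_version if args.minicpmv_version is not None else 3
print(minicpmv_version)  # -> 2, the user's value is preserved
```

With this pattern the hard-coded `minicpmv_version = 3` assignment becomes a fallback rather than an override, so converting MiniCPM-V 2.5 keeps the version the user asked for.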
cool, thats a great feature, thanks @tc-mb
Very cool! Are GPU operations supported at this time?
I have tested on Ubuntu with an Nvidia 4090; it works and the speed looks good. You can use it the following way. Build with CUDA enabled:
make LLAMA_CUDA=1
and add an appropriate -ngl parameter, for example:
./llama-minicpmv-cli -m ../MiniCPM-V-2_6/model/ggml-model-Q4_K_M.gguf --mmproj ../MiniCPM-V-2_6/mmproj-model-f16.gguf -c 4096 --temp 0.7 --top-p 0.8 --top-k 100 --repeat-penalty 1.05 --image xx.jpg -p "What is in the image?" -ngl 100
Awesome, thanks!
@tc-mb
Can you give us the usage for serving MiniCPM-V 2.6 with llama-server, so we can send it OpenAI-compatible chat completion requests with base64-encoded images?
Sorry, I didn't test the server path when I made this update; I will support this capability in the near future.
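For reference, here is a sketch of the OpenAI-style payload such a request would use once llama-server supports it for this model. This is an assumption about the eventual interface, not tested against llama-server; the model name is a placeholder, and the image bytes here are dummy data standing in for the contents of xx.jpg:

```python
import base64
import json

# Placeholder bytes; in practice read them with open("xx.jpg", "rb").
image_bytes = b"\xff\xd8\xff\xe0fake-jpeg-data"
b64 = base64.b64encode(image_bytes).decode("ascii")

# OpenAI-compatible chat completion body with an inline data-URL image.
payload = {
    "model": "minicpmv-2.6",  # placeholder model name
    "messages": [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What is in the image?"},
                {
                    "type": "image_url",
                    "image_url": {"url": f"data:image/jpeg;base64,{b64}"},
                },
            ],
        }
    ],
}
print(json.dumps(payload)[:40])  # body to POST to /v1/chat/completions
```

The data-URL form (`data:image/jpeg;base64,...`) is how the OpenAI chat API accepts inline images, so a server aiming for compatibility would likely accept the same shape.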
@tc-mb Could you please provide the templating info in README-minicpmv2.6.md, like the llava-cli templating and llava-1.6 prompting sections? For practical usage it is necessary to know how to organize the user question and the image, and also whether the image should be passed as raw bytes or base64. Thanks!
Dear llama.cpp Official,
Hi, I'm writing about our new PR for integrating our model MiniCPM-V 2.6 into llama.cpp. MiniCPM-V 2.6 is the latest and most capable model in the MiniCPM-V series; it is stronger than its predecessors and supports multi-image and video understanding.
For video understanding, I have implemented functions such as video frame extraction in my fork. However, because that work introduces ffmpeg as a dependency, it may cause environment and compilation issues on other devices, so I think it is better to split it into multiple PR submissions.
Best regards,
MiniCPM-V Official ^_^