llama.cpp
common : reimplement logging #9418
Merged

ggerganov merged 8 commits into master from gg/log
ggerganov 306 days ago (edited 302 days ago)

ref #8566

Merge ETA: Sep 15

Overhaul common/log.h. The main goal is to offload print IO to a separate thread so that it does not affect the performance of the examples. Also add convenience options for timestamps, colors based on the log type, and output to a file.

By default, the logs should look the same as they do on master.

Adding the following options will make them look like this:

# set once in your env
export LLAMA_LOG_COLORS=1
export LLAMA_LOG_PREFIX=1
export LLAMA_LOG_TIMESTAMPS=1

# or pass CLI args
./llama-cli ... --log-colors --log-prefix --log-timestamps


Another important change is that the logs in llama-server have been significantly reformatted. I've always had trouble reading the output, so I changed the text in a way that is IMO easier to read. I also removed the option to output JSON logs, as I don't think it has any practical value.

github-actions github-actions added examples
github-actions github-actions added ggml
github-actions github-actions added testing
ggerganov ggerganov force pushed from c1e3c16c to d206f876 305 days ago
github-actions github-actions added server
ggerganov ggerganov force pushed from 5a0d1642 to f0299474 305 days ago
github-actions github-actions added python
github-actions github-actions added devops
ggerganov ggerganov force pushed from e7560abb to 2dff6f4d 304 days ago
ggerganov common : reimplement the logger
2948768e
ggerganov ggerganov force pushed from 51dbf513 to 2948768e 303 days ago
ggerganov log : print if build is debug [no ci]
078be074
ggerganov examples : move gpt_init() after parsing the cli args
2afe0a0c
ggerganov log : add comments + adjust defaults
8f84210d
ggerganov ggerganov force pushed from dfd47383 to 8f84210d 303 days ago
ggerganov server : improve log format
0d0dc111
ggerganov server : fix verbose check
ff3b3809
ggerganov log : option to disable the log prefix
13226dc8
ggerganov ggerganov force pushed from 1b87cd80 to 13226dc8 303 days ago
ggerganov ggerganov marked this pull request as ready for review 303 days ago
ggerganov log : cleanup, comments, build flags
40638f71
ggerganov ggerganov force pushed from e1ad8fdd to 40638f71 303 days ago
bviksoe 303 days ago (edited 303 days ago)

Please don't take away the optional JSON formatting for logs.
This has been an important step in automating the forwarding of warnings and errors to the front-end when hosting the server process. Parsing the often very unstructured text output is tedious and error-prone, while a structured format like JSON makes it a breeze. Some logs can even function as progress notifications for a front-end, such as LOG_INFO("model loaded")

ggerganov 302 days ago

@bviksoe The server logs were never meant to be used in such a way. These messages can be disabled and changed (even when in JSON format) without notice, and therefore 3rd-party code should never rely on them. Instead, your frontend can query the server through the available endpoints. If you have a specific functionality in mind that is currently missing, submit a feature request and it will get implemented. Model loading and server status can already be queried properly through the existing API.

ggerganov ggerganov added merge ready
bviksoe 302 days ago

Thanks. I understand your desire to clean up and streamline the various parts of the project.
My argument is specifically about actual error and warning messages produced during loading and even during streaming inference.
In server there are many such messages that are conveyed only in the log. An example: LOG_ERROR("failed to get embeddings"...) If you want to create a user-friendly front-end, these should be accessible to the website/API user.

ggerganov 302 days ago

If you provide sample curl requests, combined with the output that you would expect the server to return for each of them, we can extend the API and the responses. Feel free to open a feature request and list the missing functionality.

ggerganov ggerganov merged 6262d13e into master 301 days ago
ggerganov ggerganov deleted the gg/log branch 301 days ago
kyx0r 295 days ago

Prior to this, it was possible to use --log-disable to stop llama-cli from printing all the model-loading debug logs. It would only print the prompt loaded with the -p flag and, of course, the model output plus user input when in interactive mode. Now this is broken: it doesn't print the prompt or the model output. As far as I can see, the verbosity options are not yet implemented completely enough to facilitate the old behavior.

I hope this gets fixed in the future, as right now it only works if --log-disable is removed. But that makes it dump everything instead of just the prompt. Also, --log-enable was removed, which was useful too, but my guess is that once the verbosity-level settings work as expected, they will cover that aspect.

ggerganov 293 days ago

All messages with non-zero log level are output to stderr, so you can redirect stderr to /dev/null to get the old behavior.
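The redirection trick can be demonstrated with a stand-in function (hypothetical — substitute your actual `./llama-cli ...` invocation): generated text goes to stdout, while info/debug logs go to stderr, so dropping stderr leaves only the output.

```shell
# stand-in for a llama-cli run: text on stdout, logs on stderr
gen() {
    echo "Once upon a time"                       # model output -> stdout
    echo "llama_model_load: loading model" >&2    # log message  -> stderr
}

gen 2>/dev/null    # prints only: Once upon a time
```

With the real binary this becomes `./llama-cli -m model.gguf -p "..." 2>/dev/null` (paths here are placeholders).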

kyx0r 293 days ago 👍 1

Hi Georgi,
Understood, I did not check whether it was still using stderr or not. It just wasn't obvious at a glance. I suppose users can continue using the stderr-redirection hack for the time being. My hope, though, is that in the future more verbosity levels get implemented so that it is more obvious and easier for end users to control.

Also, even if you redirect stderr, it doesn't remove everything. For example, you still have this message:
"== Running in interactive mode. =="
At least it's not too bad to just edit the source code and change the logging function it uses.

Kind Regards,
Kyryl

ericcurtin 291 days ago

@ggerganov, I noticed the same thing. --log-disable now breaks conversation mode

ggerganov 291 days ago

Don't use --log-disable. Instead redirect stderr to /dev/null if you don't need it.

rhatdan 244 days ago 👍 1

The problem with redirecting stderr to /dev/null is that when a real error happens the user does not see it, for example if the model does not exist or is not readable because of a permissions error.
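A middle-ground pattern (a sketch, using a hypothetical `run` function as a stand-in for the real llama-cli invocation): capture stderr in a file instead of discarding it, and only surface it when the command actually fails.

```shell
# stand-in for a llama-cli run that fails (e.g. missing model file)
run() {
    echo "error: model file not found" >&2
    return 1
}

# keep stderr in a file; show it only on failure
if ! run 2>llama.err; then
    echo "run failed, captured stderr:"
    cat llama.err
fi
rm -f llama.err
```

This keeps the console quiet on success while still preserving real errors for the user.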

Djip007 commented on 2024-12-04
examples/perplexity/perplexity.cpp

```diff
     }
-    fprintf(stderr, "%.2f minutes\n", total_seconds / 60.0);
-
-    printf("\nchunk PPL ln(PPL(Q)/PPL(base)) KL Divergence Δp RMS Same top p\n");
+    LOG("%.2f minutes\n", total_seconds / 60.0);
 }
+
+LOG("\n");
+LOG("chunk PPL ln(PPL(Q)/PPL(base)) KL Divergence Δp RMS Same top p\n");
```
Djip007 220 days ago

Why write it on every line and not as a header?
