llama.cpp
Commit daf4c6d3 - llama : fix worst case graph build
Committed 1 year ago
References: #3228 - llama : custom attention mask + parallel decoding + no context swaps
Author: ggerganov
Parent: fa0e6778
Files changed (3):
- common/common.cpp
- examples/llama-bench/llama-bench.cpp
- llama.cpp