llama.cpp
3d3e6bd0 - llama : offload for rest of the model arches

Commit

2 years ago

llama : offload for rest of the model arches

References

#4309 - llama : per-layer KV cache

Author

ggerganov

