llama.cpp
3b4bab6a - llama : replace ggml_diag_mask_inf with ggml_add (custom -inf mask)

Commit

2 years ago

llama : replace ggml_diag_mask_inf with ggml_add (custom -inf mask)

References

#3228 - llama : custom attention mask + parallel decoding + no context swaps

#3234 - llama : store non-RoPEd K cache

Author

ggerganov
