llama.cpp
c11d05fe
Commit
1 year ago
llama : force disable flash attention for incompatible models
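The intent of the commit can be illustrated with a small sketch: when the user requests Flash Attention but the model's shapes cannot feed the fused kernel (for instance, K and V head sizes that differ), the flag is forced off with a warning instead of failing later at graph build time. This is a minimal, hypothetical guard, not the commit's actual diff; the struct slices below mirror field names used in llama.cpp (`flash_attn`, `n_embd_head_k`, `n_embd_head_v`), but the exact check and the helper `maybe_disable_flash_attn` are assumptions for illustration.

```cpp
#include <cstdio>

// Hypothetical slices of llama.cpp's hyper-parameter and context-parameter
// structs; the field names mirror the real ones, but these definitions are
// illustrative only.
struct llama_hparams {
    unsigned n_embd_head_k; // per-head embedding size of K
    unsigned n_embd_head_v; // per-head embedding size of V
};

struct llama_cparams {
    bool flash_attn; // user requested Flash Attention
};

// Sketch of the guard (assumed logic): if the model's shapes are
// incompatible with the fused Flash Attention path, force the flag off
// rather than failing later during graph construction.
static void maybe_disable_flash_attn(const llama_hparams & hparams, llama_cparams & cparams) {
    if (cparams.flash_attn && hparams.n_embd_head_k != hparams.n_embd_head_v) {
        fprintf(stderr, "warning: flash_attn requires n_embd_head_k == n_embd_head_v - forcing off\n");
        cparams.flash_attn = false;
    }
}

int main() {
    // A model with mismatched K/V head sizes: Flash Attention gets disabled.
    llama_hparams hparams = { /*n_embd_head_k=*/128, /*n_embd_head_v=*/64 };
    llama_cparams  cparams = { /*flash_attn=*/true };

    maybe_disable_flash_attn(hparams, cparams);
    printf("flash_attn enabled: %s\n", cparams.flash_attn ? "yes" : "no");
    return 0;
}
```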
References
#5021 - ggml : add Flash Attention
Author
ggerganov
Parents
cb76d747