llama.cpp
7e2b9974 - ggml-cuda : update rope implementation for parallel decoding (#3254)

Commit

2 years ago

ggml-cuda : update rope implementation for parallel decoding (#3254)

* ggml-cuda : update rope implementation for parallel decoding
* better solution for p0 computation
* fix rope
* simpler rope implementation

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

References

#3254 - ggml-cuda : update rope implementation for parallel decoding

Author

slaren

Parents

daf4c6d3
