llama.cpp
CUDA: Implemented row flattening for non-glm RoPE
#2468
Merged
