transformers
GgufLinear: inference-time GGUF matmul on Apple Silicon — llama.cpp parity
#45977
Open
