llama.cpp
061f5f8d - CUDA: add __restrict__ to mul mat vec kernels (#2140)

Commit

2 years ago

CUDA: add __restrict__ to mul mat vec kernels (#2140)

References

#2140 - 5.5x more CUDA performance with 5 minutes of work

Author

JohannesGaessler

