llama.cpp
falcon : fix CUDA inference by making K and Q contiguous
#2830
Merged

ggerganov merged 2 commits into master from fix-falcon
Commits:
- falcon : fix CUDA inference by making K and Q contiguous (ggerganov, 7c55447f)
- cuda : add assert to guard from non-cont ropes (ggerganov, cc924c57)
ggerganov requested a review from JohannesGaessler 2 years ago
JohannesGaessler approved these changes on 2023-08-27
ggerganov merged eaa13a48 into master 2 years ago
ggerganov deleted the fix-falcon branch 2 years ago
