llama.cpp
falcon : fix CUDA inference by making K and Q contiguous
#2830

Merged

ggerganov merged 2 commits into master from fix-falcon

falcon : fix CUDA inference by making K and Q contiguous

7c55447f

cuda : add assert to guard from non-cont ropes

cc924c57

ggerganov requested a review from JohannesGaessler 2 years ago

JohannesGaessler approved these changes on 2023-08-27

ggerganov merged eaa13a48 into master 2 years ago

ggerganov deleted the fix-falcon branch 2 years ago

Reviewers

JohannesGaessler