llama.cpp
Generalize `quantize_fns` for simpler FP16 handling
#1237
Merged
ggerganov merged 4 commits into `ggml-org:master` from `sw:type-handling`
sw force-pushed from 508a1814 to 7a633bf9 (2 years ago)
sw added the help wanted label
sw added the refactoring label
sw force-pushed from 7a633bf9 to 7d58c842 (2 years ago)
sw force-pushed from 7d58c842 to 86503a59 (2 years ago)
sw force-pushed from 86503a59 to 62c2f377 (2 years ago)
sw force-pushed from 62c2f377 to 8cb93d68 (2 years ago)
ggerganov added the high priority label
sw force-pushed from 8cb93d68 to 3770f4fc (2 years ago)
Commit: Generalize quantize_fns for simpler FP16 handling (f9c585f0)
sw force-pushed from 3770f4fc to f9c585f0 (2 years ago)
sw marked this pull request as ready for review (2 years ago)
sw requested a review from ggerganov (2 years ago)
sw requested a review from slaren (2 years ago)
slaren commented on 2023-07-03
Commit: Remove call to ggml_cuda_mul_mat_get_wsize (81f28f25)
Commit: Merge branch 'master' into HEAD (8e9af803)
ggerganov approved these changes on 2023-07-04
Commit: ci : disable FMA for mac os actions (745e89ea)
ggerganov merged 1b107b85 into master (2 years ago)
ggerganov commented on 2023-07-05
sw deleted the type-handling branch (2 years ago)
Reviewers: ggerganov, slaren
Assignees: No one assigned
Labels: help wanted, high priority, refactoring
Milestone: No milestone