llama.cpp
ggml-quants : ternary packing for TriLMs and BitNet b1.58
#8151
Merged

compilade merged 33 commits into master from compilade/bitnet-ternary
compilade
compilade ggml-quants : 1.625 bpw ternary packing for BitNet 1.58b
bd807499
compilade ggml-quants : faster 1.625 bpw AVX2 vec_dot
7ef4254a
compilade ggml-quants : subtract 1 when back in epi8
48b73b84
compilade ggml-quants : Q2_2 now faster than Q4_K with AVX2
ef1e345c
compilade ggml-quants : cleanup Q1_3 code formatting
638ad52f
compilade ggml-quants : ARM NEON vec_dot for q2_2 and q1_3
9465ec6e
compilade ggml-quants : use ceiling division when quantizing q1_3
89dc3b25
compilade convert-hf : simplify BitNet pre-quantization
961e2938
compilade convert-hf : allow converting the weird BitNet 1.3B
09961499
compilade added enhancement
compilade added python
compilade added Review Complexity : High
compilade added ggml
compilade added Tensor Encoding Scheme
compilade force pushed from 4522ed78 to 09961499 1 year ago
github-actions added testing
github-actions added examples
Eddie-Wang1120
compilade changed the title from "ggml-quants : 1.625 bpw ternary packing for BitNet 1.58b" to "ggml-quants : 1.625 bpw ternary packing for BitNet b1.58" 1 year ago
compilade bitnet : replace 1.58b with b1.58, as in the paper
bfd2f21f
compilade ggml-quants : fix build failure on Windows
ec50944b
compilade
compilade commented on 2024-06-29
compilade ggml-quants : attempt to fix Arm 32-bit support
8fbd5930
Green-Sky
netrunnereve
compilade
ggerganov
ggerganov commented on 2024-07-07
ggerganov
ggerganov commented on 2024-07-07
compilade ggml : add some informative comments in q1_3 vec_dot
dd3e62a7
compilade Merge branch 'master' into compilade/bitnet-ternary
79a278e9
compilade
Green-Sky
mofosyne
ggerganov
compilade
compilade
mofosyne
ggerganov
compilade ggml : add TQ1_0 and TQ2_0 ternary quantization types
77b8f84a
compilade
compilade ggml : even faster TQ2_0
560873f3
compilade ggml : also faster TQ1_0
e9719576
flatsiedatsie
compilade ggml : fix build issues in certain environments
a6dd6994
compilade ggml : add NEON vec_dot implementation for TQ1_0 and TQ2_0
5417089a
compilade ggml : avoid directly using vmlal_high_s8, for 32-bit ARM compat
45719a24
compilade
compilade marked this pull request as draft 1 year ago
compilade ggml : remove q1_3 and q2_2
04eec581
compilade changed the title from "ggml-quants : 1.625 bpw ternary packing for BitNet b1.58" to "ggml-quants : ternary packing for TriLMs and BitNet b1.58" 1 year ago
Green-Sky
Green-Sky
compilade ggml-quants : rename fields of TQ1_0 and TQ2_0 structs for consistency
f034aa1b
mirek190
compilade
mirek190
Green-Sky
ggerganov
mirek190
Green-Sky
ggerganov
BarfingLemurs
mirek190
compilade ggml-quants : allow using vdotq_s32 in TQ2_0 vec_dot
96b3d411
Hugi-R
compilade Merge branch 'master' into compilade/bitnet-ternary
d911cd1f
compilade gguf-py : Numpy (de)quantization for TQ1_0 and TQ2_0
3a0bf17d
compilade convert : allow direct conversion to TQ1_0 and TQ2_0
895004f3
compilade ggml-quants : allow using ARM dot product instructions for TQ1_0
69f77268
compilade Merge branch 'master' into compilade/bitnet-ternary
82b24040
compilade ggml-quants : deduplicate TQ1_0 and TQ2_0 __ARM_FEATURE_DOTPROD support
35cc5567
compilade marked this pull request as ready for review 1 year ago
ggerganov
ggerganov approved these changes on 2024-08-15
basavyr
flatsiedatsie
sorasoras
compilade
basavyr
flatsiedatsie
compilade Merge branch 'master' into compilade/bitnet-ternary
cb6d9962
compilade
flatsiedatsie
ggerganov
basavyr
flatsiedatsie
compilade Merge branch 'master' into compilade/bitnet-ternary
7f3a619c
compilade ggml : remove unused ggml_mul special case
8d616076
compilade test-backend-ops : add TQ1_0 and TQ2_0 comments for later
75b3a096
compilade force pushed from e4dc48a5 to 75b3a096 1 year ago
compilade
compilade added merge ready
compilade merged 9bc6db28 into master 1 year ago
WenguoLi
rhjdvsgsgks