Cpu fused kernel #1804

jiqing-feng add template to support more dtypes
6be14123
jiqing-feng update cmake list
252ac0f8
jiqing-feng fix typo
f98c9e5d
jiqing-feng fix compile cpu
902bf359
jiqing-feng make different dtype works
fef8459f
jiqing-feng use bf16 on CPU
55cbaa0d
jiqing-feng fix state2 dtype
bbef95b3
jiqing-feng remove torch
e8425135
jiqing-feng rm torch
d4473fa9
jiqing-feng enable float to bf16
dea8dd63
jiqing-feng rm dequantizeBlockwise4bitCpu
e9bb4fe1
jiqing-feng fix check
cdc8d5e0
jiqing-feng enable dequant 4bit kernel
baacfac2
jiqing-feng fix typo
eec35212
jiqing-feng fix typo
d7cc1c5e
jiqing-feng fix dequantize
124b754e
jiqing-feng fix
0f918c72
jiqing-feng fix
e1a8b20d
jiqing-feng test
eab45c85
jiqing-feng fix
d9f5dd8e
jiqing-feng fix
070f8a08
jiqing-feng fix
a84addfe
jiqing-feng fix
c4bb6607
jiqing-feng fix
4ba13fd3
jiqing-feng change input param
c0d05ec1
jiqing-feng fix typo
62a16a6e
jiqing-feng fix input param
d9ad8282
jiqing-feng split 8bit and 4bit
09ed6cbf
jiqing-feng fix typo
a3f7b611
jiqing-feng fix typo
47084701
jiqing-feng fix input params
1dfe9f71
jiqing-feng fix input params
00289c42
jiqing-feng fix
a2578baa
jiqing-feng fix typo
72033dc1
jiqing-feng enable dequant4bit
1c20ae83
jiqing-feng fix
7552fe22
jiqing-feng fix
8b32a39c
jiqing-feng fix reverse
8f1cc369
jiqing-feng fix dequant 4bit fallback path
49d242a8
jiqing-feng fix fp4 dequant
4a9a6dc1
jiqing-feng Merge branch 'main' into cpu_kernel
6bcd19e3
jiqing-feng rm _Float16
d7e981d9
jiqing-feng tmp codes
48739b09
jiqing-feng enable gemv
f784be86
jiqing-feng change to 4bit dequant
92192c9f
jiqing-feng fix def
bd02e712
jiqing-feng fix type
85200691
jiqing-feng fix absmax dtype
e921cbb5
jiqing-feng fix type
9b5d97a3
jiqing-feng fix compile and type
fd6cff13
jiqing-feng enable gemv
46d6e47a
jiqing-feng fix shape
3271c308
jiqing-feng fix lib name
176a2b61
jiqing-feng debug
196984a7
jiqing-feng update
76521152
jiqing-feng enable gemv 4bit bf16
ea0e6497
jiqing-feng enable avx512 check
9277d24d
jiqing-feng fix check
4fb315bc
jiqing-feng fix endif
81f19844
jiqing-feng fix format
0f78bada
jiqing-feng fix format
fcb84565
jiqing-feng fix def
c5e18945
jiqing-feng marked this pull request as ready for review 33 days ago
jiqing-feng rebase
f2029c6e
jiqing-feng fix position
df1d669a
jiqing-feng fix format
bb3ac8da
jiqing-feng rm duplicated func
26b56852
matthewdouglas added x64 CPU
github-actions
jiqing-feng Merge branch 'main' into cpu_fused_kernel
445725b3
jiqing-feng rm useless code comments
580010cc
matthewdouglas commented on 2025-11-19
matthewdouglas commented on 2025-11-19
matthewdouglas commented on 2025-11-19
matthewdouglas commented on 2025-11-19
jiqing-feng fix out shape
57b89bfa
jiqing-feng Merge branch 'main' into cpu_fused_kernel
302a5fe3
matthewdouglas commented on 2025-11-19
SunMarc commented on 2025-11-19
jiqing-feng fix comments
de5fb9c9
jiqing-feng add reverse format
6858a90b
jiqing-feng check avx512bf16
3b3d609b
jiqing-feng fix has_avx512bf16
fbb911b6
jiqing-feng fix tests
3179b42b
jiqing-feng fix absmax shape
0c88d436
jiqing-feng fix compile
feb8ad22
jiqing-feng fix tests
c6b714d8
jiqing-feng fix test_gemv
54971118
jiqing-feng Merge branch 'main' into cpu_fused_kernel
0045c4b0
jiqing-feng force pushed from d2de0f5c to 0045c4b0 22 days ago
jiqing-feng disable binsearch
bdb25c04
jiqing-feng fix lint
6cec12dc
matthewdouglas added this to the v0.49.0 milestone 21 days ago
matthewdouglas commented on 2025-11-25
matthewdouglas commented on 2025-11-25
jiqing-feng fix save
692a8e15
matthewdouglas merged 6aa96193 into main 20 days ago