onnxruntime
Adding cuda kernel (optimized for sm80) for block-wise 4b quantized float 16 GEMM.
#18619
Merged

chenfucn merged 13 commits into microsoft:main from chenfucn:cfu_kernel
snnn commented on 2023-11-30
snnn commented on 2023-11-30
yufenglee commented on 2024-01-08
yufenglee commented on 2024-01-08
yufenglee commented on 2024-01-09
yufenglee commented on 2024-01-09
yufenglee commented on 2024-01-09
chenfucn adding cuda kernel with tests
99075996
chenfucn add compilation flag
7ca652c3
chenfucn require cuda 11.4 for cutlass
93ac7e33
chenfucn fix comments and rebase on main
cf397577
chenfucn force pushed from 9c92e1ac to cf397577 2 years ago
github-advanced-security commented on 2024-01-26
chenfucn refactor blkq4 gemm quant input generation
73679d3f
github-advanced-security commented on 2024-01-30
chenfucn lint
423aa1fe
chenfucn force pushed from 34adf5d5 to 423aa1fe 2 years ago
chenfucn conflict with main
40de1a14
chenfucn remove redundent test function
2d67beaa
chenfucn force pushed from efe36430 to 2d67beaa 2 years ago
yufenglee commented on 2024-02-08
yufenglee commented on 2024-02-08
yufenglee commented on 2024-02-08
yufenglee commented on 2024-02-08
yufenglee commented on 2024-02-12
chenfucn fix mis-spell and comments
18bf4636
yufenglee commented on 2024-02-15
yufenglee commented on 2024-02-16
yufenglee commented on 2024-02-16
yufenglee commented on 2024-02-20
yufenglee commented on 2024-02-20
yufenglee commented on 2024-02-20
yufenglee commented on 2024-02-20
yufenglee commented on 2024-02-20
chenfucn variable and type names
7d5d5ca4
chenfucn ptx for row blocking no zero-point
b9f9cb76
chenfucn optimize column block dequant
31a602f4
chenfucn lint
1477c011
yufenglee approved these changes on 2024-03-05
chenfucn merged 06e684c9 into main 2 years ago
