vllm
[Kernel] W8A16 Int8 inside FusedMoE
#7415
Merged


mzusman
github-actions
mzusman
github-actions added the ready label
mzusman changed the title from "[Kernel] W8A16 Int8 MoE" to "[Kernel] W8A16 Int8 inside FusedMoE" 1 year ago
halexan
robertgshaw2-redhat
mzusman
mzusman
mzusman force pushed 1 year ago
jeejeelee
mzusman Add experts int8 config
6b834a37
mzusman Add support in fusedmoe
afddd3b1
mzusman Add experts int8 to quantization list
289367ab
mzusman Remove logger
084405e1
mzusman Add to optimized quantization
0c690fe5
mzusman Format
31004906
mzusman Add startup test for experts_int8
413400cc
mzusman Typo
9e7bc79f
mzusman Add test
1ebb5d7e
mzusman Change compute capability to 80
44a72d6b
mzusman Format
39660caf
mzusman Disable for CPU
a097b6e3
mzusman Add use_int8 to the moe benchmarks
c12635cb
mzusman force pushed to c12635cb 1 year ago
mzusman Use JambaMoE to implement MLP
9436034c
mzusman Use MoE to implement MLP
4b712e44
mzusman Format
3b6967e4
halexan
mzusman Fix
5f5b11e2
qingquansong
mgoin
mgoin commented on 2024-08-15
mgoin
mgoin commented on 2024-08-15
mzusman Move experts_int8 to quantization subdir and add is quant method
e199b177
mzusman Split if else in benchmark moe
9c47ad0f
mzusman Rename use_int8 to use_int8_w8a16, use_fp8 to use_fp_w8a8
97f0585f
mzusman Reverse order
00254591
mzusman Change dtype in configs filename
a1d75cb9
mzusman Single function to get dtype config name
505e3d34
mzusman Align experts int8 apply with fp8
80d977c1
mzusman Align with upstream
1c403be5
mzusman Format
744ecd4b
mgoin
mgoin commented on 2024-08-15
mzusman
mzusman Change fp8 to fp8_w8a8
a5bf0b34
mzusman Correct the args
1c7e6899
mzusman Remove experts int8 from ignore cpu
e438b84e
mzusman Fix typo
c23a2f46
mzusman Fix Jamba tests since MLP layer is not aligned with HF
7e619c7d
mzusman Merge remote-tracking branch 'github/main' into expert_int8_upstream
70a65983
mzusman
dsikka
dsikka commented on 2024-08-16
mgoin
mgoin approved these changes on 2024-08-16
mzusman
mzusman Merge remote-tracking branch 'github/main' into expert_int8_upstream
4d6c546e
simon-mo merged 7fc23be8 into main 1 year ago
AllenDou
AllenDou commented on 2025-02-14
