intel/auto-round
Branches:
AutoAdamRound_bugfix
Chinesization
ZaneMark-patch-1
actvation_quant
add_task_args_for_lmeval
ar_agent
ark_zp
autoround_support_qbits_backend
bf16_scale
chore/claude-init
copilot/fix-corner-case-in-auto-round
copilot/fix-deprecated-fp-layers-handling
copilot/fix-docstrings-in-python-files
copilot/fix-issue-with-auto-rounding
copilot/fix-llm-type-70b-bits-setting
copilot/fix-typeerror-wrapped-fn
copilot/replace-getset-module-torch-api
copilot/sageattention
copilot/speedup-fp8-linear-convert
copilot/speedup-fp8-linear-convert-again
copilot/speedup-fp8-linear-convert-another-one
copilot/sub-pr-1237-again
copilot/sub-pr-1237
copilot/sub-pr-1324
copilot/sub-pr-1522-again
copilot/sub-pr-1532
copilot/update-user-settings-page
ddp
debug_time_cost
debug-nvfp4
deepseekv3
ds-qwen
ds-v5
ds-v32
enable_glm4_moe_lite_quantization
enable_llama4_int8_baseline
enable_llama4_quant
enable_mxfp_exporting
feat/activation-checkpointing
fix_bug0627
fix_bug_0722
fix_bug_1105
fix_disable_act_dynamic_usage_in_mxfp.py
fix_dq
fix/fp-layers-deprecation-mapping
fix_gemma3_issue
fix_gguf_fp8
fix_gptqmodel
fix_low_cpu
fix_save_quantized_func_nvfp_checker
fix_0107
fix_0109
fix_0113
fix-attn-mask-b60
fix-ds
fix-flashinfer
fix-gpt-oss
fix-hpu
fix-to-meta-assertion-error-1499
fixbug_0717
fp4_v2
fp4_v3
fp8-cache
fp8-cache-based-export
fp8-static-quant-patch
fp8_export_backup_stable
fp8_export_for_test
good-flux
hadamard_change
hengguo/fix_cuda_ut
hengguo/fix_gguf_ds
hengguo/new_ar_arch
hengguo/quantizers
hengguo/refactor_init
hengguo/refactor_quant_step1
hengguo/smoothquant
hengguo/support_for_gemma4
hengguo/w4afp8_sim
henguo/update_so
hpu_only_kg
hpu_only_pkg
hpu/only/v1
kaihui/low_cpu_mem_usage
kaihui/torch_dtype
lazy-model-replace
leq_opub
lib/pre-4.4.0
llama/new/9-610
llama/new/9
llm-main
llmc
llmc-backup
llmc-test
lm-head-quant
load-kv
load-w8a8-replace-mod
load-w8a8
lvl/cpu_ram_optimization
lvl/fix_no_init_weights
lvl/general_moe_replacement
lvl/support_bagel_mot
lvl/support_fp8_with_ark
lvl/support_ovis_image
lvl/support_turbo_quant
lyt/numpy_fix
lyt/omni
main
marlin_modify
mengni/bug_fix
mengni/expert
mengni/mengni/block_wise
mengni/mx_int4
mengni/vllm
mengni/vlm
mengniwang95-patch-1
more-ar-ext
mxfp8
origin/block_wise
patch/for/ao/581/stable
patch-for-ao-2
pre-release/internal-inc/w4a8
quant-attn-hpu
quant-attn-hpu-o-scale
quant-attn-hpu-pr
quant-llama
quarot-llama
qwen3-vl
qwen3_vl_moe
qwen-split
qwen-v5
refine-doc-table
replace-lm-head
revert_order
revert-318-fix/hpu/check
revert-1231-set_disable_opt_rtn_default_2_none
revert-1562-suyue/ut
save_memory
set_disable_opt_rtn_default_2_none
static_quant
support_gemma4
suyue/ci
test-git
try_new_optimizer
update_fp_compile
update_0522
update_0819
upstream-ao
use-ep
ut-time
v0.7.0rc
v0.7.1rc
v0.8.0rc
v0.8.0rc2
v0.9.1rc
v0.9.2-release
v0.9.2rc
v0.9.3rc
v0.9.4rc
v0.9.5rc
v0.9.6rc
v0.9.7rc
v0.10.0rc
v0.10.1rc
v0.10.2rc
v0.10.3rc
v0.12.0rc
v0.12.1rc
v0.12.2rc
w4a4_int_quaro
w4int8dynamic
wenhuach21-patch-1
wfp8-afp8-bk
xin3he-patch-1
xinhe/3-20c
xinhe/3-27c
xinhe/3-27d
xinhe/3-30a
xinhe/3-30
xinhe/3-31a
xinhe/4-7
Commits (branch copilot/sub-pr-1324):

deba2171  Initial plan (Copilot, 80 days ago)
a949c5cd  allow_deprecated_quantization and simplify UT to reduce time (xin3he, 80 days ago)
8514a7dc  refactor eval and add UT (root, 80 days ago)
587eadb8  open source delta loss (#1300) (wenhuach21, 82 days ago, Verified)
3352a830  fix ignore layers regression (#1302) (wenhuach21, 82 days ago, Verified)
46f88606  Remove itrex format (#1301) (Kaihui-intel, 82 days ago, Verified)
e383eea3  refine moe modellings to release orginal expert module's ram (#1265) (WeiweiZhang1, 84 days ago, Verified)
8e6f3370  Update unit test script (#1290) (XuehaoSun, 84 days ago, Verified)
47c8aa4c  Update version (#1282) (XuehaoSun, 84 days ago, Verified)
ba48dd77  auto-round-kernel installation method (#1221) (chensuyue, 88 days ago, Verified)
6e1ea9cb  [vllm-ext] allows setting flashinfer workspace (#1259) (yiliu30, 88 days ago, Verified)
94db9057  add permission for workflow (#1190) (chensuyue, 88 days ago, Verified)
56750e8a  Update HPU CI to gaudi v1.23.0 (#1271) (XuehaoSun, 89 days ago, Verified)
2c67712f  gguf fix bug of tensors not on same tensor and gguf:q2_k_mixed (#1263) (n1ck-guo, 89 days ago, Verified)
1b7535a0  Fix critic bug causing GGUF to run on CPU (#1260) (wenhuach21, 90 days ago, Verified)
6a87a651  update supported scheme readme (#1257) (wenhuach21, 90 days ago, Verified)
583278d9  Update nightly release workflow (#1256) (chensuyue, 90 days ago, Verified)
b4ab21d7  Add nightly release workflow (#1255) (chensuyue, 91 days ago, Verified)
c7d6aee6  GGUF format add support for MoE models with non-linear expert layers. (#1244) (n1ck-guo, 91 days ago, Verified)
7a46aee8  fix cuda ut fail (#1237) (n1ck-guo, 91 days ago, Verified)
90db4445  Fix low cpu speed issue (#1251) (wenhuach21, 93 days ago, Verified)
9588bf90  WNA16 does not apply optimized RTN for moe layers by default (#1245) (wenhuach21, 94 days ago, Verified)
0f6dc76e  Organize test directory structure with logical categorization (#1243) (Copilot, 94 days ago, Verified)
cc521cab  Update setting of disable opt rtn (#1249) (WeiweiZhang1, 94 days ago, Verified)
bef28c9a  set disable_opt_rtn to optional bool and change default value to None (#1231) (WeiweiZhang1, 94 days ago, Verified)
001d0e39  [HPU]Export lazy mode env explicit (#1239) (yiliu30, 95 days ago, Verified)
0ad224c9  [API Change]rename fp_layers to ignore_layers (#1233) (wenhuach21, 95 days ago, Verified)
afea815e  Add FP8 dtype info to quant config (#1228) (yiliu30, 95 days ago, Verified)
6f2f9b9e  gguf add support for Mistral-Magistral-Devstral and Granite (#1226) (n1ck-guo, 96 days ago, Verified)
c2449059  [STEP 2] refactor format and export (#1192) (n1ck-guo, 97 days ago, Verified)