intel/auto-round

Branches:
AutoAdamRound_bugfix
Chinesization
actvation_quant
add_task_args_for_lmeval
ark_zp
autoround_support_qbits_backend
bf16_scale
copilot/fix-corner-case-in-auto-round
copilot/fix-deprecated-fp-layers-handling
copilot/fix-issue-with-auto-rounding
copilot/fix-llm-type-70b-bits-setting
copilot/fix-typeerror-wrapped-fn
copilot/replace-getset-module-torch-api
copilot/speedup-fp8-linear-convert
copilot/speedup-fp8-linear-convert-again
copilot/speedup-fp8-linear-convert-another-one
copilot/sub-pr-1237-again
copilot/sub-pr-1237
copilot/sub-pr-1324
copilot/update-user-settings-page
ddp
debug_time_cost
debug-nvfp4
deepseekv3
ds-qwen
ds-v5
ds-v32
enable_glm4_moe_lite_quantization
enable_llama4_int8_baseline
enable_llama4_quant
enable_mxfp_exporting
fix_bug0627
fix_bug_0722
fix_bug_1105
fix_disable_act_dynamic_usage_in_mxfp.py
fix_dq
fix/fp-layers-deprecation-mapping
fix_gemma3_issue
fix_gguf_fp8
fix_gptqmodel
fix_low_cpu
fix_save_quantized_func_nvfp_checker
fix_0107
fix_0109
fix_0113
fix-attn-mask-b60
fix-ds
fix-flashinfer
fix-gpt-oss
fix-hpu
fixbug_0717
fp4_v2
fp8-cache
fp8-cache-based-export
fp8_export_backup_stable
fp8_export_for_test
hengguo/fix_cuda_ut
hengguo/fix_gguf_ds
hengguo/gguf_transformers5.0
hengguo/quantizers
hengguo/refactor_init
hengguo/refactor_quant_step1
hengguo/smoothquant
hengguo/w4afp8_sim
henguo/update_so
hpu_only_kg
hpu_only_pkg
hpu/only/v1
hpu-v4
kaihui/torch_dtype
lazy-model-replace
leq_opub
lib/pre-4.4.0
llama/new/9-610
llama/new/9
llm-main
llmc
llmc-backup
llmc-test
lm-head-quant
load-kv
load-w8a8-replace-mod
load-w8a8
lvl/cpu_ram_optimization
lvl/fix_no_init_weights
lvl/fix_transpose_conversion_issue
lvl/general_moe_replacement
lvl/ram_usage_optimization
lvl/support_omni
lyt/numpy_fix
lyt/omni
main
marlin_modify
mengni/bug_fix
mengni/expert
mengni/vllm
mengni/vlm
mengniwang95-patch-1
more-ar-ext
mxfp8
patch/for/ao/581/stable
patch-for-ao-2
pre-release/internal-inc/w4a8
quant-attn-hpu
quant-attn-hpu-o-scale
quant-attn-hpu-pr
quant-llama
qwen3-vl
qwen3_vl_moe
qwen-split
qwen-v5
refine-doc-table
replace-lm-head
revert_order
revert-318-fix/hpu/check
revert-1231-set_disable_opt_rtn_default_2_none
save_memory
set_disable_opt_rtn_default_2_none
static_quant
support_qwen35
test-git
try_new_optimizer
update_fp_compile
update_0522
update_0819
upstream-ao
use-ep
ut-time
v0.7.0rc
v0.7.1rc
v0.8.0rc
v0.8.0rc2
v0.9.1rc
v0.9.2-release
v0.9.2rc
v0.9.3rc
v0.9.4rc
v0.9.5rc
v0.9.6rc
v0.9.7rc
v0.10.0rc
v0.10.1rc
v0.10.2rc
w4a4_int_quaro
w4int8dynamic
wfp8-afp8-bk
xinhe/fix_ci
xinhe/gpt-oss
xinhe/qwen-nvfp4
xinhe/tmp
xinhe/2-12a
xinhe/2-26
xuehao/cuda-ci
Commits:

d95c7141  disable packing immediate (yiliu30, 115 days ago)
3842867b  fix gpt-oss mem (yiliu30, 115 days ago)
0354c2ba  update (yiliu30, 116 days ago)
b992c319  remove time (yiliu30, 116 days ago)
2bd3c4b1  fix (root, 116 days ago)
553ee5c8  fix offloaf (root, 116 days ago)
a20f9df7  refactor (root, 117 days ago)
7a1716e0  refactor (root, 118 days ago)
2f96c13f  Merge branch 'llmc' of https://github.com/intel/auto-round into llmc (yiliu30, 118 days ago)
60a00232  refine code (yiliu30, 118 days ago)
db65d74b  add more log (yiliu30, 118 days ago)
361491f7  return ds (yiliu30, 118 days ago)
8832530c  tmp wa for llmc (yiliu30, 121 days ago)
ce985efc  tmp wa for llmc (yiliu30, 121 days ago)
7635f7ea  enhance flux doc (#967) (mengniwang95, 121 days ago) [Verified]
c4ef9a82  fix rtn bug (#966) (mengniwang95, 121 days ago) [Verified]
5e33cbce  fix bug of imatrix contains 0 (#955) (n1ck-guo, 121 days ago) [Verified]
e8bc3536  [1/N] Initial vllm-ext evaluation support (MXFP4 MOE) (#935) (yiliu30, 121 days ago) [Verified]
282aab66  fix critic disable_opt_rtn regression (#963) (wenhuach21, 121 days ago) [Verified]
7d8016d9  update readme (#962) (wenhuach21, 122 days ago) [Verified]
77844f6b  mark round method as todo (yiliu30, 122 days ago)
ad8537c6  fix (yiliu30, 122 days ago)
8f270411  Merge branch 'main' into vllm-ext (yiliu30, 122 days ago)
8ac82a4a  refine AutoScheme readme/code (#958) (wenhuach21, 122 days ago) [Verified]
eb2facd9  add logo (#960) (wenhuach21, 122 days ago) [Verified]
12c49846  add self attribution and fix avg_bits error (#956) (xin3he, 122 days ago) [Verified]
90c2fb4c  Reduce AutoSchem VRAM usage by up to 10X (#944) (wenhuach21, 123 days ago) [Verified]
fbb9c13b  update gguf and support for CompressedLinear (#950) (n1ck-guo, 123 days ago) [Verified]
824a21fb  update readme for sglang support (#953) (WeiweiZhang1, 123 days ago) [Verified]
f1b5c72b  refactor utils file (#943) (n1ck-guo, 124 days ago) [Verified]