intel/auto-round
Commits
actvation_quant
fix lm-head gradient accumulation bug (#113)
wenhuach21
committed
1 year ago
Verified
ecca5349
update shells (#112)
WeiweiZhang1
committed
1 year ago
Verified
3c214db0
Adjust the default evaluation data type by selecting from the model path configuration (#107)
WeiweiZhang1
committed
1 year ago
Verified
c42eaa92
20% speedup by removing new zero tensor (#110)
wenhuach21
committed
1 year ago
Verified
c7434b63
1.8X speedup by disable_low_gpu_mem_usage and reduce memory usage by avoid using torch.cat (#106)
wenhuach21
committed
1 year ago
Verified
23b60c3a
remove costly operations
wenhuach21
committed
1 year ago
Verified
ad3a7bba
Consolidate dataloader&dataset_split to dataset (#105)
wenhuach21
committed
1 year ago
Verified
1f9cb4f8
disable quantizing lm-head with tied weights as a workaround (#102)
wenhuach21
committed
1 year ago
Verified
1fe6aae5
disable quantizing lm-head with tied weights as a workaround (#101)
wenhuach21
committed
1 year ago
Verified
c839e825
update readme of calibration dataset and lm-head usage (#98)
wenhuach21
committed
1 year ago
Verified
f95b8c7d
fix critical bug for gradient_accumulate_steps!=1 and reduce cpu memory of lm-head tuning (#97)
WeiweiZhang1
committed
1 year ago
Verified
7dd02eb4
handle invalid layername in weight_config (#93)
WeiweiZhang1
committed
1 year ago
Verified
89562261
fix typo (#95)
yintong-lu
committed
1 year ago
Verified
9b08de48
deprecate use_quant_inp arg (#90)
yintong-lu
committed
1 year ago
Verified
1bf1b486
Add acc data (#89)
pursure-D
committed
1 year ago
Verified
8a3da144
fix old eval bug (#86)
WeiweiZhang1
committed
1 year ago
Verified
23d35e32
Update lm-head quantization readme
wenhuach21
committed
1 year ago
Verified
511a5385
add Yi-6b-chat results (#85)
yintong-lu
committed
1 year ago
Verified
d38a4a56
fix old eval tasks order (#78)
WeiweiZhang1
committed
1 year ago
Verified
257c5be2
Update llama3 acc (#84)
wenhuach21
committed
1 year ago
Verified
849cf9fe
support lm head quantization and export to Intel cpu (#76)
wenhuach21
committed
1 year ago
Verified
16f9b7bd
fix bloom issue
pursure-D
committed
1 year ago
Verified
bd2fcc9f
update W2g32 accuracy (#74)
pursure-D
committed
1 year ago
Verified
16d830b4
Add baichuan-7b chat recipe (#73)
wenhuach21
committed
1 year ago
Verified
b68eaf72
fix eval bug of autogptq model (#72)
yintong-lu
committed
1 year ago
Verified
b22f6aad
fix filter func issue (#71)
wenhuach21
committed
1 year ago
Verified
900154bf
support combination of calibration datasets (#70)
wenhuach21
committed
1 year ago
Verified
e8fb5da1
[pre-commit.ci] pre-commit autoupdate (#69)
pre-commit-ci[bot]
committed
1 year ago
Verified
9f7e8b81
fix Baichuan2-13B issue (#37)
WeiweiZhang1
committed
1 year ago
Verified
fd3298b9
fix typo
wenhuach21
committed
1 year ago
Verified
7ff627b8