auto-round
c4a14799 - Support mxfp nvfp lmhead quant (#1051)

Commit
27 days ago
Support mxfp nvfp lmhead quant (#1051)

* fp8 exporting bugfix

Signed-off-by: Zhang, Weiwei1 <weiwei1.zhang@intel.com>

* refine exllama backend cuda UT

Signed-off-by: Zhang, Weiwei1 <weiwei1.zhang@intel.com>

* add lm_head layer act_max hook, enable mxfp/nvfp lm_head export

Signed-off-by: Zhang, Weiwei1 <weiwei1.zhang@intel.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fixtypo

Signed-off-by: Zhang, Weiwei1 <weiwei1.zhang@intel.com>

* fixtypo

Signed-off-by: Zhang, Weiwei1 <weiwei1.zhang@intel.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix ut typo

Signed-off-by: Zhang, Weiwei1 <weiwei1.zhang@intel.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* refine logs, fix pack_layer for awq&gptq

Signed-off-by: Zhang, Weiwei1 <weiwei1.zhang@intel.com>

* refine log, fix pack_layer for awq&gptq

Signed-off-by: Zhang, Weiwei1 <weiwei1.zhang@intel.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* add awq&gptq lm_head UT

Signed-off-by: Zhang, Weiwei1 <weiwei1.zhang@intel.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix local path

Signed-off-by: Zhang, Weiwei1 <weiwei1.zhang@intel.com>

---------

Signed-off-by: Zhang, Weiwei1 <weiwei1.zhang@intel.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>