Enable mxfp exporting (#649)
* enable mxfp export
Signed-off-by: Zhang, Weiwei1 <weiwei1.zhang@intel.com>
* add export dir
Signed-off-by: Zhang, Weiwei1 <weiwei1.zhang@intel.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* enable nvfp4 packing for non-Llama model structures
Signed-off-by: Zhang, Weiwei1 <weiwei1.zhang@intel.com>
* fix pack_layer register
Signed-off-by: Zhang, Weiwei1 <weiwei1.zhang@intel.com>
* add global_scale alignment for fused modules
Signed-off-by: Zhang, Weiwei1 <weiwei1.zhang@intel.com>
* fix scan issue
Signed-off-by: Zhang, Weiwei1 <weiwei1.zhang@intel.com>
* fix RTN fuse weights bug
Signed-off-by: Zhang, Weiwei1 <weiwei1.zhang@intel.com>
* add static act quant, support fusing for input_global_scale
Signed-off-by: Zhang, Weiwei1 <weiwei1.zhang@intel.com>
* fix typo
Signed-off-by: Zhang, Weiwei1 <weiwei1.zhang@intel.com>
* fix WA layer pack
Signed-off-by: Zhang, Weiwei1 <weiwei1.zhang@intel.com>
* fix typos
Signed-off-by: Zhang, Weiwei1 <weiwei1.zhang@intel.com>
* bugfix
Signed-off-by: Zhang, Weiwei1 <weiwei1.zhang@intel.com>
* fix multi-gpu device bug
Signed-off-by: Zhang, Weiwei1 <weiwei1.zhang@intel.com>
* add moe experts act_max autocomplete
Signed-off-by: Zhang, Weiwei1 <weiwei1.zhang@intel.com>
* fix typo, add llm-compressor config
Signed-off-by: Zhang, Weiwei1 <weiwei1.zhang@intel.com>
* fix typo
Signed-off-by: Zhang, Weiwei1 <weiwei1.zhang@intel.com>
* fix act_max moe issue for nvfp
Signed-off-by: Zhang, Weiwei1 <weiwei1.zhang@intel.com>
* fix moe static act_quant issue
Signed-off-by: Zhang, Weiwei1 <weiwei1.zhang@intel.com>
* update nvfp & mxfp quant, add llm-compressor nv_fp/mx_fp format export
Signed-off-by: Zhang, Weiwei1 <weiwei1.zhang@intel.com>
* refine export, fix typo, add UT
Signed-off-by: Zhang, Weiwei1 <weiwei1.zhang@intel.com>
* fix typo
Signed-off-by: Zhang, Weiwei1 <weiwei1.zhang@intel.com>
* fix pylint
Signed-off-by: Zhang, Weiwei1 <weiwei1.zhang@intel.com>
* fix gguf UT
Signed-off-by: Zhang, Weiwei1 <weiwei1.zhang@intel.com>
* rm torch_compile from nvfp funcs
Signed-off-by: Zhang, Weiwei1 <weiwei1.zhang@intel.com>
---------
Signed-off-by: Zhang, Weiwei1 <weiwei1.zhang@intel.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>