fix NVFP activation quantization bug (#891)
* fix NVFP activation quantization bug
Signed-off-by: Zhang, Weiwei1 <weiwei1.zhang@intel.com>
* add CUDA UT for MoE NVFP quantization
Signed-off-by: Zhang, Weiwei1 <weiwei1.zhang@intel.com>
* add CPU UT, refine CUDA UT
Signed-off-by: Zhang, Weiwei1 <weiwei1.zhang@intel.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* fix UT typo
Signed-off-by: Zhang, Weiwei1 <weiwei1.zhang@intel.com>
* fix CPU UT
Signed-off-by: Zhang, Weiwei1 <weiwei1.zhang@intel.com>
* improve experts amax matching, refine UT
Signed-off-by: Zhang, Weiwei1 <weiwei1.zhang@intel.com>
---------
Signed-off-by: Zhang, Weiwei1 <weiwei1.zhang@intel.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>