auto-round
57488ab9 - refactor(ark): drop INT8 asym DPAS; add INT4/INT2 sym via INT8 DPAS

Commit
3 days ago
Roll back the INT8 asym DPAS path: its performance regressed relative to the dequant fallback on hardware.

Add INT4-sym and INT2-sym prefill paths. Both upcast the packed weights into an int8_t [E, N, K] view inside the existing dequant workspace, then dispatch through the same per-group INT8 DPAS mainloop that the S8-sym branch uses. The packed scale tensor is reused unmodified.
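Why the scales can be reused unmodified: a sign-extending upcast from INT4 or INT2 to int8 leaves every integer value unchanged, so the per-group scale still maps it to the same real value. The sketch below illustrates this on the CPU in plain Python. It is not the ark kernel; the helper names (`sign_extend`, `unpack_sym`, `grouped_int8_dot`), the low-bits-first packing order, and the two's-complement field encoding are all assumptions made for illustration, and the real packed layout may differ.

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of `value` as a two's-complement integer."""
    sign_bit = 1 << (bits - 1)
    return (value & (sign_bit - 1)) - (value & sign_bit)


def unpack_sym(packed, bits):
    """Upcast packed signed `bits`-wide fields to int8-range integers.

    ASSUMPTION: fields are stored low-bits-first within each byte and use
    two's-complement encoding; the actual ark layout is not shown in the commit.
    """
    per_byte = 8 // bits
    mask = (1 << bits) - 1
    out = []
    for byte in packed:
        for i in range(per_byte):
            out.append(sign_extend((byte >> (i * bits)) & mask, bits))
    return out


def grouped_int8_dot(act_q, w_q, scales, group_size):
    """Per-group integer accumulate along K, then apply that group's scale.

    Hypothetical stand-in for a per-group INT8 mainloop: the weights are
    already int8-range values, so the same routine serves S8, INT4 and INT2.
    """
    total = 0.0
    for g, start in enumerate(range(0, len(w_q), group_size)):
        acts = act_q[start:start + group_size]
        weights = w_q[start:start + group_size]
        acc = sum(a * w for a, w in zip(acts, weights))  # integer accumulator
        total += acc * scales[g]                         # scale reused as-is
    return total


# INT4: two fields per byte. INT2: four fields per byte.
w4 = unpack_sym(bytes([0x21, 0xF8]), bits=4)
w2 = unpack_sym(bytes([0xE4]), bits=2)

scales = [0.5, 0.25]          # one scale per group of 2 along K
act_q = [1, 2, 3, 4]          # already-quantized int8 activations
y_int = grouped_int8_dot(act_q, w4, scales, group_size=2)

# Reference: dequantize the weights to float first, then take the dot product.
w_deq = [w * scales[i // 2] for i, w in enumerate(w4)]
y_ref = sum(a * w for a, w in zip(act_q, w_deq))
```

Here `y_int` and `y_ref` agree, which is the property that lets the upcast weights go through an unchanged INT8 mainloop with the original scale tensor.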