onnxruntime
Qnn weight sharing improvement
#23945
Merged

Commits
  • Improve the QNN EP weight sharing feature so that all generated ctx.onnx models point to the same ctx.bin during generation, avoiding the post-processing work
    HectorSVC committed 1 year ago
  • update UT accordingly
    HectorSVC committed 1 year ago
  • format
    HectorSVC committed 1 year ago
  • update the tool
    HectorSVC committed 1 year ago
  • update the condition for enable_htp_spill_fill_buffer validation
    HectorSVC committed 1 year ago
  • update method name
    HectorSVC committed 1 year ago
  • remove enable_htp_weight_sharing from the provider options. It can be decided from the session option ep.share_ep_contexts: weight sharing is enabled if ep.share_ep_contexts is set for the QDQ model, and it is for x64 only.
    HectorSVC committed 1 year ago
  • fix UT by adding provider_options["soc_model"] = "60" since weight sharing is only available for v73 and higher
    HectorSVC committed 1 year ago
  • log a warning if the user wants to enable weight sharing on device
    HectorSVC committed 1 year ago
  • update UT, remove duplicate test
    HectorSVC committed 1 year ago
  • format
    HectorSVC committed 1 year ago
  • resolve build issue on Linux
    HectorSVC committed 1 year ago
  • remove inaccurate comments
    HectorSVC committed 1 year ago
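
The commits above describe weight sharing being derived from the session option ep.share_ep_contexts instead of the removed enable_htp_weight_sharing provider option. A minimal sketch of that decision, as read from the commit messages, might look like the following. It is not the QNN EP implementation: the helper weight_sharing_enabled, its parameters, the logger name, and the "1" string value for the option are assumptions made for illustration. Only the names ep.share_ep_contexts and soc_model, and the value "60", come from the commits.

```python
import logging

logger = logging.getLogger("qnn_weight_sharing_sketch")  # hypothetical logger name

# Session option key named in the commit message.
SHARE_EP_CONTEXTS = "ep.share_ep_contexts"


def weight_sharing_enabled(session_config, is_qdq_model, is_x64_host):
    """Hypothetical helper mirroring the rule stated in the commits.

    Weight sharing is treated as enabled only when ep.share_ep_contexts is
    set (assumed here to mean the string "1"), the model is a QDQ model,
    and generation runs on an x64 host.
    """
    requested = session_config.get(SHARE_EP_CONTEXTS) == "1"
    if requested and not is_x64_host:
        # The commits mention logging a warning when weight sharing is
        # requested on device; the message text here is made up.
        logger.warning("weight sharing requested on device; ignoring")
        return False
    return bool(requested and is_qdq_model)


# Plain dictionaries standing in for session and provider options.
session_config = {SHARE_EP_CONTEXTS: "1"}
provider_options = {"soc_model": "60"}  # value used by the unit-test fix above

enabled = weight_sharing_enabled(session_config, is_qdq_model=True, is_x64_host=True)
```

Keeping the decision in one function makes the three conditions from the commit message (option set, QDQ model, x64 host) explicit and easy to check in isolation.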