PR #23945 Qnn weight sharing improvement

Improve the QNN EP weight sharing feature so that all generated ctx.onnx models point to the same ctx.bin during generation, which avoids the post-processing work.
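
A minimal sketch of how a generation run might be configured, based only on the option names that appear in this PR's commit messages (`ep.share_ep_contexts`, `soc_model` = `"60"`). The helper names `build_qnn_weight_sharing_config` and `generate_context_models`, the `backend_path` default, and the use of the Python API are assumptions for illustration; the PR itself updates a separate generation tool, and whether this exact flow produces a single shared ctx.bin depends on the onnxruntime build in use.

```python
def build_qnn_weight_sharing_config(soc_model="60", backend_path="QnnHtp.dll"):
    """Return (session_config_entries, provider_options) for a generation run.

    Hypothetical helper. Per this PR, weight sharing is no longer requested
    through an enable_htp_weight_sharing provider option; it follows from the
    ep.share_ep_contexts session option.
    """
    session_config_entries = {
        "ep.context_enable": "1",     # ask the EP to dump a context model
        "ep.share_ep_contexts": "1",  # share EP contexts across sessions
    }
    provider_options = {
        "backend_path": backend_path,  # assumption: HTP backend library name
        "soc_model": soc_model,        # "60" is the value used in the PR's unit test
    }
    return session_config_entries, provider_options


def generate_context_models(model_paths, session_config_entries, provider_options):
    """Create one session per QDQ model with the shared settings (sketch only)."""
    import onnxruntime as ort  # imported lazily; requires a QNN-enabled build

    sessions = []
    for path in model_paths:
        so = ort.SessionOptions()
        for key, value in session_config_entries.items():
            so.add_session_config_entry(key, value)
        # Assumption: sessions are kept alive so the shared context stays
        # available while the remaining models are processed.
        sessions.append(
            ort.InferenceSession(
                path,
                sess_options=so,
                providers=[("QNNExecutionProvider", provider_options)],
            )
        )
    return sessions
```

Keeping the option dictionaries separate from session creation makes it easy to check that the removed `enable_htp_weight_sharing` key is not passed by accident.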

HectorSVC committed 1 year ago

update UT accordingly

HectorSVC committed 1 year ago

format

HectorSVC committed 1 year ago

update the tool

HectorSVC committed 1 year ago

update the condition for enable_htp_spill_fill_buffer validation

HectorSVC committed 1 year ago

update method name

HectorSVC committed 1 year ago

remove enable_htp_weight_sharing from the provider options. It can be decided from the session option ep.share_ep_contexts: weight sharing is enabled if ep.share_ep_contexts is enabled for the QDQ model. It applies to x64 only.

HectorSVC committed 1 year ago

fix UT by adding provider_options["soc_model"] = "60" since weight sharing is only available for v73 and higher

HectorSVC committed 1 year ago

log a warning if the user wants to enable weight sharing on device

HectorSVC committed 1 year ago

update UT, remove duplicate test

HectorSVC committed 1 year ago

format

HectorSVC committed 1 year ago

resolve build issue on Linux

HectorSVC committed 1 year ago

remove inaccurate comments

HectorSVC committed 1 year ago

onnxruntime Qnn weight sharing improvement #23945 Merged
