cherry pick outstanding commits (#7871)
* Fix bug in Transpose CUDA kernel (#7329)
* Fix permission error for ORTModule lock file (#7814)
* fix topo sort in quant tool (#7833)
* fix topo sort in quant tool
* add unit test and make the topo sort stable
* Relax tol for Conv1D fp16 test (#7844)
* Relax tol for Conv1D fp16 test
Co-authored-by: Sherlock Huang <bahuang@OrtTrainingDev3.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>
* Resolve issue with wrapped ORTModule load_state_dict (#7847)
* Encapsulate children modules inside a ModuleAccessor object to prevent erroneous iteration over children while loading the state dictionary
* Add named_models, models, apply methods, change ModuleAccessor to ModuleMetadata and modify unit tests
* Change ModuleMetadata module getter logic, raise NotImplementedError for add_modules
* Add comment explaining why overriding _load_from_state_dict method is needed
* fixed bugs in packed mode and enable pack mode tests in ci (#7848)
* fixed bugs in packed mode and enable pack mode tests in ci
* removed unnecessary space
* pr comments
* pr comments
* disable an average pool test
* try disabling another avg pool
* disable more avg pool tests
* disable maxpool tests
* add environment variable to control default training package's local version (#7849)
* [js] update documents (#7852)
* [js] update documents
* escape double quotes
* update operators.md
* resolve comments
* Support bool type for Pad CPU (#7856)
* Initial commit
* update
* nit
* Include ORT C/C++ API headers in the ORT Mobile AAR package (#7858)
* Add header files of ort c/c++ api to aar package
* Move header file selection to cmake based on EP choice
* fix duplicated node name (#7865)
* Clean up CPU kernel definition for opset 13 Pad (#7867)
Co-authored-by: Hariharan Seshadri <shariharan91@gmail.com>
Co-authored-by: Thiago Crepaldi <thiago.crepaldi@microsoft.com>
Co-authored-by: Yufeng Li <liyufeng1987@gmail.com>
Co-authored-by: Sherlock <baihan.huang@gmail.com>
Co-authored-by: Sherlock Huang <bahuang@OrtTrainingDev3.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>
Co-authored-by: baijumeswani <bmeswani@microsoft.com>
Co-authored-by: Tixxx <tix@microsoft.com>
Co-authored-by: liqunfu <liqfu@microsoft.com>
Co-authored-by: Yulong Wang <yulongw@microsoft.com>
Co-authored-by: Guoyu Wang <62914304+gwang-msft@users.noreply.github.com>
Co-authored-by: Tianlei Wu <tlwu@microsoft.com>