onnxruntime
Merge 'main' into 'win-ort-main' @ 39e585ff2b
#24353
Merged

mschofie merged 515 commits into win-ort-main from mschofie/merge-1.22-current
jchen351 Make Nuget package pipeline 1ES compliant (#23803)
839d9dcd
jchen351 Conveting npm packaging pipeline to 1ES (#23767)
9a2e0090
xhcao [webgpu] support resize operator (#23780)
cc3f4120
jchen351 Upgrade React Native to 0.73 (#23575)
40c329ef
jchen351 Make Nuget CUDA package pipeline 1ES compliant (#23804)
d5742708
fajin-corp [ARM CPU] Fix flaky hgemmb ut (#23814)
7a3810d3
yf711 [TensorRT EP] update oss parser to latest (#23710)
000f2c9f
daijh [webgpu] Fix alignment issues in shader code (#23776)
c6664e20
fs-eire upgrade emsdk to 4.0.4 (#23819)
6df0973e
ankitm3k [OVEP] Update support for Contrib Ops (#23789)
17f39475
Update onnxruntime_external_deps.cmake: add missing EXCLUDE_FROM_ALL …
b1f2a3f5
jambayk Quant tool: Add `nodes_to_exclude` in `get_qnn_qdq_config` (#23779)
5ab953cb
karim-vad [ORT/CI_Pipeline] Use --enable_generic_interface in ORT builds for EP…
05642657
jchen351 Increase npm package pipeline ReactNative_CI_iOS timeout to 120 mins …
a189bfca
fajin-corp [Mlas] Unblock hardcoded matmul blocking size (#23815)
c61a4b11
jchen351 Revert changes onn mac-react-native-ci-pipeline.yml (#23845)
2a4cfab4
sushraja-msft Fix flash attention for GQA (Phi4) (#23850)
1be64f88
skottmckay Model Builder API (#23223)
1088a1ed
seungtaek94 Fix typo: change `Upample` to `Upsample`. (#23838)
1ffe793a
co63oc [doc] Fix typos in csharp/src/Microsoft.ML.OnnxRuntime/ (#23848)
0a6b05fb
jambayk Quant tool: Consistent `get_qdq_config` and `get_qnn_qdq_config` beha…
daf9565d
HectorSVC Change the logic to generate the default ep context file name (#23788)
99c51a32
jchen351 Make Nuget QNN package pipeline 1ES compliant (#23805)
7f0c2c64
fs-eire [js/common] allows using Uint16Array as data for float16 tensor (#23827)
18725277
qjia7 [js/webgpu] Reland the optimization of ConvTranspose (#23858)
325ee309
asoldano [OpenVINO] Fix a build warning (#23877)
30c68254
Change gsl::byte to std::byte (#23872)
bde4fbec
fs-eire Allow using extended minimal build for several EPs (#23834)
17dcea7a
fs-eire Add dawn to ThirdPartyNotices (#23876)
813bdaab
HectorSVC Enable QNN EP weight sharing generation using public API (#23702)
9d0dc9f0
quic-ashigarg [QNN-EP]: Fix inference failures while running with htp_shared_memory…
788ca51b
jchen10 Fix enable_pix_capture build for WebGPU (#23857)
8aed9208
satyajandhyala [WebGPU-EP Native] Add ReduceMean (#23860)
834adde8
prathikr [WebGPU EP] introduce BiasAdd contrib op (#23861)
cfb0a72f
tianleiwu Dynamo export and improve benchmark script for SAM2 encoder (#23887)
5e636a67
fs-eire [js/web] improve workaround for bundlers (#23902)
aafa8d17
daijh [webgpu] Restore MatMulNBits workgroup size for Phi-3.5 (#23349)
d35db9b8
xhcao [webgpu] support Pad operator (#23141)
95225dda
Honry [WebNN] Accept Float16Array for float16 data type if it is available …
b5242293
mschofie Ensure that the 'cmake_minimum_required' is version 3.5 or greater (#…
996fffbe
jiangzhaoming WebGPU: Remove deprecated subgroups-f16 from WebGPU native and JS EP …
54b2d64c
satyajandhyala [JSEP/WebGPU] Fixed error in softmax dispatch. (#23906)
ccf8fdd9
fs-eire enable WebGPU EP in WebAssembly build (#23913)
101353cf
yihonglyu Adding OpenVINO Windows CI Pipeline (#23919)
8f077435
vraspar [WebGPU EP] SoftMax Implementation (#23538)
4bb79d13
skottmckay Exclude MAUI projects from GPU C# packaging builds (#23923)
b2ab87e8
sushraja-msft Support all block sizes that are multiples of 32 for DP4A (#23907)
eeaf73b3
skottmckay Example custom op with output type inferencing (#23916)
c28bf788
chilo-ms Enabling L2+ Optimizations for EPs (#23517)
1199dc08
fs-eire fix binplace file in web pipeline (#23930)
2ba076aa
yihonglyu Updated run_CIs_for_external_pr.py to support the Windows OpenVINO CI…
e47c6c16
skottmckay Fix ConvInteger handling of optional inputs. (#23935)
8969ee78
saurabhkale17 Updated ov version in pipeline (#595) (#23882)
26f590b3
ranjitshs [AIX] External data handling (#23859)
f25deaea
baijumeswani Create a packaging pipeline for a custom nuget package (#23918)
593d5c0e
skottmckay Fix license in example test code. (#23936)
7dbbfe08
fs-eire replace usage of gsl::narrow and gsl::narrow_cast in WebGPU EP (#23926)
ab38607d
VCPKG improvement: set VCPKG_OSX_DEPLOYMENT_TARGET (#23933)
cffef2e0
Allow using a different version of flatbuffers when building with vcp…
49328fe6
jchen351 Make python package pipeline 1ES compliant (#23800)
95dcd150
jchen351 Delete ROCM Nuget Publishing Pipeline (#23948)
989d4177
dependabot[bot] Bump SixLabors.ImageSharp from 2.1.9 to 2.1.10 in /csharp/sample/Micr…
fe7634eb
jchen351 Make python CUDA package pipeline 1ES compliant (#23802)
246c2191
jchen351 Migrate yarn to npm (#22116)
773bb4ff
satyajandhyala [WebGPU/JSEP] Support group query attention do_rotary attribute (#23524)
333fbdb4
jchen351 Fix npm audit in js/react-native/e2e (#23975)
f18e9faa
fs-eire Suppress some warnings in WebGPU EP generated by GCC 13 (#23984)
64436265
jchen351 Fix NPM audit in js/react-native (#23974)
d010acb5
dependabot[bot] Bump axios from 1.7.9 to 1.8.2 in /js/node (#23963)
9118b1de
stefantalpalaru GCC 14: fix insert_or_assign() call (#23955)
5672cf7d
ADD emsdk env vars to VCPKG_KEEP_ENV_VARS (#23997)
d2bf9a79
jchen351 Fix ONNX Runtime Python Test Pipeline (#23990)
fe435371
qjia7 [webgpu] Fix the continuation issue (#23999)
16d6f397
prathikr [WebGPU EP] Implements Gelu, BiasSplitGelu, and QuickGelu (#23981)
9891eb3d
satyajandhyala [Native WebGPU] Added ReduceMax and ReduceSum (#23934)
6dd6ef93
Convert Windows CPU CI Pipeline to Github Actions (#23996)
47bd0468
mingyueliuh [Fix] Dependencies find_package Eigen error (#23939)
06482c26
hsilm Update onnxruntime_c_api.h to work with MinGW (#24006)
5e057292
Add DNNL github workflow (#24011)
57ddd026
HectorSVC Qnn weight sharing improvement (#23945)
7ae606f7
hans00 Correct generated cmake syntax (#24016)
11216a4e
fs-eire [webgpu] allow to specify UseIndicesTypeAlias for Indices (#24019)
1362e7ca
fs-eire [webgpu] allow overloads to Program::AddIndices (#24021)
401f24a8
fs-eire fix test for RotaryEmbedding (#24022)
219c919c
tianleiwu Fix attention bias broadcast (#24017)
99b78a94
tianleiwu Remove unused parameter in csharp InferenceTest (#24031)
5bd31636
chilo-ms [TensorRT EP] Call cudaSetDevice at compute function for handling mul…
6bb6d791
edgchen1 Increase timeout for ARM64-Xcode16-targeting-iphonesimulator (#24030)
3f71d637
hans00 Support tvOS build (#24000)
1fc6d8ca
yf711 [TensorRT EP] Stop enforcing oss parser during Windows debug build (#…
cb3f631f
edgchen1 Set CMAKE_POLICY_DEFAULT_CMP0069 to NEW to ensure that IPO flags are …
9a296a0a
jchen351 Make Cuda packaging pipeline 1ES compliant (#23806)
9f214561
fs-eire [webgpu/wasm] allow runtime switch between WebGPUEP and JSEP (#24032)
7c05e7f5
edgchen1 Move call to MLAS_CPUIDINFO::GetCPUIDInfo() out of MlasSQNBitGemmDisp…
c9c8b48d
xhcao [webgpu] fix the wrong dispatch size in flash_attention (#24020)
cc5840be
fs-eire avoid copy unnecessary files for nodejs pkg (#23992)
41c239df
derdeljan-msft Add support for custom position ids and attention bias to GQA CPU ope…
5a694bcb
Honry [WebNN] Better int64 integration (#23831)
73d9826a
Convert Windows GPU pipelines and Windows OpenVino pipeline to Github…
b8966665
fajin-corp [ARM CPU] Fix fp16 const initialization on no-fp16 platform (#23978)
f22ee08f
satyajandhyala [Native WebGPU EP] Add packedQKV and do_rotary attribute support to G…
ae501eeb
kunal-vaishnavi Whisper Redesigned Solution (#23549)
7942fa7a
RyanUnderhill Windows: Show more useful DLL load errors to say exactly what DLL is …
5ef0d211
yf711 Extend CMAKE_CUDA_FLAGS with all Blackwell compute capacity (#23928)
2bc73ca8
jchen10 [WebGPU] Reduce staging buffers for uploading intializers (#23968)
f5812d0e
prathikr [WebGPU EP] Implement Remaining Reduction Ops (#24045)
154e3b7d
HectorSVC add bool support to EPContext schema to unblock some models (#24065)
a46d2127
prathikr [WebGPU EP] fix for reduce min/max error on MacOS CI (#24077)
b3aa5a3c
jchen351 Upgrade current MacOS-13 to 14 (#23293)
e495750a
skottmckay Fix CUDA EP Abs and Sign bfloat16 support (#23914)
c6a26754
justinchuby Improve typing for OrtValue and other public Python interfaces (#24086)
12fea572
qjia7 [webgpu] Limit that K must be divisible by 128 to apply dp4a matmul (…
a85977dd
fs-eire Add macOS ARM64 pipeline for webgpu (#24060)
d98046b3
egalli [WebNN/WebGPU JS] Fix shared Module methods overriding each other (#2…
eceae8b2
Enable multithreading on FP16 to FP32 cast operator (#23619)
7fc7d5ec
Move Android CI Pipeline to Github Actions (#24094)
3488ba39
Cleanup CoreML EP's code to remove COREML_ENABLE_MLPROGRAM (#23490)
7444feeb
guschmue webgpu ep support for argmax/argmin (#24089)
b626409e
carzh [mobile/reactnative] Remove namespace from AndroidManifest.XML to res…
d8ed4da1
fs-eire [WebGPU EP] fix implementation of Pow (#24088)
80441e4e
fs-eire Increase timeout to 90min for ARM64-Xcode16-targeting-iphonesimulator…
731b27e2
fs-eire [WebGPU] fix test failure in Reduce operators on macOS ARM64 (#24108)
da7874c8
prathikr [WebGPU EP] Implements CumSum Operator (#24047)
8d21bf72
qjia7 [webgpu] Use 1d dispatch group size (#24084)
81a89204
fs-eire [WebGPU] fix test failure in MatMulNBits on macOS ARM64 (#24109)
9dcb99cd
chuteng-quic [QNN-EP] Add support for Sum operator with 2 inputs (#24098)
4d5e274f
Honry [WebNN] Replace narrow with SafeInt for consistently in integer handl…
5d43f0ab
chuteng-quic [QNN-EP] Add Lora Support with offline QNN context binary (#24026)
6bdbf08c
yf711 [TensorRT EP] support TensorRT 10.9-GA (#23905)
440d17a7
qjia7 [webgpu] Apply dp4a for generation shader (#24064)
127c8503
tianleiwu [CUDA] Support slide window in cutlass fused attention (#24072)
db0c95c1
apwojcik [MIGraphX EP] rename HIPPinnedAllocator to MIGraphXPinnedAllocator (#…
16b0b323
apwojcik [MIGraphX EP] check POLICY CMP0144 availability before used (#24104)
9922d480
prathikr [JSEP] handles edge case in gridsample operator (#24121)
469fb7e3
sfatimar [OpenVINO]Session Options Appended After AppendExecutionProvider (#23…
49024a1e
jchen10 [webgpu]Add MaxPool and AveragePool (#23714)
7a6514c8
fs-eire [webgpu EP] put GetMaxComponents and SumVector to one place. (#24122)
9e53afab
tianleiwu skip MOE python test when MPI is not installed (#24116)
dcc1f5ac
MichaelTylerArm Integrate KleidiAI for MatMulNBits via MlasQNBitGemm (#23627)
90c5ffb5
fs-eire add test cases for webgpu ep in web (#24117)
0a363d9e
yuslepukhin Refactor Webnn IsSupported*() to use constant initializers. (#24118)
cd9406bf
CodingSeaotter Deleted the constant SKIP_CUDA_TEST_WITH_DML (#24113)
4959468a
tianleiwu Update T5 Onnx Export and Optimization (#23949)
d84314cb
jchen351 Update package.json to make the dist avaliable again (#23991)
3012d445
kunal-vaishnavi Fix attention QK linkage error (#24134)
2b3d7fb1
dependabot[bot] Bump next from 15.1.2 to 15.2.3 in /js/web/test/e2e/exports/testcases…
5ed900e9
pravg-amd [Shape Inference] Add shape inference for QLinearAdd and QLinearMul o…
2b5c9da6
carzh [mobile] Add Android NuGet BrowserStack test to NuGet packaging pipel…
8eb8c2b0
fajin-corp [CPU] Add fp16 support to sparse attention (#24015)
828e3726
fs-eire refactor mac CI pipelines (#24138)
373b9e2a
yuslepukhin Address Windows CUDA build issue (#24149)
5244d68b
fs-eire [webgpu] add option to perserve device and enable in unittest (#24115)
e03631ee
fs-eire [js/web] allow bundler import condition for not bundling wasm (#24014)
78d91cdd
fs-eire [js] Add API for accessing metadata of a model's input/output (#23937)
618aef7e
fs-eire add cache "onnxnodetests" for node tests (#24150)
afaf4a5e
vraspar [Native WebGPU] Add Matmul (#24046)
ce65e253
tianleiwu Upgrade Big Model pipeline CUDA from 11.8 to 12.x (#24156)
bb005b93
tianleiwu Proper Error Message when fp16 model is used for Beam Search in CPU (…
de502c89
jiafatom Change type len from int to size_t (#24157)
a4b8f11c
jchen351 Limit the Pipeline ability to build cuda 11 (#24073)
a8fb7868
Move Linux CPU CI pipeline to Github Actions (#24154)
86806677
dependabot[bot] Bump vite from 6.2.1 to 6.2.3 in /js/web/test/e2e/exports/testcases/v…
d9c961ce
edgchen1 [onnxruntime_perf_test] Fix custom_allocator_ destruction order. (#24…
1ef30446
fs-eire Fix layout transformer for FusedConv (#24169)
25b06f20
jchen351 Migrate Zip-Nuget Package Pipeline to 1ES (#23609) Also, kleidail is …
1f6dc881
Update the min GCC version (#24148)
9dbfee91
jywu-msft [QNN EP] ARM64EC python package remove --vcpkg in build (#24174)
2a800d1e
xiaofeihan1 [WebGPU EP] Add GEMM implementation (#24023)
a8673c6e
fs-eire [wasm] remove --vcpkg in wasm build (#24179)
513e8de1
fs-eire revise mac os pipeline to reduce the amount of jobs (#24177)
32b376cd
fs-eire fix triggering for "Validate Gradle Wrapper" pipeline (#24181)
be1cfc4e
HectorSVC upgrade QNN to version 2.32.0.250228 (#23977)
5d805c23
prathikr [JSEP] adjust edge case logic for scatternd (#24172)
24ece479
baijumeswani Make the custom nuget packaging pipeline 1ES commpliant. (#24191)
1f70fc25
edgchen1 Disable KleidiAI in Python Packaging pipeline MacOS build (#24194)
4d13b70f
jchen351 Rolling back the python/cuda (#24170)
041674ad
jchen351 Remove all CG template from pipelines (#24193)
914be22e
Move Linux ARM64 CI pipeline and Linux DNNL CI pipeline to Github Act…
bd00c39f
jchen10 [webgpu-ep] Fix test_batchnorm_example (#24184)
86b4c789
fs-eire Further reduce work load for Mac CI pipeline (#24197)
26566710
yuslepukhin Generate unique names for SliceSplit fusion. (#24217)
64b0d071
fs-eire Fix the pipeline that failed because of vcpkg (#24226)
25921476
peishenyan Improve Shape Inference for GQA (#24143)
c756e0ab
carzh Add React Native namespace back in for iOS (#24218)
19d8d69c
liqunfu RoPE fp16 avx (#23772)
180ba8f8
Migrate Linux GPU pipelines to Github Actions (#24232)
f430dce9
fs-eire Migrate Web CI into github actions (#24219)
41dde351
HectorSVC update the readme doc for the tool ep_weight_sharing_ctx_gen (#24233)
4a669fd1
prathikr [WebGPU EP] If Implementation for WebGPU EP (#24242)
7ef0ddc5
Update linux-dnnl.yml: rename the pipeline (#24240)
8de342ad
jchen10 [webgpu] Fix test_layer_normalization_2d_axis0 (#24223)
d71aa4d8
fs-eire [webgpu] fix LayerNorm with empty input (#24244)
f1d790c2
dependabot[bot] Bump actions/setup-python from 4 to 5 (#24251)
492af7a3
dependabot[bot] Bump actions/cache from 3 to 4 (#24250)
83650edc
edgchen1 [QNN EP] Add platform-agnostic EP option to specify QNN backend, `bac…
22787aec
xhcao [webgpu] Fix opset-12 softmax nhwc issue (#24227)
ad2e5652
fs-eire Extend pyright exclude list in pyproject.toml (#24246)
528f29a8
jing-bao [js/web] Add Wasm Relaxed SIMD support to wasm backend (#22794)
ba2999c5
fs-eire Add shader key validation step in WebGPU CI pipeline (#24243)
4eeefd72
fs-eire upgrade dawn version to 4cb1f9be152a4fa6bb695c08cd707ab078a1e2fb (#24…
30115cfe
dependabot[bot] Bump dsaltares/fetch-gh-release-asset from 1.1.0 to 1.1.2 (#24248)
5982430a
dependabot[bot] Bump vite from 6.2.3 to 6.2.4 in /js/web/test/e2e/exports/testcases/v…
e2274150
prathikr [WebGPU EP] fixes bugs in split implementation (#24259)
5068ab9b
dependabot[bot] Bump microsoft/onnxruntime-github-actions from 35f8bd42417991aa46577e…
1b48cc41
jchen351 Update xcode and iphoneSimulatorVersion after MacOS-14 (#24260)
5b080558
jchen351 Exclude onnxruntime-inference-examples directory from Component Gover…
24620e70
BoarQing [VitisAI] Fixed include error. (#24199)
67216c89
fs-eire Migrate pull:wasm to github action (#24269)
a5bc69c5
chilo-ms Ensure to use correct GPU device in RunSince when it's invoked by new…
b3793906
jchen351 Adding build-system to pyproject.toml (#24216)
b5d15bc9
prathikr [WebGPU EP] Implements ceil mode for Average Pool (#24270)
bc7b07db
Pin vcpkg version (#24284)
55aa03c1
toothache Support load TensorRT V3 plugin (#24211)
a14d586d
toothache Expose TRT preview features as EP option (#24212)
21db38c1
jchen10 [webgpu] test_layer_normalization_3d_axis0_epsilon (#24276)
8465ca38
fs-eire [webgpu][dawn API optimization] reduce number of calls to wgpuDeviceH…
7a551887
dependabot[bot] Bump next from 15.2.3 to 15.2.4 in /js/web/test/e2e/exports/testcases…
d2388135
dependabot[bot] Bump image-size from 1.1.1 to 1.2.1 in /js/react_native/e2e (#24278)
cbaa8bc7
zhaoxul-qti [QNN-EP] Enhance QNN-EP support for Softmax with opset < 13. (#24180)
a28da4b9
jchen351 Update publish-nuget.yml to correct feed. (#24299)
e5e906ee
daijh [webgpu] Optimize MatMulNBits for f16 Block32 prefill performance (#2…
3dfc2ae3
fs-eire upgrade action shellcheck to v1.30.0 (#24304)
82c8e569
minfhong-quic [QNN-EP] Fix ONNX context model helper. (#24271)
1cb53d00
fs-eire [WebGPU] fix Pad cache key (#24305)
318cc87f
dependabot[bot] Bump vite from 6.2.4 to 6.2.5 in /js/web/test/e2e/exports/testcases/v…
56f10183
fs-eire [WebGPU] fix cache key of AttentionProbs/VxAttentionScore (#24309)
2e94c5a4
titaiwangms Support Gemma3 with Clip fused attention (#24280)
e944379e
fs-eire Update packaging pipeline for Nodejs binding (#24301)
11fda2ad
sushraja-msft Add support for uint8_t as data type for GatherBlockQuantized (#24239)
a4976e33
satyajandhyala [Native WebGPU] Add Conv, ConTranspose and FusedConv (#24186)
9102aaee
fs-eire [webgpu][dawn API optimization] reduce number of calls to wgpuDeviceG…
a7e62d63
virajwad Fix 'minimal_power' to 'minimum_power' for DirectML performance selec…
55c1a3b0
satyajandhyala Add ConvTranspose cache key (#24317)
d6df4f29
qjia7 [webgpu] Use 1D dispatch groups for attention (#24228)
a1186f63
fs-eire [webgpu][dawn API optimization] reduce number of calls to buffer APIs…
73676fc5
yuslepukhin Implement load cancellation ability (#24257)
350d1400
xhcao [webgpu] Fix ROUND_PREFER_CEIL issue of Resize operator (#24229)
ca1b32df
satyajandhyala [Native WebGPU] Exclude WebGPU EP from ConvFp16 3D tests. (#24327)
b803429a
zz002 [VitisAI EP] export InferShapes to VitisAIEP (#23881)
554fb4ad
qjia7 [webgpu] Flash attention for generation (#23808)
18f91e55
fanchenkong1 Use WASM f32x4 relaxed min/max for relaxed simd build (#24324)
04e0b50c
guschmue webgpu support for DequantizeLinear (#24268)
f83e6618
xhcao [webgpu] fix the reflect mode issue of Pad (#24202)
10e51d26
kevinch-nv Remove explicit batch network flag for TRT 10+ (#24298)
4edada60
jchen10 [webgpu] Fix bias_split_gelu (#24342)
22656134
jchen10 [webgpu] fix bias-add (#24336)
34abb8b4
xhcao [webgpu] optimize SkipLayerNormalization operator (#24164)
0acb0488
jagadish-amd ROCm: Remove -Wno-interference-size compiler flag (#24326)
d7a38a57
fs-eire [web] revise flag `ort.env.wasm.simd` (#24314)
39e585ff
mschofie Merge 'main' into 'win-ort-main' @ "39e585ff2b:[web] revise flag `ort…
529a34e6
mschofie requested a review from ashrit-ms 1 year ago
mschofie requested a review 1 year ago
snnn
mschofie
mschofie merged f03e3c7d into win-ort-main 1 year ago
mschofie deleted the mschofie/merge-1.22-current branch 1 year ago
