Pull Requests microsoft/onnxruntime

Weightless support for all initializers

#29607 opened 2026-07-07 23:07 by chilo-ms

Follow standard CMake semantics for CUDA architecture suffixes

#29601 opened 2026-07-07 19:49 by Sammy-Dabbas

fix(webgpu): use f32 accumulators in fp16 MatMul/MatMulNBits to prevent overflow (#26732)

#29599 opened 2026-07-07 15:40 by RobertoReale

Additional validation in LSTM and DynamicQuantizedLSTM

#29595 opened 2026-07-07 08:47 by skottmckay

Add Intel WebGPU subgroup-matrix MatMul implementation

#29592 opened 2026-07-07 05:50 by jchen10

Fix usage of stream for memory pattern allocation with arena

#29589 opened 2026-07-07 04:24 by skottmckay

[CUDA] Add persistent fpA_intB MatMulNBits tactic autotune cache

#29588 opened 2026-07-07 03:50 by tianleiwu

webgpu: add MatMulBnb4 contrib op support

#29587 opened 2026-07-07 03:28 by xhcao

Add missing rank/length validation to pooling operators (fix OOB reads)

#29579 opened 2026-07-06 21:39 by titaiwangms

Bump transformers from 4.50.0 to 5.3.0 in /onnxruntime/python/tools/transformers/models/stable_diffusion/requirements (labels: dependencies, python)

#29578 opened 2026-07-06 21:16 by dependabot[bot]

Follow-up (#29504): guard C# node-test symlink against silent-green + document JS post-#7959 opset constraint

#29577 opened 2026-07-06 19:09 by titaiwangms

Update OpenVINO NPU dynamic external-initializer expectation

#29572 opened 2026-07-06 07:36 by GopalakrishnanN

Add dynamic-shape memory-pattern eligibility test

#29570 opened 2026-07-06 05:48 by GopalakrishnanN

Add verbose trace logs for graph transformer manager

#29569 opened 2026-07-06 05:38 by GopalakrishnanN

Add CPU-only MemcpyTransformer test with fake device EP

#29568 opened 2026-07-06 05:32 by GopalakrishnanN

Add arena shrinkage run-option diagnostics test

#29567 opened 2026-07-06 05:25 by GopalakrishnanN

Add allocation planner test for disabled memory reuse

#29565 opened 2026-07-06 05:12 by GopalakrishnanN

Add cross-layer include ratchet

#29564 opened 2026-07-06 04:57 by GopalakrishnanN

Add disabled-test count ratchet

#29563 opened 2026-07-06 04:56 by GopalakrishnanN

Add C-API exception-boundary checker

#29562 opened 2026-07-06 04:54 by GopalakrishnanN

Include originating function in C-API catch-all exception message

#29561 opened 2026-07-06 04:44 by GopalakrishnanN

Make element-wise fp16 tests skip explicitly when CUDA/CoreML are unavailable

#29560 opened 2026-07-06 04:22 by GopalakrishnanN

Extract InferenceSession input/output validation into free functions

#29559 opened 2026-07-06 03:44 by GopalakrishnanN

Add model-ingestion error-path tests for InferenceSession

#29558 opened 2026-07-06 02:31 by GopalakrishnanN

[WebGPU] Deferred-dispatch to parallelize cold-start shader compilation

#29557 opened 2026-07-06 02:11 by xiaofeihan1

Fix dead dedup guard in GraphTransformerManager::Register (audit F10)

#29556 opened 2026-07-06 02:05 by GopalakrishnanN

Skip GroupQueryAttentionFusion when GQA node has inputs beyond sin_cache

#29552 opened 2026-07-05 18:47 by Copilot

Fix uninterpolated error messages in per-channel TensorQuantOverrides validation

#29551 opened 2026-07-05 18:33 by nileshpatil6

Skip virtual DRM devices without PCI vendor info during Linux GPU discovery

#29543 opened 2026-07-04 10:14 by jcubic

Fuse BatchNormalization into ConvTranspose

#29542 opened 2026-07-04 02:53 by the0cp