onnxruntime
cherry picks for 1.23.0 release
#25959
Merged

jywu-msft merged 20 commits into rel-1.23.0 from tlwu/1.23_cherry_pick
tianleiwu [CUDA] Support SwiGlu in MoE and qMoE (#25530)
b3664f88
tianleiwu [CUDA] BF16 MoE and qMoE (#25572)
a8e1186e
xiaomsft Add CUDA implementation of GatherBlockQuantized operator (#25575)
a9f74a04
apsonawane Add support for QMoE in CPU (#25558)
d83904bb
tianleiwu Update MoE and qMoE spec (#25619)
86542411
apsonawane [CPU] Improve QMoE kernel (#25822)
6ca2047b
apsonawane Fix MoE CPP tests (#25877)
dd32daf7
psakhamoori Add custom ops library_path to EP metadata (#25830)
581b8e74
mingyueliuh [Fix] illegal memory access in GetInputIndices with optional inputs (…
a9308a16
gedoensmax [TRT RTX EP] Add sync method (#25898)
6c7f150d
gedoensmax [TRT RTX EP] Memory map the engine buffer (#25909)
535fcc62
gedoensmax [TRT RTX EP] Add support for RTX runtime caches (#25917)
1f4e581a
adrianlizarraga Compile API: disable optimizations by default (#25474)
9732a3e5
yuslepukhin [CXX] Introduce C++ API for new C entry points (#25897)
df25f456
kobby-kobbs Migrate model tests to ONNX Model ZOO only (#25888)
8f587b13
yuslepukhin Remove std::string::data() non-const usage from public headers (#25943)
ab71f1e1
adrianlizarraga Compile API: output model and initializer stream write functions (#25…
2d36f04b
praneshgo [TRT RTX EP] Fixing the stream parameter in CopyTensors API and passi…
c5096d9b
hariharans29 [MLAS] Add 8-bit weights ARM64 Gemm implementation (#25110)
5ee309eb
ishwar-raut1 [NV TensorRT RTX] Handle unsupported data types (#25953)
157df9c6
adrianlizarraga approved these changes on 2025-09-05
jywu-msft approved these changes on 2025-09-05
jywu-msft merged 491f0c19 into rel-1.23.0 254 days ago
jywu-msft deleted the tlwu/1.23_cherry_pick branch 254 days ago
