onnxruntime
[CPU/CUDA EP] Add DeformConv op support
#27393
Merged

ShirasawaSama changed the title from "Feature/add deform conv 2d support" to "Add deform conv 2d support" 87 days ago
fs-eire requested a review from copilot-pull-request-reviewer 86 days ago
copilot-pull-request-reviewer commented on 2026-02-20
ShirasawaSama changed the title from "Add deform conv 2d support" to "[CPU/CUDA EP] Add DeformConv op support" 83 days ago
ShirasawaSama force pushed from 2d85c093 to bb17da52 80 days ago
github-advanced-security commented on 2026-02-28
tianleiwu commented on 2026-02-28
ShirasawaSama force pushed from bb17da52 to 7d2b779e 77 days ago
ShirasawaSama marked this pull request as draft 77 days ago
ShirasawaSama force pushed from 1e5babad to 1222ad4a 77 days ago
github-advanced-security commented on 2026-03-01
ShirasawaSama marked this pull request as ready for review 75 days ago
ShirasawaSama requested a review from tianleiwu 75 days ago
github-advanced-security commented on 2026-03-05
ShirasawaSama marked this pull request as draft 73 days ago
ShirasawaSama marked this pull request as ready for review 71 days ago
ShirasawaSama force pushed from 9d7d29dc to cbe1eca8 71 days ago
tianleiwu dismissed these changes on 2026-03-11
tianleiwu requested a review from copilot-pull-request-reviewer 67 days ago
tianleiwu dismissed their stale review 67 days ago
please address remaining issues
copilot-pull-request-reviewer commented on 2026-03-11
ShirasawaSama marked this pull request as draft 66 days ago
ShirasawaSama marked this pull request as ready for review 66 days ago
ShirasawaSama force pushed from 7adfb05a to c5d86e71 64 days ago
tianleiwu requested a review from copilot-pull-request-reviewer 62 days ago
copilot-pull-request-reviewer commented on 2026-03-16
tianleiwu dismissed these changes on 2026-03-16
ShirasawaSama dismissed their stale review via d7127544 62 days ago
tianleiwu dismissed these changes on 2026-03-17
tianleiwu enabled auto-merge (squash) 60 days ago
disabled auto-merge 60 days ago
Head branch was pushed to by a user without write access
ShirasawaSama dismissed their stale review via 454eea1d 60 days ago
ShirasawaSama Add deform conv 2d cpu execution provider support
4790b094
ShirasawaSama Add more tests
abfec39e
ShirasawaSama Add cuda support for deformconv2d
a0c50604
ShirasawaSama Improve deformconv cuda pref
dd8e7f1c
ShirasawaSama Add more test cases
c5bd48af
ShirasawaSama Fix copilot suggestions
952b3a12
ShirasawaSama Fix default attrs value of DeformConv
e5c043c6
ShirasawaSama Fix schema definition for DeformConv op
eee517da
ShirasawaSama Refactor DeformConv test cases
12b19c8a
ShirasawaSama Fix OrtMemTypeCPUInput issue and add cuda error check
d6c19be5
ShirasawaSama Remove GemmEx double specialization
12fd042b
ShirasawaSama Fix potential integer overflow in CUDA DeformableIm2ColKernel
9b069e33
ShirasawaSama Optimize CPU DeformableIm2Col loop order for better cache locality
cbadf131
ShirasawaSama Parallelize CPU DeformConv Im2Col and bias addition
a9515683
ShirasawaSama Use GPU free memory in DeformConv temp memory heuristic
f1a98325
ShirasawaSama Extract DeformConvAttributes to shared header
d99994ff
ShirasawaSama DeformConv op shared attributes and validation
7d7f66ea
ShirasawaSama Refactor attributes/validation and optimize CUDA DeformConvIm2Col kernel
8b5a13f5
ShirasawaSama Add DeformConv OnnxModelTest with reference ONNX model
e5ec6def
ShirasawaSama Optimize GetGreatestDivisorBelowBound in CUDA DeformConv
14cf455d
ShirasawaSama Document symmetric-padding-only limitation in deform_conv_test_gen
4121178a
ShirasawaSama Skip cuda DeformConv op copy kernel when cur_parallel==1
df9d0b10
ShirasawaSama Reformat code
03cc5e5b
ShirasawaSama Fix cuda fp16 test cases
d9f65fb7
ShirasawaSama Fix int64_t to ptrdiff_t conversion in deform_conv
15fe856f
ShirasawaSama Resolve pipeline failures caused by unit tests
931c3862
ShirasawaSama Add comments and handle unused variables
fedd3898
ShirasawaSama Address review feedback and align with Conv behavior
7ebc4988
ShirasawaSama Optimize DeformConv cpu bias add with Eigen SIMD
0479aded
ShirasawaSama Document GEMM layout trick in DeformConv cuBLAS path
f7819f14
ShirasawaSama Use int64_t for bilinear interpolation indices
34fae7d1
ShirasawaSama refactor(DeformConv CPU): template UseMask and improve im2col perform…
173fd6be
ShirasawaSama perf(DeformConv CPU): optimize im2col and BilinearInterpolate
33e4866b
ShirasawaSama Early OOB check for BilinearInterpolate
a482eb5e
ShirasawaSama Shrink DeformConv CUDA mutex to UpdateState only
b46f922c
ShirasawaSama Use cublasGemmStridedBatched for gemm_writes_directly path in DeformC…
82d12283
ShirasawaSama Fix var name
da18ee30
ShirasawaSama Drop mask==0 branch in im2col to match CPU behavior
dcd00c30
ShirasawaSama Fix C4244 in deform_conv_op_test by casting rtol/atol to float
f2d8f5df
ShirasawaSama Use cached totalGlobalMem for temp budget, remove cudaMemGetInfo and …
a25c1f4a
ShirasawaSama Document int indices in CUDA BilinearInterpolate
d98f3399
ShirasawaSama Document prime-batch fallback to single-image chunks in DeformConv Ge…
c2bf6f40
ShirasawaSama Refine deform conv test generator imports and ONNX model save usage
a8920b42
ShirasawaSama Optimize DeformConv CPU bilinear interpolation
b7b46813
ShirasawaSama Enforce 2D attribute lengths and validate kernel_shape/pads/overflow-…
46f176c4
ShirasawaSama Clarify DeformConv OnnxModelTest comment as ORT-reference smoke test
7cd167b2
ShirasawaSama DeformConv EmptyBatch test expects failure when batch size N is zero
ada5ca39
ShirasawaSama Allow DeformConv empty batch
17b155ac
ShirasawaSama Update docs
288e4c02
ShirasawaSama force pushed from 454eea1d to 288e4c02 55 days ago
tianleiwu enabled auto-merge (squash) 55 days ago
tianleiwu approved these changes on 2026-03-23
tianleiwu merged 163f6149 into main 55 days ago
