onnxruntime
Whisper Redesigned Solution
#23549
Merged

kunal-vaishnavi Add support for creating optimized whisper ONNX models without beam s…
f3142875
kunal-vaishnavi Fix incorrect dynamic axes labels
6a44f72c
kunal-vaishnavi Fix fusion breaks for OpenAI implementation of Whisper
58ec5eb5
kunal-vaishnavi Merge branch 'main' into kvaishnavi/whisper-separate-export
4c228ea3
kunal-vaishnavi Merge branch 'main' into kvaishnavi/whisper-separate-export
dd20876a
kunal-vaishnavi Comment out DMMHA case temporarily
b13cb22f
kunal-vaishnavi Replace MHA with DMMHA
31db1a03
kunal-vaishnavi Merge branch 'main' into kvaishnavi/whisper-separate-export
3b924328
kunal-vaishnavi Debugging beam search output
7bb79f30
kunal-vaishnavi Initial commit for new export
14b7e77f
kunal-vaishnavi Add parity check after export and optimization
fa345fec
kunal-vaishnavi Fix multiple attention kernel invocations
e050dea4
kunal-vaishnavi Make output Q*K values optional
bf87062a
kunal-vaishnavi Fix batch size check for cache indirection
17fa0ab8
kunal-vaishnavi Save checkpoint for working solution
52aeb58e
kunal-vaishnavi Clean up code
240fe3b3
kunal-vaishnavi Fix string dumping
ae980850
mindest Fix out_qk dtype issue for half input case.
3d2c8fe6
kunal-vaishnavi Remove type cast for output QK
287151ff
kunal-vaishnavi Enable release mode build
0805d1d2
kunal-vaishnavi Make QK output dtype independent of attention dtype
b6299035
kunal-vaishnavi Add batched jump times export
648b3899
kunal-vaishnavi Get batched jump times ONNX model with parity check
a6c6ee8c
kunal-vaishnavi Save checkpoint for working solution
c0a6ce45
kunal-vaishnavi Merge branch 'main' into kvaishnavi/whisper-separate-export
008eeb96
kunal-vaishnavi Fix build after merge
158d0a84
kunal-vaishnavi Fix model with beam search op
02cb5be7
kunal-vaishnavi Get model impl and beam search op export combinations working
2acd593c
kunal-vaishnavi Enable separate export of encoder and decoder init
612eb0c3
kunal-vaishnavi Add tests for multiple export types to CIs
f2d78fd7
kunal-vaishnavi Update folder and file names in Whisper README
cb935170
kunal-vaishnavi Add FP32 CPU DMMHA support
6da11ec8
kunal-vaishnavi Add unit tests
9640736c
kunal-vaishnavi Merge branch 'main' into kvaishnavi/whisper-separate-export
75a342ad
kunal-vaishnavi Change debug message for PrepareQkv
7fe6b05e
kunal-vaishnavi Fix seqlens_k after merge
86201689
kunal-vaishnavi Merge branch 'main' into kvaishnavi/whisper-separate-export
b0a732ba
kunal-vaishnavi Add changes suggested by linter
23808f73
github-advanced-security commented on 2025-01-31
tianleiwu commented on 2025-02-02
tianleiwu commented on 2025-02-02
tianleiwu commented on 2025-02-02
tianleiwu commented on 2025-02-02
tianleiwu commented on 2025-02-03
tianleiwu commented on 2025-02-04
shahidkhuram commented on 2025-02-05
shahidkhuram commented on 2025-02-05
kunal-vaishnavi Fix bug in FP32 CPU jump times model
906023d6
kunal-vaishnavi Merge branch 'main' into kvaishnavi/whisper-separate-export
3ed4bf26
kunal-vaishnavi Add changes from PR feedback
fae3dd82
kunal-vaishnavi Merge branch 'main' into kvaishnavi/whisper-separate-export
3c84fd46
kunal-vaishnavi Compare token ids outputs of various shapes
f3003fba
kunal-vaishnavi Fix MHA unit test failures
f8c04fe5
kunal-vaishnavi Fix Whisper fusion tests
52e0fd0a
kunal-vaishnavi Remove debugging code line
78a07876
kunal-vaishnavi Fix more CI unit tests
4f68e404
github-advanced-security commented on 2025-03-12
kunal-vaishnavi Fix CI build errors
33183a7e
kunal-vaishnavi Fix more CI build errors
bd38ccc6
kunal-vaishnavi Add ninja to docker image and update docs
65b1739a
kunal-vaishnavi Fix typo with package name
53a470c9
kunal-vaishnavi Upgrade to CUDA 12.1 in CIs
130626fd
kunal-vaishnavi Attempt to upgrade to CUDA 12.4
9e20aea1
kunal-vaishnavi Revert back to CUDA 11.8 in CIs
f6eabd45
kunal-vaishnavi Fix typo in TRT version when reverting
e443d70d
microsoft deleted a comment from azure-pipelines on 2025-03-14
kunal-vaishnavi Merge branch 'main' into kvaishnavi/whisper-separate-export
ab966834
microsoft deleted a comment from azure-pipelines on 2025-03-14
tianleiwu commented on 2025-03-14
tianleiwu commented on 2025-03-14
tianleiwu commented on 2025-03-14
tianleiwu commented on 2025-03-14
tianleiwu commented on 2025-03-14
tianleiwu commented on 2025-03-14
tianleiwu commented on 2025-03-14
kunal-vaishnavi Add changes based on PR feedback
11a69fcc
kunal-vaishnavi Merge branch 'main' into kvaishnavi/whisper-separate-export
f04bd0bc
kunal-vaishnavi Rename from FT causal attention to decoder attention
0adafe70
kunal-vaishnavi Fix Python linter error
460e7e02
tianleiwu commented on 2025-03-14
tianleiwu commented on 2025-03-14
kunal-vaishnavi Update buffer sharing definition
3ed3a47f
kunal-vaishnavi Update MHA op spec
09d9fef6
kunal-vaishnavi Update MHA op spec again
fb18f804
kunal-vaishnavi Update wording in MHA op spec details
f6aee5f6
kunal-vaishnavi Fix typo in wording
29396f15
kunal-vaishnavi Remove unnecessary commas in op spec
1748624a
kunal-vaishnavi Update docs after op spec changes
edf30d0a
tianleiwu approved these changes on 2025-03-15
kunal-vaishnavi merged 7942fa7a into main 1 year ago
