Whisper Redesigned Solution #23549
Add support for creating optimized whisper ONNX models without beam s…
f3142875
Fix incorrect dynamic axes labels
6a44f72c
Fix fusion breaks for OpenAI implementation of Whisper
58ec5eb5
Merge branch 'main' into kvaishnavi/whisper-separate-export
4c228ea3
Merge branch 'main' into kvaishnavi/whisper-separate-export
dd20876a
Comment out DMMHA case temporarily
b13cb22f
Replace MHA with DMMHA
31db1a03
Merge branch 'main' into kvaishnavi/whisper-separate-export
3b924328
Debugging beam search output
7bb79f30
Initial commit for new export
14b7e77f
Add parity check after export and optimization
fa345fec
Fix multiple attention kernel invocations
e050dea4
Make output Q*K values optional
bf87062a
Fix batch size check for cache indirection
17fa0ab8
Save checkpoint for working solution
52aeb58e
Clean up code
240fe3b3
Fix string dumping
ae980850
Fix out_qk dtype issue for half input case.
3d2c8fe6
Remove type cast for output QK
287151ff
Enable release mode build
0805d1d2
Make QK output dtype independent of attention dtype
b6299035
Add batched jump times export
648b3899
Get batched jump times ONNX model with parity check
a6c6ee8c
Save checkpoint for working solution
c0a6ce45
Merge branch 'main' into kvaishnavi/whisper-separate-export
008eeb96
Fix build after merge
158d0a84
Fix model with beam search op
02cb5be7
Get model impl and beam search op export combinations working
2acd593c
Enable separate export of encoder and decoder init
612eb0c3
Add tests for multiple export types to CIs
f2d78fd7
Update folder and file names in Whisper README
cb935170
Add FP32 CPU DMMHA support
6da11ec8
Add unit tests
9640736c
Merge branch 'main' into kvaishnavi/whisper-separate-export
75a342ad
Change debug message for PrepareQkv
7fe6b05e
Fix seqlens_k after merge
86201689
Merge branch 'main' into kvaishnavi/whisper-separate-export
b0a732ba
Add changes suggested by linter
23808f73
Fix bug in FP32 CPU jump times model
906023d6
Merge branch 'main' into kvaishnavi/whisper-separate-export
3ed4bf26
Add changes from PR feedback
fae3dd82
Merge branch 'main' into kvaishnavi/whisper-separate-export
3c84fd46
Compare token ids outputs of various shapes
f3003fba
Fix MHA unit test failures
f8c04fe5
Fix Whisper fusion tests
52e0fd0a
Remove debugging code line
78a07876
Fix more CI unit tests
4f68e404
Fix CI build errors
33183a7e
Fix more CI build errors
bd38ccc6
Add ninja to docker image and update docs
65b1739a
Fix typo with package name
53a470c9
Upgrade to CUDA 12.1 in CIs
130626fd
Attempt to upgrade to CUDA 12.4
9e20aea1
Revert back to CUDA 11.8 in CIs
f6eabd45
Fix typo in TRT version when reverting
e443d70d
Merge branch 'main' into kvaishnavi/whisper-separate-export
ab966834
Add changes based on PR feedback
11a69fcc
Merge branch 'main' into kvaishnavi/whisper-separate-export
f04bd0bc
Rename from FT causal attention to decoder attention
0adafe70
Fix Python linter error
460e7e02
Update buffer sharing definition
3ed3a47f
Update MHA op spec
09d9fef6
Update MHA op spec again
fb18f804
Update wording in MHA op spec details
f6aee5f6
Fix typo in wording
29396f15
Remove unnecessary commas in op spec
1748624a
Update docs after op spec changes
edf30d0a
tianleiwu approved these changes on 2025-03-15
Assignees: No one assigned