Whisper Model Optimization #15473
Work in progress
bc953272
Work in progress 2
d95d4a3a
Work in progress 3
dc0c9182
Work in progress 4
aa363902
Work in progress 5
65ea4369
Work in progress 6
c3b2564b
Work in progress 7
8c309832
Work in progress 8
b3d1e261
Work in progress 9
2a243762
Work in progress 10
8aea1da5
Work in progress 11
5de03318
Work in progress 12
dedd007e
Work in progress 13
a7bff6b0
Cleaning up comments
2fa2201b
Cleaning up more comments
53416bf4
Merge branch 'main' into dev
ea23e017
Merge branch 'microsoft:main' into dev
bf7f23fb
Fixing a few issues after merging with main
7e7b19f1
Fix multihead attention flag
4bf560a2
Changing attention fusion in decoder with past to multihead attention…
5ef69a53
Fix separating present KV into present K and present V
09235baf
Adding test cases, fusion changes, and kernel changes
911768c6
Removing commented out code
f8389eb5
Remove QKV format assert
96e061c8
Remove condition for memory efficient attention
c106f32c
Adding onnx test files
b2f3d990
Merge branch 'main' into dev
406a5d97
Add ORT return if error
4003653d
Fix allocator naming and casting
d1aaa560
Fix casting and remove extra parameter
9b341cf7
Fix CodeQL scan errors and convert value to float
ee32f88f
Fix test cases
7d36cae5
Fix more test cases
33299d12
Add whisper folder to build
0e5d42cc
Adding format changes suggested by linter
8c2b2a44
Remove extra parenthesis
2b002abe
Adding more format changes suggested by linter
03884306
Adding space and comma suggestions from linter
2a94bdf8
Fix allocator initialization
97aaedb3
Remove commented out line
dbda09d8
Merge branch 'main' into kvaishnavi/whisper
b65e6687
Remove packed qkv and simplify calculating present kv
52e34c85
Add changes suggested by new linter
a75c1214
Merge branch 'main' into kvaishnavi/whisper
70eab061
Add changes suggested by new C++ linter
6868e940
Merge branch 'main' into kvaishnavi/whisper
bc17d24d
tianleiwu approved these changes on 2023-04-19
Assignees
No one assigned