onnxruntime
[JS/WebGPU] Improve MatMulNBits perf
#19974
Merged
Commits
58
Improve perf
satyajandhyala
committed
2 years ago
Fix lint error.
satyajandhyala
committed
2 years ago
Format
satyajandhyala
committed
2 years ago
Changes to make any combination of components work.
satyajandhyala
committed
2 years ago
Perform blockwise matmul
satyajandhyala
committed
2 years ago
format
satyajandhyala
committed
2 years ago
Fixed some errors.
satyajandhyala
committed
2 years ago
Added workgroupSize and dispatchGroup.
satyajandhyala
committed
2 years ago
Use bit operations instead of multiplications and divisions
satyajandhyala
committed
2 years ago
Added maxComputeWorkgroupSizes function to retrieve workgroup size limits.
satyajandhyala
committed
2 years ago
Added batch dim
satyajandhyala
committed
2 years ago
Added batch support
satyajandhyala
committed
2 years ago
Removed separate reduce step.
satyajandhyala
committed
2 years ago
minor fix
satyajandhyala
committed
2 years ago
WIP: adding components.
satyajandhyala
committed
2 years ago
Format
satyajandhyala
committed
2 years ago
Added outputNumber back.
satyajandhyala
committed
2 years ago
Only the leading shader in the workgroup needs to write output.
satyajandhyala
committed
2 years ago
Prefetch necessary input tensor data
satyajandhyala
committed
2 years ago
Unroll innermost loops to reduce loop overhead
satyajandhyala
committed
2 years ago
Removed functional call overhead.
satyajandhyala
committed
2 years ago
Added getMaxWorkgroupStorageSize
satyajandhyala
committed
2 years ago
Compute workgroupSizeX as multiple of nBlocksPerCol
satyajandhyala
committed
2 years ago
Removed unused uniforms.
satyajandhyala
committed
2 years ago
Removed outputNumber
satyajandhyala
committed
2 years ago
Removed block_size variable
satyajandhyala
committed
2 years ago
Choose components based on memory availability and produced fatal error
satyajandhyala
committed
2 years ago
Reroll the last loop nest
satyajandhyala
committed
2 years ago
Added fallback option to blockwise matmulnbits
satyajandhyala
committed
2 years ago
Removed unused variable.
satyajandhyala
committed
2 years ago
typo
satyajandhyala
committed
1 year ago
Temporary commit
satyajandhyala
committed
1 year ago
Code optimization and clean up.
satyajandhyala
committed
1 year ago
Modified getMaxComponents to accept arbitrary number of arguments.
satyajandhyala
committed
1 year ago
Added rectangular output testcases.
satyajandhyala
committed
1 year ago
Prefer using BlockwiseMatMulNBits.
satyajandhyala
committed
1 year ago
Removed workgroup shared memory initialization to 0.
satyajandhyala
committed
1 year ago
Performance tuning
satyajandhyala
committed
1 year ago
Removed pre-fetching input data.
satyajandhyala
committed
1 year ago
Re-roll the for loops.
satyajandhyala
committed
1 year ago
Prefer additions over multiplications.
satyajandhyala
committed
1 year ago
Fixed hint for the fallback
satyajandhyala
committed
1 year ago
Use unpack4xU8
satyajandhyala
committed
1 year ago
Load 8 elements of input at a time
satyajandhyala
committed
1 year ago
Fixed zero_point offset calculation.
satyajandhyala
committed
1 year ago
Use near multiple of 4 when calculating components.
satyajandhyala
committed
1 year ago
Deal with odd numbers.
satyajandhyala
committed
1 year ago
Renamed variables to row and col instead of m and n
satyajandhyala
committed
1 year ago
Added processOneBlock to refactor code.
satyajandhyala
committed
1 year ago
Added bBlocksPerCol and blobSize to attributes to avoid recalculating.
satyajandhyala
committed
1 year ago
Added missing semicolon
satyajandhyala
committed
1 year ago
Simplified component calculation
satyajandhyala
committed
1 year ago
Cleaned-up uniforms
satyajandhyala
committed
1 year ago
Removed backup file added by mistake
satyajandhyala
committed
1 year ago
minor change
satyajandhyala
committed
1 year ago
Revert "Added bBlocksPerCol and blobSize to attributes to avoid recalculating."
satyajandhyala
committed
1 year ago
Reverted changes to getMaxComponents.
satyajandhyala
committed
1 year ago
Format
satyajandhyala
committed
1 year ago