onnxruntime
[WebNN EP] Support GroupQueryAttention(GQA)
#23416
Merged
Commits (15)
simple implementation for GQA (peishenyan committed 263 days ago)
add input and output cast when fp16 (peishenyan committed 263 days ago)
add comments (peishenyan committed 263 days ago)
add support for group query (peishenyan committed 263 days ago)
format code (peishenyan committed 263 days ago)
fix kv_num_heads bugs (peishenyan committed 263 days ago)
fix wrong variable name (peishenyan committed 263 days ago)
fix bugs (peishenyan committed 263 days ago)
fix reshape bugs (peishenyan committed 263 days ago)
skip total_sequence_length input for GQA op (peishenyan committed 263 days ago)
temp (peishenyan committed 263 days ago)
address comments and improve shape inference for GQA (peishenyan committed 263 days ago)
add constant creator for given array (peishenyan committed 263 days ago)
address comments (peishenyan committed 263 days ago)
update matMulNBits_op_builder.cc and remove unused header file (peishenyan committed 263 days ago)