onnxruntime
[WebNN EP] Support GroupQueryAttention(GQA)
#23416
Merged

Commits
  • simple implementation for GQA
    peishenyan committed 263 days ago
  • add input and output cast when fp16
    peishenyan committed 263 days ago
  • add comments
    peishenyan committed 263 days ago
  • add support for group query
    peishenyan committed 263 days ago
  • format code
    peishenyan committed 263 days ago
  • fix kv_num_heads bugs
    peishenyan committed 263 days ago
  • fix wrong variable name
    peishenyan committed 263 days ago
  • fix bugs
    peishenyan committed 263 days ago
  • fix reshape bugs
    peishenyan committed 263 days ago
  • skip total_sequence_length input for GQA op
    peishenyan committed 263 days ago
  • temp
    peishenyan committed 263 days ago
  • address comments and improve shape inference for GQA
    peishenyan committed 263 days ago
  • add constant creator for given array
    peishenyan committed 263 days ago
  • address comments
    peishenyan committed 263 days ago
  • update matMulNBits_op_builder.cc and remove unused header file
    peishenyan committed 263 days ago