onnxruntime
QAttention calls into MatMulIntToFloat instead of Dequantize+GEMM
#16851
Merged


zhangxiang1993
PatriceVignola commented on 2023-07-25 (6 comments)
jeffbloo force-pushed the DmlPrototype branch from eb6222b2 to 0790b051 2 years ago
jeffbloo requested a review 2 years ago (4 requests)
zhangxiang1993 QAttention calls into MatMulIntToFloat instead of Dequantize+GEMM
23ad7917
zhangxiang1993 rebase DmlPrototype
98bd750b
zhangxiang1993 force-pushed from 56199c76 to 98bd750b 2 years ago
zhangxiang1993 consistent style
f2aff002
PatriceVignola approved these changes on 2023-07-26
zhangxiang1993 merged cbdd0bb7 into DmlPrototype 2 years ago
zhangxiang1993 deleted the user/xianz/QAttention_v2 branch 2 years ago
