onnxruntime
Allow FP16 math in flash attention
#24953
Merged


sushraja-msft merged 3 commits into main from user/sushraja/fp16_fa
sushraja-msft Return back to fp16 fa
7e1d816b
sushraja-msft requested a review from guschmue 1 year ago
sushraja-msft requested a review from qjia7 1 year ago
guschmue added ep:WebGPU
fs-eire commented on 2025-06-04
sushraja-msft Make the min_value precision dependent
63f422b2
github-actions commented on 2025-06-04
sushraja-msft Update onnxruntime/contrib_ops/webgpu/bert/flash_attention.cc
18cd745a
qjia7 approved these changes on 2025-06-05
fs-eire approved these changes on 2025-06-05
guschmue approved these changes on 2025-06-05
sushraja-msft merged 1c577b71 into main 363 days ago
sushraja-msft deleted the user/sushraja/fp16_fa branch 363 days ago

