Use f32 scratch for output so we only need to transfer output with desired dtype back to HBM. #8924
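The pattern named in the title can be sketched in plain Python. This is an illustrative sketch only, not the kernel code from this PR: the helper names (`to_f32`, `to_bf16`, `accumulate_in_f32_scratch`, `accumulate_in_output_dtype`) are hypothetical, bfloat16 is assumed as the "desired dtype", and rounding is simulated with `struct` rather than run on a TPU. It shows partial results being accumulated in an f32 scratch value and cast once at the end, so only the narrower dtype would need to be written back.

```python
import struct


def to_f32(x: float) -> float:
    """Round a Python float to the nearest float32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def to_bf16(x: float) -> float:
    """Round a finite float to bfloat16 (round-to-nearest-even).

    bfloat16 is the top 16 bits of a float32; NaN/inf handling is omitted.
    """
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    rounding_bias = 0x7FFF + ((bits >> 16) & 1)
    bits = ((bits + rounding_bias) >> 16) << 16
    return struct.unpack("<I".replace("I", "f"), struct.pack("<I", bits & 0xFFFFFFFF))[0]


def accumulate_in_f32_scratch(partials):
    """Accumulate in an f32 scratch value, then cast once to the output dtype."""
    scratch = 0.0  # hypothetical stand-in for an on-chip f32 scratch buffer
    for p in partials:
        scratch = to_f32(scratch + p)
    return to_bf16(scratch)  # single cast; only this value is written back


def accumulate_in_output_dtype(partials):
    """For contrast: round to the output dtype after every partial update."""
    acc = 0.0
    for p in partials:
        acc = to_bf16(acc + p)
    return acc


partials = [1.0] + [2.0 ** -9] * 256
print(accumulate_in_f32_scratch(partials))
print(accumulate_in_output_dtype(partials))
```

With these inputs, each small partial is below half a bfloat16 unit in the last place near 1.0, so the second function never moves off 1.0, while the f32 scratch version retains every contribution before the final cast.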
Use f32 scratch for output so we only need to transfer output with de…
4e100747
not to use dynamic grid
e2cebd5b
linter
6a53e0a9
vanbasten23 marked this pull request as ready for review 1 year ago
bythew3i approved these changes on 2025-04-02
change torch_xla wrapper
9383cbc5