flax
1c27546e - Fix promotion bug in MultiHeadDotProductAttention:

Commit
2 years ago
Fix promotion bug in MultiHeadDotProductAttention:

* If x64 is enabled, autoregressive decoding would lead to dynamic slicing errors
* Add explicit types for dynamic_slice arguments
* Add test for x64 autoregressive decoding

Amended for more idiomatic/correct test case
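The "explicit types" idea in the message can be illustrated with a minimal sketch. This is not the diff from this commit: the helper name `write_to_cache`, the cache layout `(batch, max_len, heads, depth)`, and the sample values are all assumptions made for illustration. The point it shows is that every start index passed to the dynamic slice call is given the same dtype as the decoding position, so that bare Python `0` literals cannot be promoted to a different integer width when x64 is enabled.

```python
import jax
import jax.numpy as jnp
from jax import lax


def write_to_cache(cache, update, cur_index):
    """Write a one-step update into a cache at position cur_index.

    Hypothetical helper; assumes cache is (batch, max_len, heads, depth)
    and update is (batch, 1, heads, depth).
    """
    # Build the zero offsets with the same dtype as cur_index instead of
    # using Python ints, so all start indices share one integer type.
    zero = jnp.zeros((), dtype=cur_index.dtype)
    indices = (zero, cur_index, zero, zero)
    return lax.dynamic_update_slice(cache, update, indices)


# Illustrative values only.
cache = jnp.zeros((2, 4, 1, 3), dtype=jnp.float32)
update = jnp.ones((2, 1, 1, 3), dtype=jnp.float32)
cur_index = jnp.array(2, dtype=jnp.int32)

new_cache = jax.jit(write_to_cache)(cache, update, cur_index)
```

The sketch behaves the same with or without x64 enabled, since the index dtypes no longer depend on how Python integers are promoted.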