llama.cpp
9961d244 - CANN: Resolve soft_max precision issue (#15730)

Commit
120 days ago
CANN: Resolve soft_max precision issue (#15730)

Previously, the slope tensor was set to fp16 to improve efficiency. While this worked correctly in FA (flash attention), it caused precision issues in soft_max. This change applies different data types for different operators to balance accuracy and performance.
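
The size of the rounding error that an fp16 slope introduces can be illustrated with a small sketch. This is not the CANN backend code. It assumes ALiBi-style per-head slopes of the form 2^(-max_bias * (h + 1) / n_head), with hypothetical values n_head = 32 and max_bias = 8.0, and it uses Python's struct module to round each slope to half and single precision. The helper names are made up for illustration.

```python
import struct


def round_to_fp16(x: float) -> float:
    """Round x to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def round_to_fp32(x: float) -> float:
    """Round x to the nearest IEEE 754 single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def alibi_slopes(n_head: int, max_bias: float) -> list:
    """Assumed ALiBi-style slopes; n_head is taken to be a power of two."""
    return [2.0 ** (-max_bias * (h + 1) / n_head) for h in range(n_head)]


def max_relative_error(values, rounder) -> float:
    """Largest relative error introduced by storing values via rounder."""
    return max(abs(rounder(v) - v) / v for v in values)


# Hypothetical configuration, chosen only so that most slopes are not
# exact powers of two (those would be representable without error).
slopes = alibi_slopes(n_head=32, max_bias=8.0)

err_fp16 = max_relative_error(slopes, round_to_fp16)
err_fp32 = max_relative_error(slopes, round_to_fp32)

print(f"max relative error with fp16 slopes: {err_fp16:.3e}")
print(f"max relative error with fp32 slopes: {err_fp32:.3e}")
```

Under these assumptions the fp16 error is bounded by 2^-11 per slope, several orders of magnitude above the fp32 bound of 2^-24. Because the slope multiplies the mask before the exponential in soft_max, a relative error of that size carries into the logits, which is consistent with the precision issue the commit describes.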