llama.cpp
3dc7397a - CANN: fix RoPE cache issue on multi-device (#15629)

Commit
6 days ago
CANN: fix RoPE cache issue on multi-device (#15629)

* CANN: fix RoPE cache issue on multi-device

  RoPE cache only needs to be computed once per token. However, in multi-device scenarios, not every device starts computation from layer 0, which may lead to unallocated memory issues and precision errors.

  This commit records the first layer of each device to avoid the above issues.

* CANN: Optimize first-layer detection method
* CANN: Remove trailing whitespace
* CANN: Only cache the data that can be determined as unchanged through the parameters.
* CANN: Update function comment
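
The idea described in the commit message can be illustrated with a minimal sketch. This is not the code from the CANN backend: `RopeCache`, `rope_cache_update`, and all field names are hypothetical, and the sin/cos tables use the textbook RoPE frequency formula purely as a placeholder. It assumes each device owns one cache and that the first layer index a device sees is the layer where it starts computation.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical per-device RoPE cache (illustrative names, not the backend's).
struct RopeCache {
    int first_layer = -1;     // first layer this device executed (-1 = not yet known)
    int recompute_count = 0;  // number of times the tables were rebuilt
    std::vector<float> sin_table;
    std::vector<float> cos_table;
};

// Rebuilds the sin/cos tables only when called for the device's first layer.
// Returns true if the tables were rebuilt, false if they were reused.
bool rope_cache_update(RopeCache & cache, int layer, int position,
                       int head_dim, float theta_base = 10000.0f) {
    assert(head_dim > 0 && head_dim % 2 == 0);

    // Record the first layer seen on this device instead of assuming layer 0.
    if (cache.first_layer < 0) {
        cache.first_layer = layer;
    }

    // Any later layer on the same device reuses the tables for this token.
    if (layer != cache.first_layer) {
        return false;
    }

    const std::size_t half = static_cast<std::size_t>(head_dim) / 2;
    cache.sin_table.resize(half);
    cache.cos_table.resize(half);
    for (std::size_t i = 0; i < half; ++i) {
        const float freq = std::pow(theta_base,
            -2.0f * static_cast<float>(i) / static_cast<float>(head_dim));
        const float angle = static_cast<float>(position) * freq;
        cache.sin_table[i] = std::sin(angle);
        cache.cos_table[i] = std::cos(angle);
    }
    ++cache.recompute_count;
    return true;
}
```

With this sketch, a device whose first layer is 16 fills its tables when layer 16 runs and reuses them for layers 17 and up. A check hard-coded to layer 0 would never fire on that device, which matches the unallocated-memory and precision problems the commit message describes.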