llama.cpp
07ff0005 - CANN: add RoPE cache preload before ACL graph capture (#20747)

Commit
4 days ago
CANN: add RoPE cache preload before ACL graph capture (#20747)

ACL graph capture disallows host-to-device memcpy and device memory
malloc/free on the captured stream. Pre-load the RoPE cache before
capture so that:

- Host-to-device copies and allocations run on the non-captured stream
- Cache metadata is populated and the memory pool is warmed up
- During capture, only on-device computations are recorded; host-side
  and allocation branches are skipped
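
The ordering described in the commit message can be illustrated with a small, self-contained C++ mock. This is only a sketch under stated assumptions: MockStream, RopeCache, rope_forward and the two helper functions are hypothetical names invented for illustration, and none of them are the real ACL or ggml-cann APIs. The mock stream simply throws if an allocation or host-to-device copy is attempted while it is "capturing", which stands in for the restriction the commit works around.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for a device stream (NOT the real ACL API).
// While capturing, allocations and host-to-device copies are rejected.
struct MockStream {
    bool capturing = false;
    int recorded_kernels = 0;

    void device_alloc() {
        if (capturing) throw std::runtime_error("alloc during capture");
    }
    void host_to_device_copy() {
        if (capturing) throw std::runtime_error("H2D copy during capture");
    }
    void launch_kernel() {
        if (capturing) ++recorded_kernels;  // only captured launches are recorded
    }
};

// Hypothetical cache: a pretend device buffer plus the metadata used to
// decide whether the buffer has to be (re)built.
struct RopeCache {
    std::vector<float> device_buf;
    std::size_t cached_len = 0;
    bool populated = false;

    bool valid_for(std::size_t n) const { return populated && cached_len >= n; }
};

// One RoPE step: fill the cache if needed, then run the on-device kernel.
void rope_forward(MockStream &s, RopeCache &cache, const std::vector<float> &host_table) {
    if (!cache.valid_for(host_table.size())) {
        s.device_alloc();          // not allowed on a capturing stream
        s.host_to_device_copy();   // not allowed on a capturing stream
        cache.device_buf = host_table;
        cache.cached_len = host_table.size();
        cache.populated = true;
    }
    s.launch_kernel();
}

// Preload first, then capture: the cache branch is skipped during capture.
int capture_with_preload(std::size_t n) {
    MockStream s;
    RopeCache cache;
    std::vector<float> table(n, 1.0f);

    rope_forward(s, cache, table);  // preload outside of capture
    s.capturing = true;
    rope_forward(s, cache, table);  // cache is valid, only the kernel runs
    s.capturing = false;
    return s.recorded_kernels;
}

// Capture without preloading: the cache branch runs and the mock rejects it.
bool capture_without_preload_fails(std::size_t n) {
    MockStream s;
    RopeCache cache;
    std::vector<float> table(n, 1.0f);

    s.capturing = true;
    try {
        rope_forward(s, cache, table);
    } catch (const std::runtime_error &) {
        return true;
    }
    return false;
}
```

In this mock, calling capture_with_preload records exactly one kernel launch during capture, while capture_without_preload_fails trips the allocation check. The real change presumably achieves the same effect by populating the cache on the normal stream before graph capture begins; the details of how it does so are in the commit itself, not in this sketch.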