openvino
3ac45ad5 - [CPU] Implement X-Attention for intel CPU (#32086)

Commit
183 days ago
### Details:
- *Implement X-Attention, which runs in the pre-inference stage before PagedAttention in the attention pipeline and generates sparse attention blocks to accelerate long-prompt inference.*

### Tickets:
- *CVS-171072*

---------

Co-authored-by: liubo-intel <bo4.liu@intel.com>