openvino
3ac45ad5 - [CPU] Implement X-Attention for intel CPU (#32086)

Commit

183 days ago

[CPU] Implement X-Attention for intel CPU (#32086)

### Details:
- *Implement X-Attention, which runs in the pre-inference stage before PagedAttention in the attention pipeline and generates sparse attention blocks to accelerate long-prompt inference*

### Tickets:
- *CVS-171072*

---------

Co-authored-by: liubo-intel <bo4.liu@intel.com>
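To illustrate what "generating sparse attention blocks" before the attention step can look like, here is a minimal pure-Python sketch. It is not the OpenVINO implementation: the function name `select_blocks`, its parameters, and the scoring rule are assumptions made for illustration. It assumes each (query block, key block) pair is scored by the sum of scaled dot products along the block's main antidiagonal, that scores are softmax-normalized per query block, and that the highest-probability key blocks are kept until their cumulative mass reaches a threshold. Strided antidiagonals and the causal mask inside the diagonal block are left out for brevity.

```python
import math


def select_blocks(q, k, block, threshold):
    """Hypothetical sketch: pick key blocks to attend to for each query block.

    q, k: lists of equal-length float vectors (seq_len x dim).
    block: block size; seq_len is assumed to be a multiple of it.
    threshold: cumulative softmax mass to cover per query block.
    Returns one sorted list of kept key-block indices per query block.
    """
    n_q = len(q) // block
    n_k = len(k) // block
    scale = 1.0 / math.sqrt(len(q[0]))
    selected = []
    for qb in range(n_q):
        # Block-level causality: only key blocks at or before the query block.
        candidates = [kb for kb in range(n_k) if kb <= qb]
        scores = []
        for kb in candidates:
            # Antidiagonal proxy: pair query row i with key row (block - 1 - i).
            s = 0.0
            for i in range(block):
                qi = q[qb * block + i]
                kj = k[kb * block + (block - 1 - i)]
                s += sum(a * b for a, b in zip(qi, kj)) * scale
            scores.append(s)
        # Numerically stable softmax over the block scores.
        m = max(scores)
        exps = [math.exp(s - m) for s in scores]
        total = sum(exps)
        probs = [e / total for e in exps]
        # Keep the most probable blocks until the threshold is covered.
        order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
        kept, mass = [], 0.0
        for idx in order:
            kept.append(candidates[idx])
            mass += probs[idx]
            if mass >= threshold:
                break
        selected.append(sorted(kept))
    return selected
```

A downstream attention kernel would then compute full attention only for the kept (query block, key block) pairs, which is where the saving on long prompts would come from under these assumptions.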

References

#32086 - [CPU] Implement X-Attention for intel CPU

Author

mangguo321

Parents

3d46a4b3
