text-generation-inference
5cd8025f - hotfix: fix regression of attention api change in intel platform (#2439)

Commit
1 year ago
Fix a regression caused by the attention API change: ipex.varlen_attention does not currently support paged-cache format KV input.

Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>