[WIP] Allow for attention caching during CoCa generation #502
initial caching for generation
ecb93d95
remove timing
8ad92401
fix beamsearch caching
71d01f52
WIP Setup base for text encoder caching
09e3fec8
Merge branch 'mlfoundations:main' into inference_caching
df85d0c4
Fix transformer caching default
2c961e6d
avoid passing cache when not necessary
69a936ae
simplify caching argument
8dc14ed8
sramshetty marked this pull request as draft 3 years ago
fix transformer cache typing for true branch
5ed4cf21
reorder typing to address list invariance
c0686dee
remove unnecessary placeholder lists
e561c8f6