looks neat from a (very) superficial glance
I think this will be quite useful!
(and yes, we should remove the old `ConversationalPipeline` sooner rather than later, given that it already doesn't work anymore due to the conversational pipeline-type being removed from the Hub, IIUC)
@julien-c Done! This PR now adds a `DeprecationWarning` to `ConversationalPipeline`. I also updated the chat template docs for the new pipeline.
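For reference, the warning itself is just a stdlib `warnings` call; a minimal sketch (the message below is my paraphrase, not the PR's exact wording):

```python
import warnings

warnings.warn(
    "ConversationalPipeline is deprecated and will be removed in a future "
    "release. Use the text-generation pipeline with a list of message dicts "
    "instead.",  # paraphrased message, not the PR's exact text
    DeprecationWarning,
)
```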
very nice!
Nice! Thanks for adding this!
```python
if isinstance(text_inputs[0], dict):
    return super().__call__(Chat(text_inputs), **kwargs)
else:
    chats = [Chat(chat) for chat in text_inputs]  # 🐈 🐈 🐈
```
best comment 🐈
One question for people, maybe @gante: Are you okay with the return format I'm using? Right now, if you pass a chat like this:
```python
[
    {"role": "system", "content": "This is a system message."},
    {"role": "user", "content": "This is a test"},
]
```
You get a response that's the same chat, continued:
```python
[
    {"role": "system", "content": "This is a system message."},
    {"role": "user", "content": "This is a test"},
    {"role": "assistant", "content": "This is a reply"},
]
```
I think this is the right thing to do, because it matches the behaviour of the existing `text-generation` pipeline (it returns the prompt at the start of the generated string). Let me know if you have a different opinion, though!
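For concreteness, here's a usage sketch of that round-trip (the model name and the `generated_text` output key are assumptions on my part, not confirmed by this PR):

```python
from transformers import pipeline

# Hypothetical usage sketch: any chat model with a chat template would do.
generator = pipeline("text-generation", model="HuggingFaceH4/zephyr-7b-beta")
chat = [
    {"role": "system", "content": "This is a system message."},
    {"role": "user", "content": "This is a test"},
]
result = generator(chat)
# Under the proposed format, the output is the same chat, continued: the
# original messages followed by a new {"role": "assistant", ...} reply.
print(result[0]["generated_text"][-1]["content"])
```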
Looks good to me!
Cool!
In that case, I think we're ready for final review (cc @amyeroberts) - I'm leaving the KV cache to another PR.
cc @LysandreJik @julien-c as well if there's anything else you want me to add before we merge this!
Beautiful - thanks for adding this support!
```diff
      - **generated_token_ids** (`torch.Tensor` or `tf.Tensor`, present when `return_tensors=True`) -- The token
        ids of the generated text.
      """
-     return super().__call__(text_inputs, **kwargs)
+     if isinstance(text_inputs, (list, tuple)) and isinstance(text_inputs[0], (list, tuple, dict)):
```
Just to make sure - is it not possible for someone to pass this to the pipeline:
```python
# Pass a list-of-list-of-strings
generator([["this is a dog"], ["this is a code example"], ["banana for scale"]])
```
I tried that on `main` - it just results in a `TypeError: can only concatenate str (not "list") to str`. The existing pipeline will only accept either a single string or a non-nested list/tuple of strings, so I don't think this check makes a mistake for any valid inputs!
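In other words (a quick sketch of the accepted shapes, using the same hypothetical `generator` as above):

```python
generator("this is a dog")                        # single string: accepted
generator(["this is a dog", "banana for scale"])  # flat list of strings: accepted
# A nested list of strings already raises a TypeError on main, so the new
# isinstance check can't misclassify any previously valid input.
```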
This PR modifies the text generation pipeline to support chats. It does this by inspecting the inputs - if they look like strings, it uses the original causal LM pipeline, and if they look like lists of message dicts, it applies a chat template instead before proceeding with generation.
Most changes are in the preprocessing/postprocessing - the actual generation itself is largely unchanged.
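As a rough sketch of the chat path's preprocessing (not the PR's exact code; the model choice here is an assumption, though `apply_chat_template` is the existing tokenizer API):

```python
from transformers import AutoTokenizer

# If the input looks like a chat (a list of message dicts), render it with
# the tokenizer's chat template before generation; plain strings skip this.
tokenizer = AutoTokenizer.from_pretrained("HuggingFaceH4/zephyr-7b-beta")
messages = [
    {"role": "system", "content": "This is a system message."},
    {"role": "user", "content": "This is a test"},
]
prompt_text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,  # append the assistant prefix so the model replies next
)
# prompt_text is now an ordinary string, so the rest of the text-generation
# pipeline (tokenize, generate, decode) runs unchanged.
```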
TODO:

- Add KV cache support, as this is important for performant multi-turn chat
- Deprecate `ConversationalPipeline` and update the chat template docs to refer to this instead?

cc @ArthurZucker @gante @LysandreJik