Fix QuestionListOutputParser (#9738)
This PR fixes `QuestionListOutputParser` text splitting.
`QuestionListOutputParser` incorrectly splits numbered list text into
lines. If the text doesn't end with `\n`, the regex doesn't capture the
last item. In that case it returns `n - 1` items, and
`WebResearchRetriever.llm_chain` generates fewer queries than requested
in the search prompt.
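A minimal sketch of the failure mode using only the standard `re` module. The two patterns below are assumptions chosen to illustrate the behavior described above; they are not quoted from this PR's diff.

```python
import re

# Hypothetical patterns for illustration only.
# The first requires every item to end with "\n"; the second also
# accepts end-of-string as an item terminator.
NEWLINE_ONLY = r"\d+\..*?\n"
NEWLINE_OR_END = r"\d+\..*?(?:\n|$)"

# Numbered list without a trailing newline.
text = "1. This is line one.\n2. This is line two."

dropped = re.findall(NEWLINE_ONLY, text)
kept = re.findall(NEWLINE_OR_END, text)

print(len(dropped), len(kept))
```

With the newline-only pattern the last item has no terminator to match, so it is silently dropped.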
How to reproduce:
```python
from langchain.retrievers.web_research import QuestionListOutputParser
parser = QuestionListOutputParser()
good = parser.parse(
"""1. This is line one.
2. This is line two.
"""  # <-- Ends with a newline.
)
bad = parser.parse(
"""1. This is line one.
2. This is line two."""  # <-- No trailing newline.
)
assert good.lines == ['1. This is line one.\n', '2. This is line two.\n'], good.lines
assert bad.lines == ['1. This is line one.\n', '2. This is line two.'], bad.lines
```
NOTE: The last item will not contain a line break, but this seems OK
because the items are stripped in
`WebResearchRetriever.clean_search_query()`.
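As a rough illustration, assuming the cleanup amounts to a plain `str.strip()` (the actual body of `clean_search_query()` is not shown here), both forms of the last item normalize to the same string:

```python
# Assumption: downstream cleanup behaves like str.strip().
with_newline = "2. This is line two.\n"
without_newline = "2. This is line two."

normalized = {with_newline.strip(), without_newline.strip()}
print(normalized)
```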