community[patch]: Force opt-in for WebResearchRetriever (CVE-2024-3095) (#24451)
This PR addresses the issue raised by CVE-2024-3095:
https://huntr.com/bounties/e62d4895-2901-405b-9559-38276b6a5273
Unfortunately, we didn't do a good job writing the initial report. It
points at both the wrong package and the wrong code.
The affected code is the WebResearchRetriever, not the AsyncHTMLLoader, and the
WebResearchRetriever lives in langchain-community.
The vulnerable code lives here:
https://github.com/langchain-ai/langchain/blob/0bd3f4e1292c085f22bef1fff16059851e11d042/libs/community/langchain_community/retrievers/web_research.py#L233-L233
This PR adds a forced opt-in for users to make sure they are aware of
the risk and can mitigate by configuring a proxy:
https://github.com/langchain-ai/langchain/blob/0bd3f4e1292c085f22bef1fff16059851e11d042/libs/community/langchain_community/retrievers/web_research.py#L84-L84
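
For reviewers unfamiliar with the pattern, here is a minimal, self-contained sketch of what a forced opt-in can look like. It is not the code in this PR: `GuardedRetriever` is a hypothetical stand-in for the retriever, and the parameter names `allow_dangerous_requests` and `proxies`, as well as the proxy URL, are assumptions made for illustration.

```python
from typing import Dict, Optional


class GuardedRetriever:
    """Hypothetical stand-in for a retriever that fetches arbitrary URLs."""

    def __init__(
        self,
        allow_dangerous_requests: bool = False,  # assumed flag name
        proxies: Optional[Dict[str, str]] = None,  # assumed parameter name
    ) -> None:
        # Refuse to construct unless the caller explicitly acknowledges the risk.
        if not allow_dangerous_requests:
            raise ValueError(
                "This retriever can request arbitrary URLs, including internal "
                "ones. Pass allow_dangerous_requests=True to acknowledge the "
                "risk, and consider routing requests through a proxy."
            )
        # Kept so that outgoing requests could be routed through a proxy.
        self.proxies = proxies


# Default construction fails, so the risk cannot be overlooked.
try:
    GuardedRetriever()
except ValueError as exc:
    print(f"refused: {exc}")

# Explicit opt-in, with a placeholder proxy as the mitigation.
retriever = GuardedRetriever(
    allow_dangerous_requests=True,
    proxies={"https": "http://proxy.internal.example:3128"},
)
```

Defaulting the flag to `False` makes the safe path the default and turns the risk into a construction-time error instead of a silent runtime behavior.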