langchain
71e8eaff - UnstructuredURLLoader: allow url failures, keep processing (#1954)

Commit

3 years ago

UnstructuredURLLoader: allow url failures, keep processing (#1954)

By default, UnstructuredURLLoader now continues processing remaining `urls` if encountering an error for a particular url.

If failure of the entire loader is desired as was previously the case, use `continue_on_failure=False`.

E.g., this fails splendidly, courtesy of the 2nd url:

```
from langchain.document_loaders import UnstructuredURLLoader

urls = [
    "https://www.understandingwar.org/backgrounder/russian-offensive-campaign-assessment-february-8-2023",
    "https://doesnotexistithinkprobablynotverynotlikely.io",
    "https://www.understandingwar.org/backgrounder/russian-offensive-campaign-assessment-february-9-2023",
]
loader = UnstructuredURLLoader(urls=urls, continue_on_failure=False)
data = loader.load()
```

Issue: https://github.com/hwchase17/langchain/issues/1939
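The behavior described above can be illustrated with a minimal, self-contained sketch. This is not the langchain implementation: `load_all` and `fake_fetch` are hypothetical names introduced only for illustration, and the only thing taken from the commit message is the idea of a `continue_on_failure` flag that defaults to continuing past a failing url.

```python
import logging
from typing import Callable, List

logger = logging.getLogger(__name__)


def load_all(
    urls: List[str],
    fetch: Callable[[str], str],
    continue_on_failure: bool = True,
) -> List[str]:
    """Hypothetical helper (not langchain code): fetch each url in order.

    With continue_on_failure=True, a failing url is logged and skipped.
    With continue_on_failure=False, the first failure is re-raised.
    """
    results: List[str] = []
    for url in urls:
        try:
            results.append(fetch(url))
        except Exception as exc:
            if not continue_on_failure:
                raise
            logger.error("Error fetching %s, skipping: %s", url, exc)
    return results


def fake_fetch(url: str) -> str:
    """Stand-in for a real network fetch, so the sketch runs offline."""
    if "doesnotexist" in url:
        raise ConnectionError(f"cannot resolve {url}")
    return f"content of {url}"


urls = [
    "https://example.com/a",
    "https://doesnotexist.invalid",
    "https://example.com/b",
]

# Default: the bad url is skipped and the other two are still loaded.
docs = load_all(urls, fake_fetch)
```

Passing `continue_on_failure=False` to the same call re-raises the `ConnectionError` from the second url, which mirrors the "fails splendidly" case in the commit message.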

Author

cragwolfe

Parents

6598beac
