unstructured
66640f26 - fix: xml processing not escaped (#4034)

Commit
231 days ago
fix: xml processing not escaped (#4034)

`<?xml version="1.0"?>` does not get escaped when converting to HTML if it appears in a code block like this in the markdown file:

````
<?xml version="1.0"?>
<sparql xmlns="http://www.w3.org/2005/sparql-results#">
<head></head>
<boolean>true</boolean>
</sparql>
````

This causes the parser to throw an error like:

> AttributeError: 'lxml.etree._ProcessingInstruction' object has no attribute 'is_phrasing'

This PR processes the original md file and adds indentation to `<?xml version="1.0"?>` to force the XML code to be escaped when it is converted to HTML.

https://github.com/Unstructured-IO/unstructured/issues/3935
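
The preprocessing step described in the commit message could look roughly like the sketch below. It is not the code from the PR: the helper name `indent_xml_declarations`, the four-space indent, and the choice to indent every line that begins with `<?xml` (rather than only lines inside code blocks) are all assumptions made for illustration.

```python
import re

# Hypothetical helper, not taken from the actual patch.
# Matches `<?xml` only when it sits at the very start of a line.
_XML_PI_AT_LINE_START = re.compile(r"^<\?xml\b", re.MULTILINE)


def indent_xml_declarations(md_text: str, indent: str = "    ") -> str:
    """Prefix each line that starts with `<?xml` with `indent`.

    Lines that are already indented do not match, so running the
    function twice gives the same result as running it once.
    """
    return _XML_PI_AT_LINE_START.sub(lambda m: indent + m.group(0), md_text)


if __name__ == "__main__":
    sample = (
        "````\n"
        '<?xml version="1.0"?>\n'
        '<sparql xmlns="http://www.w3.org/2005/sparql-results#">\n'
        "</sparql>\n"
        "````\n"
    )
    print(indent_xml_declarations(sample))
```

The text returned by the helper would then be passed to the markdown-to-HTML conversion in place of the original file contents.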