Web Search: Playwright, spatial parsing, markdown (#1094)
* feat: playwright, spatial parsing, markdown for web search
Co-authored-by: Aaditya Sahay <aadityasahay1@gmail.com>
* feat: choose multiple clusters if necessary (#2)
* chore: resolve linting failures
* feat: improve parsing performance and error messages
* feat: combine embeddable chunks together on cpu
* feat: reduce parsed pages from 10 to 8
* feat: disable javascript in playwright by default
* feat: embedding and parsing error messages
* feat: move isURL, fix type errors, misc
* feat: misc cleanup
* feat: change serializedHtmlElement to interface
* fix: isUrl filename
* fix: add playwright dependencies to docker
* feat: add playwright browsers to docker image
* feat: enable javascript by default
* feat: remove error message from console on failed page
---------
Co-authored-by: Aaditya Sahay <aadityasahay1@gmail.com>
Co-authored-by: Aaditya Sahay <56438732+Aaditya-Sahay@users.noreply.github.com>