feat: Index text dataset contents to enable full-text search with DuckDB (#1296)
* Draft files
* Adding duckdb index job runner
* Fix style
* WIP adding fts on API
* Remove unused code
* Fix style
* Adding chart objects
* Rollback dependency in API
* Depend on parquet and split
* Fix libcommon test
* Send index file to dedicated branch
* Fix test in first parquet
* Fix merge changes
* Fix poetry files
* Adding happy path test
* Adding other test scenarios
* Adding chart configuration
* Apply suggestions from code review
Co-authored-by: Sylvain Lesage <sylvain.lesage@huggingface.co>
* Change ParquetFileItem to SplitHubFile
* Inherit from SplitCachedJobRunner
* Fix style
* Depend on info features instead of parquet schema
* Fix libcommon test
* Apply code review suggestions
* Some details
* Fix style
* Fix test
* Apply code review suggestions
* Update chart/values.yaml
Co-authored-by: Sylvain Lesage <sylvain.lesage@huggingface.co>
* Apply suggestions from code review
Co-authored-by: Sylvain Lesage <sylvain.lesage@huggingface.co>
* Apply code review suggestions
* [docs] Improvements (#1376)
* add end-to-end example
* apply feedback
* Fix closing brackets and GH action link (#1389)
* Fix typo in error message (#1391)
* Add docker internal to extra_hosts (#1390)
* fix: 🐛 support bigger images (#1387)
* fix: 🐛 support bigger images
fixes https://github.com/huggingface/datasets-server/issues/1361
* style: 💄 fix style
* style: 💄 add types for Pillow
* Rename dev to staging, and use staging mongodb cluster (#1383)
* chore: 🤖 remove makefile targets
since we use ArgoCD now
* feat: 🎸 align dev on prod, and use secret for mongo url
* feat: 🎸 rename dev to staging
* ci: 🎡 change dev to staging in ci
* feat: 🎸 10x the size of supported images (#1392)
* Fix exception
* Fix test in libcommon
* Apply some code review suggestions
* Apply code review suggestions
* Adding close connection
* Upgrade duckdb version
* Apply code review suggestions
* Fix style
* Adding some test cases
* Remove code duplicated by merge
* Fix imports
* Apply code review suggestions
* Apply suggestions from code review
Co-authored-by: Sylvain Lesage <sylvain.lesage@huggingface.co>
* Add test
---------
Co-authored-by: Sylvain Lesage <sylvain.lesage@huggingface.co>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: Bas Krahmer <baskrahmer@gmail.com>
Co-authored-by: Albert Villanova del Moral <8515462+albertvillanova@users.noreply.github.com>