feat: custom fallback for language detection (#4238)
Closes #4091
Implements a custom fallback for language detection so that short text is
no longer forced to English and callers can control or disable detection.
## Changes:
- language_fallback
Optional callable used when text is short (fewer than 5 words) and
ASCII-only. It receives the text and returns either a list of ISO 639-3
codes or None to leave the language unspecified. If not provided, short
text still defaults to ["eng"] (backward compatible).
- detect_languages() / apply_lang_metadata()
New parameter language_fallback; applied in the short-text path only.
- partition() (auto)
New parameter language_fallback; passed through to all partitioners via
the metadata decorator.
- partition_md()
New parameter languages so callers can pass languages=[""] to disable
language detection (aligned with other partitioners).
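
The short-text rule described above can be modeled roughly as follows. This is only an illustrative sketch, not the code in this change: the helper names `is_short_ascii` and `short_text_languages` are hypothetical, and the word count is assumed to be a simple whitespace split.

```python
from typing import Callable, List, Optional

# Signature described in this change: text in, ISO 639-3 codes (or None) out.
LanguageFallback = Callable[[str], Optional[List[str]]]


def is_short_ascii(text: str, max_words: int = 5) -> bool:
    """Hypothetical check: fewer than `max_words` words and ASCII-only."""
    return len(text.split()) < max_words and text.isascii()


def short_text_languages(
    text: str, language_fallback: Optional[LanguageFallback] = None
) -> Optional[List[str]]:
    """Hypothetical model of the short-text path only.

    Without a fallback, the backward-compatible default ["eng"] is used.
    With a fallback, its result is returned as-is, including None.
    """
    if language_fallback is None:
        return ["eng"]
    return language_fallback(text)
```

Longer or non-ASCII text is outside this sketch; per the notes above, the fallback applies in the short-text path only.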
## Usage:
- Leave the language unspecified for short text:
partition(..., language_fallback=lambda text: None)
- Custom short-text language:
partition(..., language_fallback=my_detector)
- Disable detection:
partition_md(..., languages=[""]) or partition(..., languages=[""])
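
As a rough illustration of what a custom fallback such as `my_detector` might look like, here is a minimal keyword-based sketch. The hint table and the matching logic are assumptions made for the example; only the contract (text in, list of ISO 639-3 codes or None out) comes from the description above.

```python
from typing import List, Optional

# Hypothetical hint words per ISO 639-3 code; purely illustrative.
_HINTS = {
    "spa": {"hola", "gracias", "adios"},
    "fra": {"bonjour", "merci"},
    "deu": {"danke", "hallo"},
}


def my_detector(text: str) -> Optional[List[str]]:
    """Return a language guess for short text, or None if nothing matches."""
    # Normalize: split on whitespace, drop common punctuation, lowercase.
    words = {w.strip(".,!?").lower() for w in text.split()}
    for code, hints in _HINTS.items():
        if words & hints:
            return [code]
    # No hint matched: leave the language unspecified instead of forcing "eng".
    return None


# Usage, with the file name as a placeholder:
# elements = partition(filename="notes.txt", language_fallback=my_detector)
```

Returning None when no hint matches keeps the fallback from reintroducing a forced default.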