transformers
24845aeb - Layoutlmv2 tesseractconfig (#17733)

Commit
3 years ago
* Added option for users to modify the config parameter used by pytesseract during feature extraction
  - Added an optional 'tess_config' kwarg when setting up the LayoutLMv2 processor; it is passed to pytesseract during feature extraction
  - E.g. it can be used to modify PSM values by setting tess_config to '--psm 7'
  - Different PSM values significantly influence the output of LayoutLMv2

* Update src/transformers/models/layoutlmv2/feature_extraction_layoutlmv2.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/layoutlmv2/feature_extraction_layoutlmv2.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Updated variable names to be more explicit

* Fixed styles

* Added option for users to modify the config parameter when calling pytesseract during feature extraction
  - Added option to set the "tesseract_config" parameter during LayoutLMv3 processor initialization
  - Can be used to modify PSM values, e.g. by setting tesseract_config="--psm 6"

* Removed from function signature

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
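
As a rough illustration of what this commit message describes, the sketch below shows how a Tesseract config string such as "--psm 6" might be built and threaded through to the OCR call. It is not the code from this commit: `build_psm_config`, `ocr_words`, and the `ocr_fn` parameter are hypothetical names introduced here for illustration. The only real library call assumed is `pytesseract.image_to_data`, which accepts a `config` string.

```python
from typing import Any, Callable, List, Optional


def build_psm_config(psm: int, extra: str = "") -> str:
    """Build a Tesseract config string, e.g. '--psm 6' (hypothetical helper).

    Tesseract defines page segmentation modes 0 through 13.
    """
    if not 0 <= psm <= 13:
        raise ValueError(f"psm must be between 0 and 13, got {psm}")
    return f"--psm {psm} {extra}".strip()


def ocr_words(
    image: Any,
    lang: Optional[str] = None,
    tesseract_config: str = "",
    ocr_fn: Optional[Callable[..., dict]] = None,
) -> List[str]:
    """Run OCR and return the non-empty words (hypothetical helper).

    `ocr_fn` is an illustrative injection point so the function can be
    exercised without Tesseract installed; by default it falls back to
    pytesseract.image_to_data.
    """
    if ocr_fn is None:
        import pytesseract  # imported lazily; only needed for real OCR

        ocr_fn = pytesseract.image_to_data

    # The config string is forwarded unchanged to the OCR backend.
    data = ocr_fn(image, lang=lang, output_type="dict", config=tesseract_config)

    # Tesseract emits empty strings for layout-only rows; drop them.
    return [word for word in data["text"] if word.strip()]
```

With pytesseract and Tesseract installed, a call might look like `ocr_words(image, tesseract_config=build_psm_config(6))`. The commit message mentions both 'tess_config' and "tesseract_config" for the keyword; the sketch uses the latter, and the exact name accepted by the processor should be checked against the installed transformers version.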