OCR: Tesseract 4.1.1 / Ghostscript 9.54.0
With tesseract v4.0.0-beta.3 we often observe crashes with:

```
contains_unichar_id(unichar_id):Error:Assert failed:in file ../../src/ccutil/unicharset.h, line 511
```

This seems to have been fixed by https://github.com/tesseract-ocr/tesseract/pull/1954

Still, even after updating to 4.1.1, text recognition from PDF in ERP5 is too expensive.

We also update Ghostscript to 9.54.0, because this version has built-in OCR, which does not need to convert the PDF to PNG then TIFF as we currently do in ERP5.

See merge request !985
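For illustration only, here is a minimal sketch of how the built-in OCR could be called directly on a PDF, skipping the intermediate PNG and TIFF files. It assumes a Ghostscript build with OCR support compiled in, the `ocr` output device, the `-sOCRLanguage` option, and Tesseract language data that Ghostscript can find. The helper names `build_gs_ocr_command` and `ocr_pdf` are hypothetical and are not the ERP5 integration.

```python
import subprocess
from typing import List


def build_gs_ocr_command(pdf_path: str, txt_path: str,
                         language: str = "eng", dpi: int = 300) -> List[str]:
    """Build a Ghostscript command line that OCRs a PDF straight to text.

    Assumption: the installed gs binary was built with OCR support, so the
    "ocr" device and the OCRLanguage option are available.
    """
    return [
        "gs",
        "-dBATCH", "-dNOPAUSE",           # run non-interactively and exit
        "-sDEVICE=ocr",                   # OCR device: writes recognized text
        f"-r{dpi}",                       # rendering resolution used for OCR
        f"-sOCRLanguage={language}",      # Tesseract language data to load
        f"-sOutputFile={txt_path}",
        pdf_path,
    ]


def ocr_pdf(pdf_path: str, txt_path: str, language: str = "eng") -> None:
    """Run Ghostscript on the PDF; raises CalledProcessError on failure."""
    subprocess.run(build_gs_ocr_command(pdf_path, txt_path, language), check=True)
```

Calling `ocr_pdf("scan.pdf", "scan.txt")` would then run a single Ghostscript process per document. Whether this is actually cheaper than the current pipeline would need to be measured.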