Jérome Perrin authored
By default, tesseract runs on 4 CPUs. This can be limited by setting OMP_THREAD_LIMIT=1, which makes it run on only one CPU (as documented at https://tesseract-ocr.github.io/tessdoc/FAQ.html).

In ERP5, we tend to use one zope node per CPU, so we don't want each of these zope nodes to spawn a process that runs on 4 CPUs.

In a quick benchmark, disabling threads is not slower, and is even a bit faster:

    ## a big image in France (a picture of an invoice)

    $ time ./bin/tesseract /tmp/input.tiff /tmp/out.txt
    Tesseract Open Source OCR Engine v4.1.1 with Leptonica
    Page 1
    Error in pixClipBoxToForeground: box not within image
    Error in pixClipBoxToForeground: box not within image

    ________________________________________________________
    Executed in   14.41 secs    fish           external
       usr time   27.88 secs  1002.00 micros   27.88 secs
       sys time    0.74 secs     0.00 micros    0.74 secs

    $ time OMP_THREAD_LIMIT=1 ./bin/tesseract /tmp/input.tiff /tmp/out.txt
    Tesseract Open Source OCR Engine v4.1.1 with Leptonica
    Page 1
    Error in pixClipBoxToForeground: box not within image
    Error in pixClipBoxToForeground: box not within image

    ________________________________________________________
    Executed in   12.58 secs    fish           external
       usr time   11.84 secs   955.00 micros   11.84 secs
       sys time    0.52 secs   503.00 micros    0.52 secs

    ## a small Japanese image

    $ time ./tesseract -l jpn+eng /tmp/inputjp.tiff /tmp/out.txt
    Tesseract Open Source OCR Engine v4.1.1 with Leptonica
    Page 1

    ________________________________________________________
    Executed in    2.16 secs    fish           external
       usr time    3.77 secs   590.00 micros    3.77 secs
       sys time    0.27 secs   209.00 micros    0.27 secs

    $ time OMP_THREAD_LIMIT=1 ./tesseract -l jpn+eng /tmp/inputjp.tiff /tmp/out.txt
    Tesseract Open Source OCR Engine v4.1.1 with Leptonica
    Page 1

    ________________________________________________________
    Executed in    2.02 secs    fish           external
       usr time 1766.07 millis 1437.00 micros 1764.63 millis
       sys time  214.06 millis  522.00 micros  213.54 millis
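
For illustration only, here is a minimal Python sketch of how a caller might apply the same limit when spawning tesseract as a subprocess. It is not the code changed by this commit: the helper names `single_thread_env` and `run_tesseract`, and the default `executable="tesseract"`, are hypothetical. Only the `OMP_THREAD_LIMIT` variable and the `tesseract IMAGE OUTPUTBASE -l LANG` command form come from the message above.

```python
import os
import subprocess


def single_thread_env(base_env=None):
    """Return a copy of the environment with OpenMP limited to one thread.

    The input mapping is not modified. (Hypothetical helper name.)
    """
    env = dict(os.environ if base_env is None else base_env)
    env["OMP_THREAD_LIMIT"] = "1"
    return env


def run_tesseract(input_path, output_base, languages=None, executable="tesseract"):
    """Run tesseract on one image with threading disabled.

    `executable` is an assumption; point it at the real binary,
    for example ./bin/tesseract as in the benchmark.
    """
    command = [executable, input_path, output_base]
    if languages:
        # Same form as `-l jpn+eng` in the benchmark.
        command += ["-l", languages]
    return subprocess.run(
        command,
        env=single_thread_env(),
        check=True,
        capture_output=True,
        text=True,
    )
```

With this approach, each zope node that calls something like `run_tesseract("/tmp/input.tiff", "/tmp/out")` would start a tesseract process confined to a single thread, without changing the environment of the calling process itself.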
d74981c3