Commit 1b291415 authored by Jérome Perrin

component/ghostscript: Workaround for slaprunner paths with double slashes

The tessdata path is embedded in the C++ code by preprocessor macros:
https://github.com/ArtifexSoftware/ghostpdl/blob/gs9.54.0/base/tessocr.cpp#L188-L193
Since // starts a comment in C++ and, as documented in
https://gcc.gnu.org/onlinedocs/cpp/Stringizing.html, "Comments are replaced by
whitespace long before stringizing happens, so they never appear in stringized
text", the STRINGIFY/STRINGIFY2 approach of embedding a path does not work
when the path contains //: everything after the // is treated as a comment
and dropped. This causes errors like the following when using ghostscript
with OCR in webrunner:

    $ strace -e open -o open.strace /srv/slapgrid/slappart42/srv/runner/shared/ghostscript/4387fe7a8d2034ac5691d43b58134248/bin/gs -sDEVICE=ocr
    GPL Ghostscript 9.54.0 (2021-03-30)
    Copyright (C) 2021 Artifex Software, Inc.  All rights reserved.
    This software is supplied under the GNU AGPLv3 and comes with NO WARRANTY:
    see the file COPYING for details.
    Error opening data file ./eng.traineddata
    Please make sure the TESSDATA_PREFIX environment variable is set to your "tessdata" directory.
    Failed loading language 'eng'
    Tesseract couldn't load any languages!
    **** Unable to open the initial device, quitting.
    $ grep eng open.strace
    open("./eng.traineddata", O_RDONLY)     = -1 ENOENT (No such file or directory)
    open("/srv/slapgrid/slappart42/srv/eng.traineddata", O_RDONLY) = -1 ENOENT (No such file or directory)
    open("eng.traineddata", O_RDONLY)       = -1 ENOENT (No such file or directory)

eng.traineddata is looked up in /srv/slapgrid/slappart42/srv/ because
ghostscript was configured with:

    --with-tessdata=/srv/slapgrid/slappart42/srv//runner//shared/ghostscript/4387fe7a8d2034ac5691d43b58134248/share/tessdata/

and everything from the first // onward was stripped.
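
The truncation can be reproduced outside the compiler. The following is a minimal Python sketch, not the preprocessor itself: the helper `strip_line_comment` is hypothetical and only mimics how a // comment removes the rest of the line before stringizing happens. The sample path is the one from the configure option above.

```python
def strip_line_comment(macro_body: str) -> str:
    """Mimic comment removal: drop everything from the first // onward."""
    head, _sep, _comment = macro_body.partition("//")
    return head.rstrip()


# Path passed to --with-tessdata in the failing build.
configured = (
    "/srv/slapgrid/slappart42/srv//runner//shared/ghostscript/"
    "4387fe7a8d2034ac5691d43b58134248/share/tessdata/"
)

# What is left once the "comment" is gone.
embedded = strip_line_comment(configured)
print(embedded)
print(embedded + "/eng.traineddata")
```

The second printed path is the one that shows up in the strace output, which is consistent with the explanation that only the part before the first // reaches the compiled binary.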

This was reported upstream as https://bugs.ghostscript.com/show_bug.cgi?id=703905

More about the // in slaprunner paths can be found in commit eb544196
(slparunner: document the reasons why we keep srv//slaprunner, 2019-10-10)
parent b137c0ca
@@ -17,6 +17,7 @@ shared = true
 url = https://github.com/ArtifexSoftware/ghostpdl-downloads/releases/download/gs9540/ghostscript-9.54.0.tar.gz
 md5sum = 5d571792a8eb826c9f618fb69918d9fc
 pkg_config_depends = ${libtiff:location}/lib/pkgconfig:${libjpeg:location}/lib/pkgconfig:${fontconfig:location}/lib/pkgconfig:${fontconfig:pkg_config_depends}
+# XXX --with-tessdata work arounds a slaprunner bug of having softwares installed in a path containing //
 configure-options =
   --disable-cups
   --disable-threadsafe
@@ -24,7 +25,7 @@ configure-options =
   --without-libidn
   --without-x
   --with-drivers=FILES
-  --with-tessdata=${:tessdata-location}
+  --with-tessdata=$(python -c 'print("""${:tessdata-location}""".replace("//", "/"))')
 environment =
   PATH=${pkgconfig:location}/bin:${xz-utils:location}/bin:%(PATH)s
   PKG_CONFIG_PATH=${:pkg_config_depends}
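
The expression passed to `python -c` in the new `--with-tessdata` option can be tried on its own. Below is a minimal sketch that uses the path from the commit message as sample input. `workaround` wraps the same `replace` call as the diff; `collapse_slashes` is a hypothetical stricter variant that is not part of this commit, shown only because a single `str.replace` pass leaves a // behind when the input contains three or more consecutive slashes.

```python
import re


def workaround(path: str) -> str:
    # Same expression as in the configure option of this commit.
    return path.replace("//", "/")


def collapse_slashes(path: str) -> str:
    # Hypothetical variant (not in the commit): collapse any run of slashes.
    return re.sub(r"/{2,}", "/", path)


sample = (
    "/srv/slapgrid/slappart42/srv//runner//shared/ghostscript/"
    "4387fe7a8d2034ac5691d43b58134248/share/tessdata/"
)

print(workaround(sample))          # no // left for this input
print(workaround("/a///b"))        # a // survives a single replace pass
print(collapse_slashes("/a///b"))  # all runs of slashes collapsed
```

For the slaprunner paths described in the commit message, which contain only double slashes, the plain `replace` is sufficient.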