Mail Archives: cygwin/2020/05/18/00:45:40

Subject: Volunteer testers for OCRmyPDF install instructions under Cygwin?
To: cygwin AT cygwin DOT com
Message-ID: <d8f6cb30-ebeb-fcbe-d15e-b34702a65359@jhmg.net>
Date: Sun, 17 May 2020 21:44:41 -0700
From: Jim Garrison via Cygwin <cygwin AT cygwin DOT com>
Reply-To: jhg AT acm DOT org
Cc: Jim Garrison <jhg AT jhmg DOT net>
Sender: "Cygwin" <cygwin-bounces AT cygwin DOT com>

OCRmyPDF is a command-line utility that takes image-only PDFs,
performs OCR, and adds a text layer to the PDF so that it can be
searched.  It is written in Python and C++, and on Linux it is
installed via the Python 'pip' installer.

I tried installing it under Cygwin64 but ran into a compiler error
while building a dependency, pikepdf.  This turned out to be fixable
by a single CFLAGS change (from -std=c++14 to -std=gnu++14), which the
maintainer of pikepdf (and OCRmyPDF) graciously fast-tracked.

The instructions for installing under Cygwin are:

1. Install the following Cygwin packages:

        python36 (or later)
        python3?-devel
        python3?-pip
        python3?-lxml

     (where 3? means match the version of python3 you installed)

        gcc-g++
        ghostscript
        libexempi3
        libexempi-devel
        libffi6
        libffi-devel
        pngquant
        qpdf
        libqpdf-devel
        tesseract-ocr
        tesseract-ocr-devel

2. In a terminal, run the following commands:

        pip3 install wheel
        pip3 install ocrmypdf

    Note: You may get a warning about the version of pip that came
    with Cygwin being out of date.  It is not required, but if you want
    you can update pip to the latest version with

        pip3 install --upgrade pip

    If you do this, note that the command name will then be just
    'pip' instead of 'pip3'.
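
As a rough way to check the result of the two steps above, the following
sketch reports which expected commands are not on the PATH.  The helper
name missing_tools is hypothetical, and the command names (gs for
ghostscript, plus tesseract, qpdf, pngquant and ocrmypdf) are assumptions
about what the listed packages and pip install; adjust them if your
install differs.

```shell
#!/bin/sh
# missing_tools (hypothetical helper): print each given command that is
# not found on PATH, one per line; return non-zero if any is missing.
missing_tools() {
    rc=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            printf '%s\n' "$tool"
            rc=1
        fi
    done
    return "$rc"
}

# Command names below are assumptions, not taken from the package docs.
if missing_tools gs tesseract qpdf pngquant ocrmypdf; then
    echo "all tools found"
    ocrmypdf --version
else
    echo "the commands listed above were not found on PATH" >&2
fi
```

If anything is listed, re-run the Cygwin setup program and confirm the
corresponding package from step 1 was actually selected.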

There is one optional dependency, "unpaper", that is currently not
available under Cygwin.  Without it, certain options such as --clean
will produce an error message.  However, the OCR-to-text-layer
functionality still works.  I'll take a look at building a Cygwin
version of unpaper.
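
Until unpaper is available, a wrapper script could add --clean only when
the unpaper command is present.  This is a minimal sketch, not something
I have in my own setup: the function name clean_flag_if_available and
the file names input.pdf and output.pdf are placeholders, and the
optional argument exists only so the check can be tried with some other
command name.

```shell
#!/bin/sh
# clean_flag_if_available (hypothetical helper): print "--clean" when the
# named command (default: unpaper) is on PATH, otherwise print nothing.
clean_flag_if_available() {
    tool=${1:-unpaper}
    if command -v "$tool" >/dev/null 2>&1; then
        printf '%s\n' "--clean"
    fi
}

# Show the command that would be run; drop the leading "echo" to run it.
# The substitution is deliberately unquoted so that an empty result adds
# no argument at all.
echo ocrmypdf $(clean_flag_if_available) input.pdf output.pdf
```

Under Cygwin today this prints the command without --clean, so the plain
OCR run goes ahead instead of stopping with an error.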

I've tried this in a clean, minimal Cygwin install but would like to
get confirmation from a few other people before submitting this to the
OCRmyPDF maintainer for inclusion in their install instructions.

Is there anyone with interest in OCRmyPDF willing to try these
instructions and report back?  Off-list is fine if that would be off-
topic here.

Thanks

--
Jim Garrison jhg AT acm DOT org