Conversion

Paulo Ney de Souza pauloney at gmail.com
Fri May 6 17:55:47 CEST 2022


Dear David,

It would work better if you started from the source that generated
the PDF, for example: TeX, Word, InDesign, ...

You mention, towards the end of your short sentence, that the
paper includes a "source document". What is the format of this
source? That may be your best option.

But if you do NOT have the sources that produced it, the next-best
option is to start with a PDF text extractor.

There are tons of them for Windows, macOS and Linux, and even
online apps like:

     https://www.pdftext.net/

I use "pdftotext", part of the popular Poppler suite.
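
As a minimal sketch of how such an extraction could be scripted, assuming
Poppler is installed and "pdftotext" is on your PATH: the helper names and
the file names "paper.pdf" and "paper.txt" below are only placeholders for
illustration, not anything taken from your document.

```python
import shutil
import subprocess
from pathlib import Path


def build_pdftotext_command(pdf_path, txt_path, keep_layout=True):
    """Return the argument list for a pdftotext call (hypothetical helper)."""
    command = ["pdftotext"]
    if keep_layout:
        # -layout asks pdftotext to keep the physical page layout,
        # which often helps with columns and tables.
        command.append("-layout")
    command += [str(pdf_path), str(txt_path)]
    return command


def extract_text(pdf_path, txt_path, keep_layout=True):
    """Run pdftotext and return the extracted text as a string."""
    if shutil.which("pdftotext") is None:
        raise RuntimeError("pdftotext not found; install Poppler first")
    command = build_pdftotext_command(pdf_path, txt_path, keep_layout)
    # check=True raises CalledProcessError if pdftotext exits with an error.
    subprocess.run(command, check=True)
    return Path(txt_path).read_text(encoding="utf-8", errors="replace")


if __name__ == "__main__":
    # Placeholder file names; replace with your own.
    print(build_pdftotext_command("paper.pdf", "paper.txt"))
```

The extracted text will still need manual markup; superscripts and index
entries in particular usually come out as plain characters.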

That should work for the majority of PDF files out there, but not all.
The ones that are produced by scanners without an OCR pass will
not contain text, so there is nothing to extract that way. In this case
your option is to run it through OCR.
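
One rough way to tell whether a given PDF falls into the scanned category is
to run the extractor and see whether anything visible comes out. The sketch
below assumes "pdftotext" is on your PATH; the function names and the
20-character threshold are arbitrary choices for illustration.

```python
import subprocess


def looks_scanned(extracted_text, min_chars=20):
    """Guess whether a PDF lacks a text layer, based on its extracted text.

    Whitespace (including the form feeds pdftotext puts between pages)
    is ignored; min_chars is an arbitrary threshold.
    """
    visible = [ch for ch in extracted_text if not ch.isspace()]
    return len(visible) < min_chars


def pdf_text(pdf_path):
    """Return the text pdftotext extracts from pdf_path."""
    # Passing "-" as the output file sends the text to standard output.
    result = subprocess.run(
        ["pdftotext", str(pdf_path), "-"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

If looks_scanned(pdf_text("paper.pdf")) comes back True, the file most
likely holds only page images and will need an OCR pass first.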

Please give us more detail, and we will probably be able to help you better.

Best,
Paulo Ney



On Fri, May 6, 2022 at 8:22 AM David Jonah via texhax <texhax at tug.org>
wrote:

> I want to convert a .pdf document to a LaTeX document. The paper has
> superscripts, an index, and a source document.