r/LaTeX 9d ago

PDF to LaTeX conversion

What are some good PDF to LaTeX conversion programs that can be downloaded so the conversion doesn’t have to be done online? I can pay for the software.

18 Upvotes

31

u/BooklessLibrarian 9d ago

The simplest method would be to try opening the PDF in Word, then saving it as a .docx. Then use pandoc to convert the .docx to a .tex file.

That said, you will still probably need to edit the file. As far as I know, there aren't any good tools that do what you're looking for without you needing to do some editing.
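For the pandoc step, here is a rough sketch of what it could look like when scripted locally. It assumes pandoc is installed and on your PATH; the function names (`build_pandoc_command`, `docx_to_tex`) and the file names are made up for illustration, not part of any library.

```python
import shutil
import subprocess
from pathlib import Path


def build_pandoc_command(docx_path, tex_path, media_dir=None):
    """Return the pandoc argument list for a .docx -> .tex conversion."""
    # --standalone produces a full document (preamble included), not a fragment.
    cmd = ["pandoc", str(docx_path), "--standalone", "-o", str(tex_path)]
    if media_dir is not None:
        # --extract-media writes images embedded in the .docx to this directory.
        cmd.append(f"--extract-media={media_dir}")
    return cmd


def docx_to_tex(docx_path, tex_path, media_dir=None):
    """Run pandoc locally; raises if pandoc is missing or the conversion fails."""
    if shutil.which("pandoc") is None:
        raise RuntimeError("pandoc not found on PATH")
    subprocess.run(build_pandoc_command(docx_path, tex_path, media_dir), check=True)
    return Path(tex_path)
```

Something like `docx_to_tex("paper.docx", "paper.tex", media_dir="media")` would then run entirely offline. The result will still need hand editing, as said above.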

5

u/Ophiochos 9d ago

That technically takes it online, since the conversion happens that way. If OP means ‘without internet access’, it won’t work. Acrobat Pro has an export-to-Word option, I think. If it lets you work offline, that’s the only one I know of, but as BooklessLibrarian [interesting username] says, export/conversion is rarely clean. It depends on whether you have footnotes, for instance (those won’t work). I would just work with the plaintext from copy-paste, tbh.
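If you go the copy-paste route, the main chore is escaping the characters LaTeX treats as special. A minimal sketch, assuming the pasted text is plain prose with blank lines between paragraphs; the helper names (`escape_latex`, `text_to_tex`) are invented for this example, and any math or footnotes in the PDF would still have to be redone by hand.

```python
# Map each LaTeX special character to an escaped form.
LATEX_ESCAPES = {
    "\\": r"\textbackslash{}",
    "&": r"\&",
    "%": r"\%",
    "$": r"\$",
    "#": r"\#",
    "_": r"\_",
    "{": r"\{",
    "}": r"\}",
    "~": r"\textasciitilde{}",
    "^": r"\textasciicircum{}",
}


def escape_latex(text):
    # One pass over the characters, so inserted backslashes are not re-escaped.
    return "".join(LATEX_ESCAPES.get(ch, ch) for ch in text)


def text_to_tex(text):
    """Wrap pasted plain text in a minimal article document."""
    # Blank lines separate paragraphs; line breaks inside a paragraph are joined.
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    body = "\n\n".join(escape_latex(" ".join(p.split())) for p in paragraphs)
    return (
        "\\documentclass{article}\n"
        "\\begin{document}\n"
        f"{body}\n"
        "\\end{document}\n"
    )
```

For example, `escape_latex("50% of $x_1")` gives `50\% of \$x\_1`.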

1

u/MasterOfLegendes 8d ago

Calibre has good PDF-to-docx conversion.

1

u/Ophiochos 8d ago

Last time I looked (a while ago), Calibre basically said ‘oh god, if you must’. PDF is a bugger to convert, for so many reasons…

1

u/MasterOfLegendes 8d ago

Try Google Drive. Open the PDF with Google Docs from Drive.

2

u/QBaseX 7d ago

LibreOffice can also sometimes open PDFs.

8

u/DevMahasen 9d ago

Stirling-PDF. It can be installed locally, but the installation is very involved, requiring Docker and a bunch of other dependencies.

6

u/xte2 8d ago

There is no real conversion, because PDF is a compiled output format. You might find "translators" that try to read the text and math and output LaTeX, and they might produce something closely similar, but nothing more. AFAIK there are only some online ML tools that read math and generate LaTeX to reproduce it, but that's it.

Also, PDFs contain graphs, for instance. Can you imagine how they could be "converted" to a "source form"?

You're asking for something technically unfeasible. There are various small local tools to manipulate PDFs, like pdftotext, pdftk, pdfimages and its younger brother pdfcpu, plus some GUIs like PDF Arranger and various pdftk frontends, and ImageMagick might be helpful sometimes. Some people have even made web-based UI compilations of such tools, like the new Stirling-PDF, but NONE can take a PDF and produce correct LaTeX that recreates the PDF with some changes.
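To make the "small local tools" point concrete, here is a hedged sketch that pulls the raw text and the embedded raster images out of a PDF with pdftotext and pdfimages. It assumes both Poppler tools are installed and on your PATH; the function names and file names are hypothetical. Note that pdfimages only extracts embedded bitmaps, so graphs drawn as vector graphics will not come out this way.

```python
import shutil
import subprocess


def pdftotext_command(pdf_path, txt_path):
    # -layout asks pdftotext to keep the physical page layout as best it can.
    return ["pdftotext", "-layout", str(pdf_path), str(txt_path)]


def pdfimages_command(pdf_path, prefix):
    # -png writes the extracted images as PNG files named <prefix>-NNN.png.
    return ["pdfimages", "-png", str(pdf_path), str(prefix)]


def extract_locally(pdf_path, txt_path, image_prefix):
    """Dump text and embedded images from a PDF without going online."""
    for tool in ("pdftotext", "pdfimages"):
        if shutil.which(tool) is None:
            raise RuntimeError(f"{tool} not found on PATH")
    subprocess.run(pdftotext_command(pdf_path, txt_path), check=True)
    subprocess.run(pdfimages_command(pdf_path, image_prefix), check=True)
```

This gets you raw material to rebuild a .tex file from, not LaTeX itself, which is consistent with the point above.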

2

u/AstroBullivant 8d ago edited 8d ago

I should have said translation instead of conversion. Images such as graphs in a PDF would simply be stored as image files like JPEGs, and the file paths would appear in the .tex file.
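As an illustration of what I mean, a translator could emit a figure environment that just points at each extracted image file. This is only a sketch: `figure_block` is a made-up helper, the paths are placeholders, and the resulting .tex file would need `\usepackage{graphicx}` in its preamble.

```python
from pathlib import Path


def figure_block(image_path, caption="", width=r"0.8\linewidth"):
    """Return a LaTeX figure environment that includes one image file."""
    # as_posix() gives forward slashes, which LaTeX accepts on every platform.
    path = Path(image_path).as_posix()
    lines = [
        r"\begin{figure}[htbp]",
        r"  \centering",
        rf"  \includegraphics[width={width}]{{{path}}}",
    ]
    if caption:
        lines.append(rf"  \caption{{{caption}}}")
    lines.append(r"\end{figure}")
    return "\n".join(lines)
```

Calling `figure_block("figures/graph-000.png", caption="A graph")` returns a five-line figure environment whose `\includegraphics` line holds the file path.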

3

u/Safe-Specialist3163 8d ago

https://www.sciaccess.net/en/InftyReader/ I have never used it, but you can try the trial mode, in which recognition is limited to one page at a time and 5 pages per day.

2

u/abhunia 8d ago

Use NotebookLM.

2

u/time_integral 6d ago

If it's not a huge document, just use screenshots and ChatGPT.

1

u/Mr_Misserable 7d ago

Mathpix is a pretty good alternative. It gets results, but you will have to edit the output, since it's a mix of LaTeX and Markdown (they call it notes). You can open it in Overleaf, though.

The catch is that you won't get anything fancy.

-4

u/dubbel_G 9d ago

Mathpix or ChatGPT

2

u/AstroBullivant 9d ago

Both of those require uploading the file online.

5

u/dubbel_G 9d ago

Sorry, I skimmed through your question and missed that part.

-1

u/AstroBullivant 9d ago

I forgot to be more specific. PDF to .tex would be ideal.