[tex4ht] problem with slow compilation of large latex file with large math content

Karl Berry karl at freefriends.org
Sat Mar 26 23:44:29 CET 2016


Hi Nasser,

    TL is more optimized for native Linux vs. cygwin.

Just to remark: TL specifically is not "optimized" for any particular
platform (all binaries are built natively).  I think the difference you
are seeing here is an inevitable consequence of running a
resource-intensive job on an emulation layer (cygwin) vs. a native layer
(gnu/linux).  (As for native Windows, it is fundamentally inefficient,
so I'm not surprised it is slow too.  Cygwin or vbox is thus the worst
of both worlds.)

    buying new PC and installing Linux on it just in the hope

Wow.  I suspect you are the only person in the world buying hardware to
placate tex4ht!

    then it starts to slow down, the higher the number becomes

I think we need some kind of profiling of the TeX run to find the facts.
I don't have an easy recipe at hand.  (And I'm currently trying to get
the next TUGboat out the door, plus prepare for the TL pretest, so it's
going to be hard to devote significant time to this for a while,
unfortunately ...)
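
Short of real profiling, a coarse first step might be to time each stage separately to see which one dominates. The following is only a sketch under assumptions: `time_stage` is a hypothetical helper name, the one-second resolution comes from `date +%s`, and it measures wall-clock time per command, not where TeX spends time internally.

```shell
#!/bin/sh
# Sketch: coarse wall-clock timing per stage; this is not a TeX profiler.
# time_stage is a hypothetical helper, not part of tex4ht or TeX Live.
time_stage() {
    label=$1; shift
    start=$(date +%s)          # seconds since the epoch, before the command
    "$@"                       # run the stage exactly as given
    status=$?
    end=$(date +%s)
    # Report on stderr so the stage's own stdout is left untouched.
    echo "$label: $((end - start))s (exit $status)" >&2
    return "$status"
}

# Hypothetical usage (mydoc.tex is a placeholder name):
#   time_stage "latex pass" latex mydoc.tex
```

Wrapping each command of the conversion this way would at least show whether the time goes to the latex runs or to the later steps.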

    But the issue is, pdflatex and lualatex take about 5 minutes
    on the same file to compile it to pdf !

Ok, so let's consider PDF first, since that is simpler to think about
than HTML.

    I have many many latex files this large

Is the one you provided already one of the smaller ones?  The smaller
the file that still exhibits the problem, the easier to debug.

    given that lualatex takes one hr or so.

Oh, lualatex is another important part of the story.  LuaTeX is already
significantly slower than standard TeX (or XeTeX), and depending on what
your document is doing, we may be hitting some kind of new/unusual
slowdown that is specific to luatex.  Must you use luatex?

    Finally, is there a document that describes the passes/process
    that tex4ht uses to compile to HTML at some high level?

The htlatex script is six lines long, and is the clearest possible
summary of what is run.  I'll omit the TeX gobbledygook that Eitan
uses.

#!/bin/sh
        latex $5 ...
        latex $5 ...
        latex $5 ...
        tex4ht -f/$1  -i~/tex4ht.dir/texmf/tex4ht/ht-fonts/$3
        t4ht -f/$1 $4

I assume Michal's make4ht is fundamentally equivalent.

As CVR says, the reason for the three latex runs is simply to resolve
references.  Thus if you are repeatedly running the same doc, with all
aux files already in place, one run would suffice.
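
To illustrate the point about reference resolution, here is a minimal sketch that reruns latex only while the log still asks for it, up to a cap. The assumptions are mine, not htlatex's: `needs_rerun`, `compile_until_stable`, `LATEX_CMD` and `MAX_RUNS` are hypothetical names, the check relies on the standard LaTeX "Rerun to get cross-references right" warning (some packages word their rerun hints differently), and the plain `latex` call omits the extra arguments that htlatex passes.

```shell
#!/bin/sh
# Sketch: run latex only as many times as the log requests, up to a cap.
# LATEX_CMD and MAX_RUNS are hypothetical knobs, not htlatex options.

# Exit status 0 if the log contains LaTeX's standard rerun hint.
needs_rerun() {
    grep -q 'Rerun to get' "$1" 2>/dev/null
}

# Usage: compile_until_stable file.tex
# Prints the number of latex runs that were performed.
compile_until_stable() {
    src=$1
    # latex writes the log to the current directory, named after the job.
    log="$(basename "$src" .tex).log"
    runs=0
    while [ "$runs" -lt "${MAX_RUNS:-3}" ]; do
        ${LATEX_CMD:-latex} "$src" >/dev/null || return 1
        runs=$((runs + 1))
        # Stop as soon as the log no longer asks for another pass.
        needs_rerun "$log" || break
    done
    echo "$runs"
}
```

With the aux files already in place and no changed labels, this would stop after a single run; the cap of three mirrors the three runs in the script above.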


For sure, a design document, among many others, would be extremely
desirable, not to mention many updates to the code, not to mention a new
release, not to mention ...  What's fundamentally needed are more
volunteers with time and ability to help develop and document this
highly complex system!

karl

