Compressing formats
Hironori KITAGAWA
h_kitagawa2001 at yahoo.co.jp
Wed Dec 4 02:46:33 CET 2019
Hello all,
As latex-dev preloads expl3 (https://www.latex-project.org/news/2019/11/28/latex-dev-2020-2/),
the size of the format files has greatly increased:
# format   engine     (latex)     (latex-dev)    [sizes in bytes]
latex      (pdftex)   4291213  ->  8067042
pdflatex   (pdftex)   4291275  ->  8067104
platex     (eptex)    4557309  -> 10411263
uplatex    (euptex)   4553312  -> 10407071
xelatex    (xetex)    3765560  ->  4507996  (compressed by zlib)
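
As a quick check of the growth, the sizes in the table above can be compared with a few lines of Python. This is only an illustrative sketch: the variable names are made up here, and the numbers are simply copied from the table.

```python
# Format sizes in bytes, copied from the table: (latex, latex-dev).
sizes = {
    "latex": (4291213, 8067042),
    "pdflatex": (4291275, 8067104),
    "platex": (4557309, 10411263),
    "uplatex": (4553312, 10407071),
    "xelatex": (3765560, 4507996),  # already compressed by zlib
}

# Growth factor of each format once expl3 is preloaded.
growth = {name: new / old for name, (old, new) in sizes.items()}

for name, factor in growth.items():
    print(f"{name:9s} x{factor:.2f}")
```

The uncompressed formats roughly double in size, while the zlib-compressed xelatex format grows by a much smaller factor.
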
So I am experimenting with compressing formats
(of pTeX and friends, to begin with) using lz4(hc):
https://github.com/h-kitagawa/texlive-source/tree/lz4hc-fmt
In this experiment,
* platex-dev.fmt becomes 2861929 bytes (about 6.5MB smaller).
* processing "\documentclass{minimal}\begin{document}\end{document}" with platex
has almost no overhead (128 ms vs 134.8 ms).
A test result is located at
https://github.com/h-kitagawa/texlive-source/tree/lz4hc-fmt/texk/web2c/eptexdir/tests/comp-tests
----
I know that XeTeX and LuaTeX compress formats with zlib, and that
(e)(u)pTeX and pdfTeX are already linked with zlib (because of SyncTeX).
However, I chose lz4(hc) for its decompression speed.
* I also tested with zlib (-1) and lzo (1x_1). See the table at
https://github.com/texjporg/tex-jp-build/issues/96 (in Japanese) for details.
* Linking the lz4 library increases the binary size by about 100 KB,
but compression makes platex-dev.fmt about 6.5MB smaller (see above),
so the total is better.
* We can afford to spend some time (but not much!) on compression,
because formats are dumped relatively rarely.
I think a default lz4(hc) compression level of 5 (or 6?) is a good tradeoff between
compression ratio and compression time.
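
A tradeoff of this kind can be measured with a small script along the following lines. Note that this is a sketch under loud assumptions, not the code used in the experiment: it uses Python's standard zlib module as a stand-in (lz4 bindings are not part of the standard library), and the function name `measure_levels` and the synthetic payload are hypothetical.

```python
import time
import zlib


def measure_levels(data, levels):
    """Return {level: (compressed_size, compress_seconds, decompress_seconds)}."""
    results = {}
    for level in levels:
        t0 = time.perf_counter()
        packed = zlib.compress(data, level)
        t1 = time.perf_counter()
        unpacked = zlib.decompress(packed)
        t2 = time.perf_counter()
        assert unpacked == data  # the round trip must be lossless
        results[level] = (len(packed), t1 - t0, t2 - t1)
    return results


# Synthetic, repetitive payload standing in for a format file (assumption).
sample = b"\\documentclass{minimal}\\begin{document}\\end{document}\n" * 20000

if __name__ == "__main__":
    for level, (size, c_time, d_time) in measure_levels(sample, [1, 6, 9]).items():
        print(f"level {level}: {size} bytes, "
              f"compress {c_time * 1000:.1f} ms, "
              f"decompress {d_time * 1000:.1f} ms")
```

For a real comparison, `sample` would be replaced by the bytes of an actual .fmt file, and the zlib calls by the compressor under test.
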
--
Hironori KITAGAWA <h_kitagawa2001 at yahoo.co.jp>