git compresses texlive/texmf-dist/tex for 2010 to 2021 from 3.5G to 205M

Jonathan Fine jfine2358 at gmail.com
Wed Jun 30 19:40:11 CEST 2021


Hi

I've made a start on putting TeX Live into git. I've started with the
folder texlive/texmf-dist/tex for the years 2010 to 2021 inclusive. I cover
only the snapshots as released on the TeX Collection DVD. Many thanks to Hartmut
Henkel, whose real-time compressions during last week's TeX Hour inspired
me to do more.

The headline is that on disk this occupies about 3.5G, but in git it's
about 205M. In other words, the git repository is about 205 / 3500 = 6% of
the original size. By the way, the download time is similarly reduced.

Here's the 3.5G mentioned earlier:
$ du -s -c -h texlive20??/texmf-dist/tex
164M texlive2010/texmf-dist/tex
180M texlive2011/texmf-dist/tex
207M texlive2012/texmf-dist/tex
218M texlive2013/texmf-dist/tex
251M texlive2014/texmf-dist/tex
270M texlive2015/texmf-dist/tex
305M texlive2016/texmf-dist/tex
312M texlive2017/texmf-dist/tex
341M texlive2018/texmf-dist/tex
387M texlive2019/texmf-dist/tex
412M texlive2020/texmf-dist/tex
475M texlive2021/texmf-dist/tex
3.5G total
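
As a quick check on the headline figure, here is a small Python sketch that
adds up the per-year sizes from the du listing above and compares the total
with the 205M quoted for git. The names sizes_mb, total_mb and ratio_percent
are made up for this illustration; the numbers are copied from the listing.

```python
# Per-year sizes in M, copied from the du listing above.
sizes_mb = {
    2010: 164, 2011: 180, 2012: 207, 2013: 218,
    2014: 251, 2015: 270, 2016: 305, 2017: 312,
    2018: 341, 2019: 387, 2020: 412, 2021: 475,
}


def total_mb(sizes):
    """Sum of all yearly snapshot sizes, in M."""
    return sum(sizes.values())


def ratio_percent(packed_mb, unpacked_mb):
    """Size of the packed data as a percentage of the unpacked data."""
    return 100.0 * packed_mb / unpacked_mb


total = total_mb(sizes_mb)
print(f"on disk: {total}M")
print(f"git / disk: {ratio_percent(205, total):.1f}%")
```

The total comes to 3522M, which is consistent with the roughly 3.5G reported
by du, and the ratio is a little under 6%.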

You can see what I've got at
https://github.com/jfine2358/temp-texlive-texmf-dist-tex. I call it temp
because it's a rough first version, to be thrown away after I've learnt
from the experience.

Alternatively, you can do a bare clone, as follows. The --bare option is
needed unless you also want a 3.5G working tree containing multiple copies
of identical (or nearly identical?) files.

$ time git clone --bare git@github.com:jfine2358/temp-texlive-texmf-dist-tex.git
Cloning into bare repository 'temp-texlive-texmf-dist-tex.git'...
remote: Enumerating objects: 77116, done.
remote: Counting objects: 100% (1/1), done.
remote: Total 77116 (delta 0), reused 1 (delta 0), pack-reused 77115
Receiving objects: 100% (77116/77116), 202.08 MiB | 1.80 MiB/s, done.
Resolving deltas: 100% (47359/47359), done.

real 1m57.756s
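
Part of the saving comes from the way git names file contents. Each file is
stored as a "blob" whose id is the SHA-1 of a short header plus the content,
so a file that is byte-for-byte identical in several yearly snapshots is
stored only once. The Python sketch below computes blob ids in that way. The
file names and contents are hypothetical, chosen only to illustrate the idea;
the delta compression that git applies to nearly identical files inside a
pack is a further saving and is not shown here.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Return the id git assigns to a blob with this content.

    git hashes the header b"blob <length>\\0" followed by the raw bytes.
    """
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


# Hypothetical snapshot contents: example.sty is unchanged between years.
snapshots = {
    "texlive2010/texmf-dist/tex/example.sty": b"hello\n",
    "texlive2011/texmf-dist/tex/example.sty": b"hello\n",
    "texlive2011/texmf-dist/tex/other.sty": b"",
}

# Identical content gives an identical id, so it needs only one stored object.
unique_ids = {git_blob_id(data) for data in snapshots.values()}
print(len(snapshots), "files,", len(unique_ids), "distinct blobs")
```

For the three files above this reports two distinct blobs. The ids agree with
what git hash-object gives for the same content.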

Alternatively, you can download the 4 release files at
https://github.com/jfine2358/temp-texlive-texmf-dist-tex/releases/tag/texlive-2010-2021%2Ftexmf-dist%2Ftex

Here's the big 200M pack file; the other release files are quick to
download. Fetching it this way takes about half the time of git clone --bare.

$ time wget https://github.com/jfine2358/temp-texlive-texmf-dist-tex/releases/download/texlive-2010-2021%2Ftexmf-dist%2Ftex/pack-72ebac230848eb026fd1a9eac0765531afc5e322.pack

real 0m58.752s
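
The "twice the speed" comparison can be checked from the two "real" timings
shown above. This is a minimal Python sketch; the helper name to_seconds is
made up for the illustration. Note that the clone timing also includes
resolving deltas, so this compares whole commands, not raw network speed.

```python
def to_seconds(minutes, seconds):
    """Convert a 'real XmY.YYYs' timing to seconds."""
    return 60 * minutes + seconds


clone_s = to_seconds(1, 57.756)  # git clone --bare
wget_s = to_seconds(0, 58.752)   # wget of the pack file

speedup = clone_s / wget_s
print(f"clone: {clone_s:.3f}s, wget: {wget_s:.3f}s, ratio: {speedup:.2f}")
```

The ratio comes out very close to 2.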

Please explore, and comment if you wish.

with best regards

Jonathan


More information about the tex-live mailing list.