[tex-live] Stable vs. Unstable/Testing Update Repositories?

C.M. Connelly cmc at math.hmc.edu
Thu Feb 25 00:38:59 CET 2010


"MP" == Manuel Pégourié-Gonnard <mpg at elzevir.fr> writes:

    MP> We're currently discussing the options internally for
    MP> maintaining multiple releases without doubling or so the
    MP> space occupied by TeX Live on CTAN mirrors.  One of the
    MP> options discussed is using links (though we were thinking
    MP> symlinks rather than hardlinks), if the CTAN admins think
    MP> it can properly work with all mirrors.

There are a few key advantages of hard links over symlinks for
mirroring.

First, hard links use minimal space on the filesystem -- a symbolic
link is an actual file on the filesystem that takes up some amount
of space and contains a pointer to another file.  Hard links are
notations in the directory that one file *is* another file (they
use the same inode).  Rsync (with the -H flag) takes advantage of
the inodes being the same to make copying hard links quicker.
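
The inode difference described above can be checked directly.  The
following is a minimal sketch in Python, assuming a POSIX filesystem
that supports both kinds of link; the function name link_demo and
the file names are made up for illustration.

```python
import os
import tempfile

def link_demo(workdir):
    """Create one file plus a hard link and a symlink to it; return their stats."""
    original = os.path.join(workdir, "original.txt")   # hypothetical names
    hard = os.path.join(workdir, "hard.txt")
    soft = os.path.join(workdir, "soft.txt")

    with open(original, "w") as f:
        f.write("TeX Live\n")

    os.link(original, hard)      # second directory entry for the same inode
    os.symlink(original, soft)   # separate small file that stores a path

    # lstat describes the directory entry itself and does not follow symlinks
    return {
        "original": os.lstat(original),
        "hard": os.lstat(hard),
        "soft": os.lstat(soft),
    }

with tempfile.TemporaryDirectory() as d:
    stats = link_demo(d)
    same_inode = stats["original"].st_ino == stats["hard"].st_ino
    symlink_has_own_inode = stats["soft"].st_ino != stats["original"].st_ino
    link_count = stats["original"].st_nlink
    print("hard link shares inode:", same_inode)
    print("symlink has its own inode:", symlink_has_own_inode)
    print("link count on the file:", link_count)
```

The link count (st_nlink) is the number of directory entries that
point at the inode; the file's data is only freed when that count
drops to zero.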

Second, rsync (with the -H flag) will transparently replace a link
with a new, separate file if one of the previously linked files is
changed on the upstream copy.

Third, even without rsync's -H option or local filesystem support
for hard links, you can still copy the whole tree as separate
files (it will just take longer or take up more space on the local
disk).

For comparison, here are some disk-usage numbers from making copies
of my Linux (x86_64 and i386) install of TeX Live 2009:

  Original tree       2459.25 MiB
  Symlinked tree       637.35 MiB
  Hardlinked tree       53.37 MiB


There are tools for creating hardlinked trees quickly -- ``cp -al
tree1 tree2'' (with a GNU cp) will create a copy of directory
tree1 as tree2 with the directory structure intact, but the files
replaced with hard links.  I use this method to duplicate some
large software packages before updating them; for example, when
updating some toolboxes within a MATLAB install, I might duplicate
the existing install with hard links, then do the updates on the
duplicate tree.  When I'm happy with the new version, I can delete
the old version, and I'm left with a working tree.  Even better, I
can have multiple versions on the server at the same time, but use
way less space than completely separate installs would use (which
is very helpful when you have different licenses for the same
application that support different subsets of functionality).
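
As a rough illustration of the duplicate-then-update workflow, here
is a Python sketch of what ``cp -al'' does for the simple case of
directories and regular files.  It is not a replacement for cp: it
makes no attempt to preserve symlinks, directory permissions, or
timestamps, and the names hardlink_tree, toolbox, and version.txt
are hypothetical.

```python
import os
import tempfile

def hardlink_tree(src, dst):
    """Recreate src's directories under dst, hard-linking each file."""
    for dirpath, _dirnames, filenames in os.walk(src):
        rel = os.path.relpath(dirpath, src)
        target_dir = os.path.normpath(os.path.join(dst, rel))
        os.makedirs(target_dir, exist_ok=True)
        for name in filenames:
            os.link(os.path.join(dirpath, name),
                    os.path.join(target_dir, name))

with tempfile.TemporaryDirectory() as base:
    tree1 = os.path.join(base, "tree1")
    tree2 = os.path.join(base, "tree2")
    os.makedirs(os.path.join(tree1, "toolbox"))
    old_file = os.path.join(tree1, "toolbox", "version.txt")
    with open(old_file, "w") as f:
        f.write("old\n")

    hardlink_tree(tree1, tree2)
    new_file = os.path.join(tree2, "toolbox", "version.txt")
    shared_before = os.stat(old_file).st_ino == os.stat(new_file).st_ino

    # Update the duplicate by writing a fresh file and renaming it
    # into place, which breaks the link instead of changing tree1.
    tmp = new_file + ".tmp"
    with open(tmp, "w") as f:
        f.write("new\n")
    os.replace(tmp, new_file)

    with open(old_file) as f:
        old_content = f.read()
    with open(new_file) as f:
        new_content = f.read()
    shared_after = os.stat(old_file).st_ino == os.stat(new_file).st_ino
    print(shared_before, shared_after, repr(old_content), repr(new_content))
```

One caveat worth keeping in mind: this only keeps the old tree
intact if the update replaces files (write a new file, then rename
it over the old name).  An updater that opens an existing file and
overwrites it in place would change both trees, because both names
still refer to the same inode.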

Hardlinkpy, hardlink++, and similar tools will recursively scan a
directory for files that can be hardlinked.  It's common to run
these tools occasionally on mirrored data in case the upstream
mirror broke the links or never made them in the first place, but
the bandwidth savings only come if upstream uses hard links.[*]
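
To give a sense of what such a scan involves, here is a simplified
Python sketch.  It is not how hardlinkpy or hardlink++ are actually
implemented; it only compares file contents (by size and SHA-256
digest), assumes everything under the root is on one filesystem,
and ignores ownership, permissions, and timestamps, which a real
tool would likely want to consider.  The function names and the
example directory layout are hypothetical.

```python
import hashlib
import os
import stat
import tempfile
from collections import defaultdict

def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def link_duplicates(root):
    """Hard-link identical regular files under root.

    Returns an approximate count of bytes saved (it assumes each
    replaced duplicate had no other links of its own).
    """
    by_content = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            st = os.lstat(path)
            # Skip symlinks, special files, and empty files
            if not stat.S_ISREG(st.st_mode) or st.st_size == 0:
                continue
            by_content[(st.st_size, file_digest(path))].append(path)

    saved = 0
    for (size, _digest), paths in by_content.items():
        keep = paths[0]
        keep_ino = os.lstat(keep).st_ino
        for dup in paths[1:]:
            if os.lstat(dup).st_ino == keep_ino:
                continue  # already a link to the kept file
            # Link under a temporary name, then rename over the
            # duplicate so the path never disappears.
            tmp = dup + ".hardlink-tmp"
            os.link(keep, tmp)
            os.replace(tmp, dup)
            saved += size
    return saved

with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "2009"))
    os.makedirs(os.path.join(root, "2010"))
    a = os.path.join(root, "2009", "article.cls")
    b = os.path.join(root, "2010", "article.cls")
    c = os.path.join(root, "2010", "book.cls")
    for path, data in ((a, b"x" * 1000), (b, b"x" * 1000), (c, b"y" * 1000)):
        with open(path, "wb") as f:
            f.write(data)

    saved = link_duplicates(root)
    a_and_b_linked = os.stat(a).st_ino == os.stat(b).st_ino
    c_untouched = os.stat(c).st_ino != os.stat(a).st_ino
    print(saved, a_and_b_linked, c_untouched)
```

Hashing every file is wasteful on a large mirror; grouping by size
first and hashing only files whose sizes collide would be an
obvious refinement.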

   Claire

[*] Sadly, hardlinkpy can only find 190 MiB of potential savings
    in the CTAN mirror, but it would obviously help with multiple
    TeX Live trees.

*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*
  Claire Connelly                              cmc at math.hmc.edu
  System Administrator                           (909) 621-8754
  Department of Mathematics                 Harvey Mudd College
*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*

