[tex-live] Better ways to find packages and documentation

Florent Rougon f.rougon at free.fr
Fri Jul 6 18:58:27 CEST 2007

Reinhard Kotucha <reinhard.kotucha at web.de> wrote:

> Florent Rougon writes:
>  >>> 1) Can a given CTAN package be split among several TEXMF trees (in TL,
>  >>>    in MiKTeX, etc.)? Or rather, do we want to support that?
>  >>
>  >> No.
>  > Good. Then we can have the metadata be self-contained in each TEXMF
>  > tree, with relative paths from the base of the TEXMF tree. This is nice
>  > from a theoretical POV and allows natural extension for TEXMFLOCAL data.
> A package can be split among a TEXMF tree and several bin/<platform>
> trees.  There will certainly be one database for the whole system.

So you're contradicting Norbert here, which changes the infrastructure
from which the DB is built.

I'm getting a bit tired of this discussion (and I physically cannot type
too much anyway, and we've now gotten to the point where I have to slow
down). So, I believe I'll simply design my DB format myself and read the
DB files from /usr/share/progname, /usr/local/share/progname and
~/.progname/ on Unix.
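
A minimal sketch of that lookup, in Python. The directory order
(system-wide first, per-user last) and the "*.db" file pattern are
assumptions made for illustration; "progname" is the placeholder used
above.

```python
from pathlib import Path


def db_dirs(progname, home=None):
    """Return the directories to scan, from system-wide to per-user."""
    home = Path(home) if home is not None else Path.home()
    return [
        Path("/usr/share") / progname,
        Path("/usr/local/share") / progname,
        home / f".{progname}",
    ]


def find_db_files(progname, home=None, pattern="*.db"):
    """List DB files in the directories that exist.

    The "*.db" pattern is hypothetical; no file naming has been fixed.
    """
    found = []
    for directory in db_dirs(progname, home):
        if directory.is_dir():
            found.extend(sorted(directory.glob(pattern)))
    return found
```

A caller that wants per-user data to win would simply let later
entries override earlier ones while merging.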

When your file formats are ready and contain the data I asked for, I'll
write a bridge from yours to mine.

This will be better for several reasons, among which:
  - although having each TEXMF tree provide its own part of the DB is
    elegant, it reduces generality.

    For instance, some distro may choose to ship the .sty files for a
    package in a tree and the doc files in another tree, which in effect
    splits a CTAN package between two TEXMF trees, and therefore makes
    it either impossible or ugly (duplication of data) to have each
    TEXMF tree provide its own part of the DB.

  - directly reading the tlpdb files is ignoring MiKTeX and others. Of
    course, they can translate their data to tlpdb format, but:
      1) It's a bit ugly: some part of their data may not fit in the
         tlpdb canvas, or some fields of tlpdb may have no meaning for
         them.
      2) They will have to suffer from the constraints imposed by the
         tlpdb format, such as no space in file name, no double-quote in
         attribute values.

> When you mention TEXMFLOCAL I suppose that you have documentation in
> mind.  TeXLive itself will definitely not touch anything in TEXMFLOCAL
> or TEXMFHOME.  It would be nice if programs like texdoctk look for
> databases in TEXMFLOCAL and TEXMFHOME, too.  But IMO these databases
> have to be maintained by the users.

Since it appears that having each TEXMF tree ship its own part of the DB
is not desirable, things won't be in TEXMFLOCAL or TEXMFHOME, but in
/usr/local/share/progname and ~/.progname/. But the feature will be
there.

> I'm not against using a standard file format, but I think it shouldn't
> be much more verbose than Norbert's format.  XML is much too verbose
> and too difficult to parse.

XML is not meant to be parsed by hand.
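
For illustration, a minimal Python sketch using the standard library's
xml.etree.ElementTree. The element and attribute names (package, file,
kind, description) are made up for this example and do not come from any
proposed schema. The point is only that spaces in file names and double
quotes in values need no special handling on the consumer side.

```python
import xml.etree.ElementTree as ET

# Hypothetical record; the schema is invented for this example.
DOC = """\
<package name="geometry">
  <file kind="doc">doc/latex/geometry/my manual.pdf</file>
  <description>Uses "quoted" words freely</description>
</package>
"""

root = ET.fromstring(DOC)

# The library does the tokenizing; the caller only walks the tree.
package_name = root.get("name")
files = [f.text for f in root.findall("file")]
description = root.findtext("description")

print(package_name, files, description)
```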

> texlive.tlpdb is needed by the installer.  Writing an installer which
> works on all platforms is difficult enough, and I'm happy at least
> that Norbert's database can be parsed so easily without any extra
> tools.  And I don't want to depend too much on external tools.  There
> had been a nice Perl module for FTP access on CPAN a few years ago.
> But the author found that there are severe bugs in it and instead of
> fixing the bugs he simply removed it from CPAN.  If we are using a
> simple file format as proposed by Norbert, we can maintain the tools
> we need ourselves and avoid a lot of trouble.

If Perl doesn't have a module that can reliably parse XML using the
basic features (not even namespaces), then it must be a really crappy
language.

> I do not see any advantage using RFC 2822.  Parts of it are even
> completely braindead:


>    There are two limits that this standard places on the number of
>    characters in a line. Each line of characters MUST be no more than
>    998 characters, and SHOULD be no more than 78 characters, excluding
>    the CRLF.

This is indeed a bit ridiculous, and one of the many reasons why I
prefer XML. This is because contrary to XML, RFC-2822 is old (well, its
ancestors are) and wasn't designed as a general format for holding data.

> In which world are we living?  CRLF is required by mechanical teletype
> machines.  You must be quite old if you have ever seen such a beast.

Ya know, CRLF is required on MS Windows...

> This means to replace
>      <key> <value>
> by
>      <key>: <value>
> but I don't see the advantage.  And there is absolutely no good reason
> to limit the length of a string to such a ridiculous value.

The (small) advantage is being able to parse it not by hand, but by
using reliable standard libraries, for languages that provide such
libraries. And using a standard file format allows one to use tools such
as grep-dctrl on the DB files, which can be quite nice.
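
As a sketch of what "parsing with a standard library" could look like,
Python's email.parser reads colon-separated key/value stanzas directly.
The field names and values below are illustrative only, not a proposed
schema.

```python
from email.parser import HeaderParser

# Hypothetical stanza in "<key>: <value>" form.
STANZA = """\
name: geometry
category: Package
shortdesc: Flexible and complete interface to document dimensions
"""


def parse_stanza(text):
    """Parse one RFC 2822 style stanza into a plain dict."""
    msg = HeaderParser().parsestr(text)
    # Repeated keys would collapse here; msg.get_all(key) keeps them all.
    return dict(msg.items())


record = parse_stanza(STANZA)
print(record["name"])
```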

ACK for the limitation in line length.

> Whatever you decide, I think that texlive.tlpdb is quite good and easy
> to parse.  Minor changes are not a big problem, I can adapt the script
> easily.  But I definitely refuse to make the installer dependent on
> external parsers, modules, libraries, tools,... 

I won't decide anything for you. Don't be afraid.

