[tex-live] Improving or rewriting texdoctk

Florent Rougon f.rougon at free.fr
Sun Jul 1 20:00:19 CEST 2007


Frank Küster <frank at kuesterei.ch> wrote:

> Have you talked to Thomas Ruedas about that?  I think he also considered
> this. Maybe he has some code (but you'd have to read Perl, which is said
> to be harder than to write it...)

I didn't talk to Thomas for these reasons:
  1. I don't want to write anything significant in Perl.

  2. If we go the debtags[1] way (see later mails), the required
     changes are far from trivial and thus warrant a full rewrite,
     which, for me, means Python.

  3. If not, well, the main thing left to do is to make texdoctk
     able to assemble the database (/etc/texdoctk/texdoctk.dat) from a
     set of files provided with each LaTeX package. As far as I can
     see, this is a simple matter of concatenating the files, and thus
     is trivial.

     I'm pretty sure you asked Thomas to do that a long time ago and we
     haven't seen any sign of progress since then, so I assumed it
     would be faster for me to reimplement it in Python than to have
     Thomas actually work on this feature.

     [ The only remaining thing I can think of (excluding the switch to
       a debtags-like approach) is tagging the documents with the
       language they are written in. I was about to say it's trivial
       (simply add a language field to the db) when I saw it is already
       partially present in texdoctk.dat:

         l2kurz;Short Introduction to LaTeX (german);latex/general/l2kurz.pdf;tutorial, introduction, basics, german

       But not all documents are tagged this way:

         KOMAg;KOMA-Script User's Guide (german);latex/koma-script/scrguide.pdf;tutorial, introduction
         gerdoc;German styles (old/new orthography);generic/german/gerdoc.dvi;
         caption2a;caption manual (german) (caption);latex/caption/anleitung.dvi;figures, tables


       Well, it would be slightly more work to make the GUI able to
       limit its view to a user-specified set of languages, or to
       display the language a doc is written in as part of the doc
       lists, but this is secondary: I believe we can already take
       advantage of language tags with the Search button. ]
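
A minimal sketch of the "trivial" approach from point 3, in Python. It
assumes each package ships a fragment using the entry format quoted
above (name;description;path;keywords). The fragment directory, the
*.dat suffix, the UTF-8 encoding and all function names are
hypothetical, and lines that do not have exactly four fields are simply
skipped. The matches() helper only illustrates why inconsistent
language tagging matters; it is not a description of how the Search
button in texdoctk is implemented.

```python
from pathlib import Path


def assemble_database(fragment_dir, output_path):
    """Concatenate all *.dat fragments (sorted by name) into one file.

    Assumption: fragments are UTF-8 text in the texdoctk.dat entry format.
    Returns the number of fragments that were concatenated.
    """
    fragments = sorted(Path(fragment_dir).glob("*.dat"))
    with open(output_path, "w", encoding="utf-8") as out:
        for fragment in fragments:
            text = fragment.read_text(encoding="utf-8")
            out.write(text)
            # Keep entries on separate lines even if a fragment
            # lacks a final newline.
            if text and not text.endswith("\n"):
                out.write("\n")
    return len(fragments)


def parse_entry(line):
    """Parse one 'name;description;path;keywords' line into a dict.

    Returns None for blank lines or lines with a different field count.
    """
    fields = line.strip().split(";")
    if len(fields) != 4:
        return None
    name, description, path, keywords = fields
    return {
        "name": name,
        "description": description,
        "path": path,
        "keywords": [k.strip() for k in keywords.split(",") if k.strip()],
    }


def matches(entry, word, keywords_only=False):
    """Case-insensitive match of word against keywords (and description)."""
    word = word.lower()
    if word in (k.lower() for k in entry["keywords"]):
        return True
    return not keywords_only and word in entry["description"].lower()
```

With the four entries quoted above, matches(entry, "german",
keywords_only=True) selects only l2kurz, whereas also looking at the
description selects all four. A dedicated language field would remove
that dependence on how each description happens to be worded.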

The problems with the trivial approach (as opposed to the debtags route)
are the following:

  - a simple static hierarchical classification as currently implemented
    in texdoctk is never completely satisfactory: when writing the
    metadata for packages, we (ideally upstream authors) will have to
    find the proper category, and sometimes:
      * there are several relevant categories;
      * or there is no appropriate category.

  - if all the documents in TL are referenced this way, I think we'll
    end up with one of these problems:
      * too many categories to be usable (imagine starting texdoctk and
        having 50 categories or so to choose from);
      * some categories will have too many documents in them to be
        conveniently browsable.

That's why I was thinking of implementing a debtags-like approach, which
could actually be used to classify LaTeX packages, not only their
documentation (it would be ridiculous to have an excellent way to
classify and browse LaTeX documentation and no corresponding way to
browse through the packages, since most of the time there is a very
clear relationship between one package and one or more doc files).
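
To make the debtags-like idea a bit more concrete, here is a small
Python sketch of tag-based browsing. Each item (a package or a doc
file) carries a set of facet::tag strings, loosely modelled on debtags.
The facet names (topic, doclang), the tags assigned to the example
items and the TagIndex class are all made up for illustration; they are
not an existing vocabulary or API.

```python
from collections import defaultdict


class TagIndex:
    """Hypothetical in-memory index mapping items to sets of tags."""

    def __init__(self):
        self._tags_by_item = {}
        self._items_by_tag = defaultdict(set)

    def add(self, item, tags):
        """Register an item (assumed not to be added twice) with its tags."""
        tags = frozenset(tags)
        self._tags_by_item[item] = tags
        for tag in tags:
            self._items_by_tag[tag].add(item)

    def query(self, *tags):
        """Return the items that carry every one of the given tags."""
        result = set(self._tags_by_item)
        for tag in tags:
            result &= self._items_by_tag.get(tag, set())
        return result

    def refinements(self, *tags):
        """Count, per unselected tag, the items left if it were added."""
        counts = defaultdict(int)
        for item in self.query(*tags):
            for tag in self._tags_by_item[item]:
                if tag not in tags:
                    counts[tag] += 1
        return dict(counts)


if __name__ == "__main__":
    index = TagIndex()
    # Illustrative tag assignments only, not real metadata.
    index.add("koma-script", {"topic::document-class", "doclang::de", "doclang::en"})
    index.add("caption", {"topic::floats", "doclang::de", "doclang::en"})
    index.add("l2kurz", {"topic::introduction", "doclang::de"})
    print(sorted(index.query("doclang::de", "topic::floats")))
    print(index.refinements("doclang::de"))
```

An item can carry several tags, which covers the "several relevant
categories" case, and refinements() lets a GUI offer only the tags that
would narrow the current selection, so neither the tag list nor the
result list has to be browsed in full.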

  [1] http://debtags.alioth.debian.org/

