[tex-live] xindy vs utf8 latex
jschrod at acm.org
Thu May 8 01:58:52 CEST 2014
On 05/07/14 13:42, Arthur Reutenauer wrote:
> You beat me to it, Reinhard ;-)
> Yes, sorting *is* hard.
That's why xindy was created. :-) It was an attempt to formalize
the lessons I learned from creating Multi-lingual MakeIndex (v3.0,
never widely distributed), which created more problems than it
solved.
It was successful insofar as it created a framework that handles
different sort orders worldwide quite well. It still has
deficiencies in encoding handling -- we didn't understand the
encoding problem space when we wrote xindy, back in 2004.
(Yeah, this year is xindy's 10th birthday!!)
The right way to go would actually be to allow an arbitrary input
encoding (including LICR), convert that to UTF-32 (which happens
to be the natural character encoding of the Lisp engine we use),
define all sort orders in terms of Unicode code points (i.e. UTF-32)
instead of UTF-8, and define a separate output re-encoding step.
Definitely on the TODO list, but don't hold your breath. :-)
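To make the three steps concrete, here is a rough sketch in Python. It is
not xindy code: the function names and the toy German-style sort rule are
made up for illustration, and LICR input is not handled. The point is only
that the sort rule is written against code points, so it does not care
which byte encoding the input or output uses.

```python
# Hypothetical sketch; none of these names exist in xindy.

# Toy sort rule defined on Unicode code points, not on encoded bytes:
# umlauts sort like their base letter, sharp s sorts like "ss".
SORT_RULE = {
    ord("ä"): "a",
    ord("ö"): "o",
    ord("ü"): "u",
    ord("ß"): "ss",
}


def sort_key(term):
    """Return (normalized form, original term); the original breaks ties."""
    return (term.lower().translate(SORT_RULE), term)


def sort_index(raw_terms, input_encoding, output_encoding):
    """Sort index terms given as bytes, independent of their encoding."""
    # Step 1: decode the arbitrary input encoding into code points.
    terms = [raw.decode(input_encoding) for raw in raw_terms]
    # Step 2: sort with a rule that only ever sees code points.
    ordered = sorted(terms, key=sort_key)
    # Step 3: re-encode for output as a separate, final step.
    return [term.encode(output_encoding) for term in ordered]
```

Feeding it the same terms in Latin-1 or in UTF-8 gives the same order,
because only steps 1 and 3 ever touch bytes.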
Joachim Schrod, Roedermark, Germany
Email: jschrod at acm.org