[tex-live] packages with characters > 127
Robin.Fairbairns at cl.cam.ac.uk
Thu Dec 31 13:02:37 CET 2009
[nb: i've removed the ctan list from the cc: -- it's quite enough for
just one of us (me) to keep a watching brief, i think. please remove it
if you reply to some other sub branch of the thread.]
Norbert Preining <preining at logic.at> wrote:
> On Do, 31 Dez 2009, Taco Hoekwater wrote:
> > another encoding than the author's document), so in my opinion it would be
> > best if package authors would stay away from 8-bit encodings if they
> > want to be optimally portable to luatex. Just my 2c.
> Umpf, I am not overly content with that.
> I mean yes, in an ideal world all would be iso-2022 or UTF32 or whatever
> universal encoding you select, but it isn't. And we have hundreds of
> packages and files here, and billions out in the world (documents
> of users) with legacy encoding. Not being able to read them for
> the most advanced engine is a bit strange.
> Be forgiving with everything you get, but restrictive and strict
> with what you give yourself.
how does one "forgive" a non-standard 8-bit character when you thought
you were reading utf-8? ignore it? (how does the user find why their
characters didn't appear?) produce an error? (users will complain)
produce a warning? (tex is already verbose enough, so users will
the problem is that, unless the processor is told what encoding the
document uses (of the huge numbers available, standardised,
microsoft-used or largely private), it can't in general determine what
the semantic of any character is.
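to make the three choices concrete, here is a minimal sketch using python's built-in utf-8 decoder rather than any tex engine. the helper name `decode_with_policy` and the sample bytes are made up for illustration; nothing here reflects how luatex or xetex actually behave.

```python
import warnings


def decode_with_policy(data: bytes, policy: str) -> str:
    """Decode bytes that are *assumed* to be UTF-8 (hypothetical helper)."""
    if policy == "ignore":
        # Silently drop invalid bytes: the user never learns why a character vanished.
        return data.decode("utf-8", errors="ignore")
    if policy == "error":
        # Refuse the input: raises UnicodeDecodeError on the first invalid byte.
        return data.decode("utf-8", errors="strict")
    if policy == "warn":
        try:
            return data.decode("utf-8")
        except UnicodeDecodeError as exc:
            # Report the offset, then substitute U+FFFD for each invalid byte.
            warnings.warn(f"invalid UTF-8 at byte offset {exc.start}")
            return data.decode("utf-8", errors="replace")
    raise ValueError(f"unknown policy: {policy!r}")


# 0xEF is i-diaeresis in latin-1, but here it is an incomplete UTF-8 sequence.
sample = b"na\xefve"
```

none of the three policies recovers the intended character: that needs the real encoding, which the bytes alone do not reveal.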
> I cannot propose a solution, and if there is no way around that, so
> it be. But we should try our best.
i fear there is no truly general solution. perhaps the utf-8 gobbling
engines should provide a switch, between the uses above?
(one problem we face is that we can't guarantee that a document and its
packages are in the same encoding, so a switch saying, for example,
--language=iso-8859-6 doesn't completely solve the problem, and could be
vastly distracting if a package was in iso 8859-8.)
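a small python sketch of why one global switch falls short: the same byte means different things under the two encodings named above, so the encoding would have to be known per file. the function `decode_files` and the file names in the usage note are hypothetical, and a per-file table is only one possible approach, not something any engine is known to offer.

```python
# The same 8-bit byte decodes to different characters in the two encodings.
raw = b"\xe0"
as_arabic = raw.decode("iso-8859-6")  # U+0640 ARABIC TATWEEL
as_hebrew = raw.decode("iso-8859-8")  # U+05D0 HEBREW LETTER ALEF


def decode_files(files, encodings, default="utf-8"):
    """Decode each file with its own declared encoding (hypothetical helper).

    files:     mapping of file name -> raw bytes
    encodings: mapping of file name -> encoding name; unlisted files use default
    """
    return {
        name: data.decode(encodings.get(name, default))
        for name, data in files.items()
    }
```

for example, `decode_files({"doc.tex": b"\xe0", "pkg.sty": b"\xe0"}, {"doc.tex": "iso-8859-6", "pkg.sty": "iso-8859-8"})` yields two different characters from identical bytes, which a single `--language` setting could not do.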