[tex-live] Problems with non-7bit characters in filename

Zdenek Wagner zdenek.wagner at gmail.com
Sun Jul 6 10:52:50 CEST 2014

2014-07-06 4:05 GMT+02:00 Reinhard Kotucha <reinhard.kotucha at web.de>:
> On 2014-07-05 at 22:33:26 +0100, Klaus Ethgen wrote:
>  > Did you read my Post and the one of Robin Fairbairns? I didn't
>  > wrote any about "useless". You said that.
>  >
>  > I did just show some technical facts about UTF-8. Not more and not
>  > less.
>  >
>  > Pleas stay objective.
> Klaus, I must admit that I understand why people can't follow you.
> The purpose of LuaTeX is to provide a TeX system which supports
> Unicode.  It works like a charm but you deliberately broke it because
> you are against Unicode.  So why do you use LuaTeX at all?
> And why should LuaTeX support Latin1?  Why don't you simply stick to
> ASCII?  You obviously believe in these crappy national 8-bit
> encodings.  But please keep in mind that most people (Chinese,
> Koreans, ...) need more than 256 characters.
This is a problem even for Eastern Europe. Some 30 years ago Latin1 was
hardwired in CGA and Hercules monitors as well as in printers and
could not be changed. At that time the Kamenický brothers invented an
encoding in which č was displayed as ç etc., so that Czech and Slovak
texts were quite readable. They also created a keyboard driver for it
as a TSR for MS-DOS. Newer CGA and EGA monitors allowed characters to
be redefined, so they created files for monitors and printers that
made it possible to use Czech and Slovak characters. This unofficial
encoding remained in use for more than 20 years (and TeX support files
for it probably still exist). IBM came with the official CP852 code
page, Unix with ISO 8859-2, and later Windows came with CP1250, while
the black Windows text console still uses CP852 (even now). Thus I
always had to convert files from one encoding to another. I still have
to do it because not all users have switched to Unicode.
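
To illustrate the kind of conversion I mean, here is a minimal Python
sketch. The helper name transcode and the sample sentence are made up
for this example; the codec names are the ones Python ships with. It
only shows that the same Czech text is a different sequence of octets
in each code page and that converting means decoding to characters
first and encoding again afterwards.

```python
def transcode(data: bytes, source: str, target: str) -> bytes:
    """Hypothetical helper: re-encode octets from one code page to another."""
    # Step 1: octets -> characters, using the charset the data was written in.
    text = data.decode(source)
    # Step 2: characters -> octets, using the charset the recipient expects.
    return text.encode(target)


# Sample text (an assumption for this sketch) with typical Czech letters.
text = "Příliš žluťoučký kůň"

# The same characters give different octets in each legacy encoding.
for codec in ("cp852", "iso8859_2", "cp1250"):
    print(codec, text.encode(codec).hex())

# Convert a "DOS" file body to the "Windows" code page.
as_cp852 = text.encode("cp852")
as_cp1250 = transcode(as_cp852, "cp852", "cp1250")
print(as_cp1250.decode("cp1250"))
```

The conversion only works if the source charset is known; the octets
themselves do not say which code page they belong to.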

> Why, on earth, should LuaTeX make an exception for you?
> It's a matter of fact that things don't work for you because you broke
> your system deliberately.  You expect that someone provides a
> workaround for a problem which you can avoid easily.  Everything works
> fine on Unix.  Admittedly, on Windows the situation is worse.
I understand what Klaus means. He has a Latin1 locale and types

luatex something

where "something" contains umlaut characters in Latin1, but luatex
wants Unicode. His installation is simple because he probably does not
access disks mounted from other network sources, so the charset of
his file system is the same as his system locale (but this need not be
true in general). Thus what he wants, just passing the octets from the
command line as octets to the file system, is wrong in principle: it
cannot work, and it will cause even more trouble if files are shared
over a heterogeneous network. It would be a valid feature request to
ask luatex and xetex to convert the command line according to the
system locale and, when opening files, to convert the file name
according to the filesystem charset. Whether it gets implemented
depends on whether someone considers such a feature request important
enough. What I see from the posts is that Klaus does not understand
the difference between octets and characters.
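
The difference can be shown with a small Python sketch. This is not
how luatex or xetex work internally; the function name
command_line_to_fs_name and the file name are assumptions made for
the example. It decodes the command-line octets according to one
charset (the locale) and encodes the resulting characters according
to another (the file system).

```python
def command_line_to_fs_name(raw: bytes, locale_charset: str, fs_charset: str) -> bytes:
    """Hypothetical conversion of a command-line argument to a file-system name."""
    # Octets -> characters: interpret the argument in the user's locale.
    name = raw.decode(locale_charset)
    # Characters -> octets: express the same name in the file system's charset.
    return name.encode(fs_charset)


# What a user with a Latin1 locale actually types, seen as octets.
latin1_arg = "Grüße.tex".encode("latin-1")
print(latin1_arg)            # ü and ß are one octet each

# The same characters on a UTF-8 file system are different octets.
utf8_name = command_line_to_fs_name(latin1_arg, "latin-1", "utf-8")
print(utf8_name)             # ü and ß are two octets each

# Passing the Latin1 octets through unchanged does not give valid UTF-8.
try:
    latin1_arg.decode("utf-8")
    passed_through_ok = True
except UnicodeDecodeError:
    passed_through_ok = False
print("raw octets valid as UTF-8:", passed_through_ok)
```

In a real program the two charsets would have to be obtained from the
system, for example in Python with locale.getpreferredencoding(False)
and sys.getfilesystemencoding(); here they are passed explicitly so
that the two roles stay visible.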

> Regards,
>   Reinhard
> --
> ------------------------------------------------------------------
> Reinhard Kotucha                            Phone: +49-511-3373112
> Marschnerstr. 25
> D-30167 Hannover                    mailto:reinhard.kotucha at web.de
> ------------------------------------------------------------------

Zdeněk Wagner
