[tex-live] Problems with non-7bit characters in filename
Klaus+texlivelist at ethgen.ch
Sat Jul 5 12:18:41 CEST 2014
-----BEGIN PGP SIGNED MESSAGE-----
On Sat, 5 Jul 2014 at 2:24, Reinhard Kotucha wrote:
> On 2014-07-04 at 09:42:45 +0100, Robin Fairbairns wrote:
> > Reinhard Kotucha <reinhard.kotucha at web.de> wrote:
> > > > While latin1 can include every possible character, UTF-8 cannot.
> > >
> > > This is definitely wrong. The opposite is true.
> > no, it's correct: iso 8859-1 has no "forbidden" octets (it does, iirc,
> > have some unassigned ones)
> > whereas
> > utf-8 rejects some octets in some contexts, since it's generating a
> > 32-bit glyph from 8-bit input. (it's complicated. honest.)
> True, but we were talking about characters and a character is not
> necessarily an octet. I suppose that the confusion arose because we
> don't use the term 'character' in the same way.
Sorry, I mostly mean characters in the sense of C programs. Octet would
be the correct term. In the case of UTF-8 it would be better to say
»code points« instead of »characters«.
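
To make the octet/code point distinction concrete, here is a small
Python sketch. The sample character and the variable names are only
illustrative assumptions, not something from the thread.

```python
# One code point, U+00E9 (e with acute accent), written as an escape so the
# example does not depend on the encoding of this file.
text = "\u00e9"

# The same code point is a different number of octets in each encoding.
latin1_octets = text.encode("latin-1")
utf8_octets = text.encode("utf-8")

codepoint_count = len(text)              # counts code points
latin1_octet_count = len(latin1_octets)  # counts octets
utf8_octet_count = len(utf8_octets)      # counts octets

print(codepoint_count, latin1_octet_count, utf8_octet_count)
```

A C program that counts `char`s is counting octets, so it would report
a different length for the latin1 and the UTF-8 form of the same name.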
And to clarify what I meant earlier (even though Robin already made it
clear), UTF-8 has forbidden _octets_ while latin1 has none. Even worse,
if I remember correctly, the forbidden octets cannot even be translated
from latin1 to UTF-8. (I mean octets that would show up as garbage in
the terminal when using latin1; but they could at least be displayed
and used to rename the file.)
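
The "forbidden octets" point can be checked with a rough Python sketch
like the one below. The helper `decodes_as` and the file name are
hypothetical, made up for illustration, and the check only looks at
each octet on its own, not at octets inside a longer sequence.

```python
def decodes_as(octets: bytes, encoding: str) -> bool:
    """Return True if the octets are accepted by the given encoding."""
    try:
        octets.decode(encoding)
        return True
    except UnicodeDecodeError:
        return False


# Every possible single octet, 0x00 through 0xFF.
all_octets = [bytes([value]) for value in range(256)]

# Octets that each decoder refuses when they stand alone.
latin1_rejected = [o for o in all_octets if not decodes_as(o, "latin-1")]
utf8_rejected_alone = [o for o in all_octets if not decodes_as(o, "utf-8")]

# Hypothetical file name holding one latin1 octet (0xE9).
filename_octets = b"caf\xe9.tex"
readable_as_latin1 = decodes_as(filename_octets, "latin-1")
readable_as_utf8 = decodes_as(filename_octets, "utf-8")

print(len(latin1_rejected), len(utf8_rejected_alone))
print(readable_as_latin1, readable_as_utf8)
```

Python's latin-1 decoder accepts all 256 octets, while its UTF-8
decoder refuses any lone octet from 0x80 upwards, which is why such a
file name reads fine in a latin1 locale and fails in a UTF-8 one.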
Klaus Ethgen http://www.ethgen.ch/
pub 4096R/4E20AF1C 2011-05-16 Klaus Ethgen <Klaus at Ethgen.de>
Fingerprint: 85D4 CA42 952C 949B 1753 62B3 79D0 B06F 4E20 AF1C
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
-----END PGP SIGNATURE-----