[texhax] Using accented characters in source

Ulrike Fischer news3 at nililand.de
Sun May 2 15:39:01 CEST 2010

Am Sat, 1 May 2010 13:16:16 -0400 (EDT) schrieb Michael Barr:

> Now I have a simple way of actually entering an accented character in my 
> source code (using AllChars) I began to wonder if it is easy to get TeX to 
> interpret them correctly.  This is obviously a Windows-specific question. 

Sorry, but you are a bit late ;-) I have been entering accented
chars directly in my sources for more than 10 years. The inputenc
package has existed since 1994.

> If I compile the file
> \documentclass{article}
> \usepackage[french]{babel}
> \begin{document}
> Université
> \end{document}
> with or without the second line, the output is simply Universit.  This 
> doesn't surprise me because the Windows code page does not match any of 
> the font encodings, as far as I know.  Still I expected that the é would 
> generate some output.  It is hex 82 and, from the table on page 261 of The 
> LaTeX Companion, first edition, I would have expected C with an acute 
> accent.

You are using OT1-encoded fonts. These fonts have only 128
characters. The last position with a glyph is hex 7F. So é is mapped
to "nothingness".
With \usepackage[T1]{fontenc} you would use fonts with 256 chars and
you would get some output.
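
A minimal sketch of such a preamble, combining inputenc and fontenc.
The utf8 option is an assumption about how the editor saves the file;
it has to match the real file encoding. If the editor really writes é
as hex 82, which is its position in code pages 437 and 850, the option
would be cp437 or cp850 instead.

```latex
% Assumption: this file is saved as UTF-8. Change the inputenc
% option (e.g. latin1, cp1252, cp437, cp850) to match your editor.
\documentclass{article}
\usepackage[utf8]{inputenc} % tells LaTeX how to read the input bytes
\usepackage[T1]{fontenc}    % selects 256-character T1-encoded fonts
\usepackage[french]{babel}
\begin{document}
Université
\end{document}
```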

> It is not really important; only for a few foreign words in an English 
> language text, but I am still curious.  I can always say Universit\'e (or 
> I could make the è active and define it to be that, but then it wouldn't 
> hyphenate).

Hyphenation has nothing to do with the input. It doesn't matter if
you use inputenc + é or \'e (inputenc makes é active and maps it to
\'e anyway). If you want hyphenation after chars like é, ä, ö, ü
you should use a suitable font encoding (\usepackage[T1]{fontenc})
which maps this input to single glyphs instead of constructing the
glyph by putting an accent above a glyph.
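
One way to check this yourself is \showhyphens, which writes the
hyphenation points TeX finds to the log file. The sketch below assumes
a UTF-8 source file and an installed French babel setup; the test
words are only examples. Compile once as is and once with the fontenc
line commented out, then compare the two log entries.

```latex
% Assumption: this file is saved as UTF-8.
\documentclass{article}
\usepackage[utf8]{inputenc}
\usepackage[T1]{fontenc} % comment out to compare with the default OT1
\usepackage[french]{babel}
\begin{document}
% The hyphenation points appear in the .log file, not in the output.
\showhyphens{Université généralement}
Université
\end{document}
```
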
Ulrike Fischer 
