# [texhax] The details of \csname, in this specific case

Patrick Rutkowski pmr2141 at columbia.edu
Sat Feb 23 02:27:29 CET 2013

So, I found a nifty little hack online, and I adapted it so that I
can type my TeX sources in UTF-8 and have macrons come out correctly.
The code is pasted toward the bottom of this message.

Before I get to my question, I should first head off a few obvious
comments: Yes, I know XeTeX exists. But I really like xdvi. For some
reason xdvi finds XeTeX's dvi output unworkable, and so I'm sticking
with straight-up TeX so I can keep using xdvi.

Now, on to my actual questions. The TeX code below works, but I don't
know exactly how. I understand what the \catcode is doing, and I
understand what \expandafter is doing. Naturally, I'm also very
familiar with how UTF-8 internals work, with variable-length sequences
and all that good stuff. What I don't quite understand is what is
inside of the \csname.

1) I would have expected to have to encode c4 and c5 as something like
^^c4 and ^^c5 inside of the \csname, but somehow that is not
required.

2) I thought that \csname took only "character tokens," but wouldn't
something like "c4" be two separate character tokens, first "c" and
then "4"?

3) Moreover, what is that colon doing in between the c4 and the #1?

4) How exactly does TeX come to interpret the #1 as a "character
token"? Aren't things above value 127 labeled "invalid" by default?

5) And finally, why exactly is the backquote (`) needed before the
^^c4 for the \catcode, but not in the \def?
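
Question 4, at least, can be probed directly. Here is a minimal
sketch, assuming plain TeX on an 8-bit engine (tex or pdftex, not
XeTeX or LuaTeX), that asks TeX which category code it currently
assigns to a byte. The two byte values are my own choice for
illustration: 0x81 is the second byte of the UTF-8 encoding of ā, and
0x7F (DEL) is there only for comparison.

```tex
% Print the current category codes to the terminal and the log file.
% "81 is TeX's hexadecimal notation for the number 129 (byte 0x81).
\message{catcode of byte 0x81 = \the\catcode"81}
% DEL (byte 0x7F), for comparison with a byte in the ASCII range.
\message{catcode of byte 0x7F = \the\catcode"7F}
\bye
```

A reported value of 15 means "invalid character", 12 means "other",
and 13 means "active".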

I've tried to scour the TeXbook for these answers, but I've come up
short.

=== [ BEGIN PASTE ] ===
\catcode`\^^c4=13
\catcode`\^^c5=13
\def^^c4#1#2{\expandafter\def\csname c4:#1\endcsname{#2}}
\def^^c5#1#2{\expandafter\def\csname c5:#1\endcsname{#2}}
ā{\=a}
ē{\=e}
ī{\=\i}
ō{\=o}
ū{\=u}
Ā{\=A}
Ē{\=E}
Ī{\=I}
Ō{\=O}
Ū{\=U}
\def^^c4#1{\csname c4:#1\endcsname}
\def^^c5#1{\csname c5:#1\endcsname}

āēīōūĀĒĪŌŪ

\bye
=== [ END PASTE ] ===
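
In case it helps anyone answering, here is a cut-down diagnostic
sketch of the same hack, assuming an 8-bit engine (tex or pdftex) and
a source file saved as UTF-8. It defines only ā, then asks TeX to
print the control sequence name that the \csname built and the meaning
stored under it. Writing the second byte explicitly as ^^81 is my own
assumption about how to name that control sequence by hand; it is not
part of the original hack.

```tex
% Cut-down version of the paste: one lead byte, one character.
\catcode`\^^c4=13
\def^^c4#1#2{\expandafter\def\csname c4:#1\endcsname{#2}}
ā{\=a}% bytes C4 81: active C4 takes #1 = byte 0x81 and #2 = \=a
% Print the name that \csname builds from "c4:" plus byte 0x81 ...
\message{name: \expandafter\string\csname c4:^^81\endcsname}
% ... and the meaning stored under that name.
\message{meaning: \expandafter\meaning\csname c4:^^81\endcsname}
\bye
```

How the byte 0x81 is displayed in the terminal output (raw, or in ^^
notation) may depend on how the TeX installation is configured.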

Many thanks
-Patrick