[texhax] read argument until EOL

Heiko Oberdiek heiko.oberdiek at googlemail.com
Fri Jan 7 18:25:53 CET 2011


On Fri, Jan 07, 2011 at 04:14:55PM +0000, Philip Taylor (Webmaster, Ret'd) wrote:

> Heiko Oberdiek wrote:
> 
> >"1" might have the wrong uccode different than 0 or `1.
> >(But I would consider it as quite small risk.)
> 
> I think that this code addresses that point, Heiko, but of
> course there are many other things that assume a standard
> environment : for example, should I also explicitly (re-)set
> the \catcode of "~" ?  If so, what about the all the other
> characters used, all of which are assumed in my version to
> have their standard \catcode s at the point of elaboration ?
> How far should one go, in programming defensively ?

There are two points in time:
* Definition time
* Usage time

Definition time
* Of course not all catcodes can be controlled, because
  otherwise it would not be possible to write a command.
  In my packages, especially the packages that are also compatible
  with plain TeX or iniTeX, I store the catcode values of most
  characters at the beginning of the package, set them to known
  values, and restore them at the end.
* Usually I also save and restore other values such as \uccode and
  \escapechar.
  Recently I had a bug report for hyperref, because someone
  had loaded hyperref with \endlinechar=-1.
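
A minimal sketch of this save/set/restore pattern in plain TeX
follows. It is an illustration only, not code taken from any
particular package: the name \mypkgRestore is made up, and only
a few characters and parameters are handled.

```tex
% Record the current values in a restore macro.
% \mypkgRestore is a hypothetical name used only for this sketch.
\edef\mypkgRestore{%
  \catcode`\noexpand\~=\the\catcode`\~\relax
  \catcode`\noexpand\@=\the\catcode`\@\relax
  \uccode`\noexpand\~=\the\uccode`\~\relax
  \endlinechar=\the\endlinechar\relax
  \escapechar=\the\escapechar\relax
}
% Set known values for the duration of the package.
\catcode`\~=13 % active
\catcode`\@=11 % letter, for internal macro names
\uccode`\~=0 %
\endlinechar=13 %
\escapechar=92 %

% ... package definitions go here ...

% At the very end of the package: restore what the user had.
\mypkgRestore
```

The sketch itself still relies on the usual catcodes of `\', `{',
`}', `%', `=' and the backquote, which is the unavoidable
remainder mentioned above.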

Usage time
* Because macros might be called inside verbatim environments,
  great care is needed. IMHO the best strategy is to avoid
  catcode-dependent (and similar) stuff at macro usage time.

The catcode of `~' belongs to the definition time.
Thus it depends on the conventions of the format in use whether
the catcode of `~' must be set explicitly. I would set the
catcode (and restore it afterwards), because I do not know
of a format that explicitly requires the catcode of `~'
to be active and forbids changing it.

For example,

>     \def \eolsection
> 	{%
> 		\begingroup
>     		\catcode `\^^M = \active
> 		\uccode `\1 = 0
> 		\uccode `\~ = `\^^M
> 		\uppercase {\def \innereolsection ##1~}{\endgroup \message {##1}}\relax
> 		\innereolsection
> 	}

can be rewritten in many ways to avoid \uppercase and `~'.
In the following example I have also added \protected to
macro \eolsection if e-TeX is available.

\begingroup
  \def\firstofone#1{#1}%
  \catcode`\^^M = \active %
\firstofone{\endgroup %
  \def\innereolsection#1^^M%
}{%
  \message{#1}%
}
\begingroup\expandafter\expandafter\expandafter\endgroup
\expandafter\ifx\csname protected\endcsname\relax
\else
  \protected
\fi
\def\eolsection{%
  \begingroup
    \catcode `\^^M = \active
  \expandafter\endgroup
  \innereolsection
}

\eolsection AAA

\eolsection BBB

xxx
\eolsection CCC
yyy

\end

It looks more complicated. But I also wanted:
* to avoid global definitions,
* to have a clean definition text of \innereolsection
  without changed `^^M',
* to have a better runtime complexity of \eolsection.

Of course I prefer my solution with \eolgrab, because it
makes the feature independent of \section and the
definition of \eolsection can be done at user level:

\def\eolsection{\eolgrab\section}
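
The definition of \eolgrab is not repeated in this message. As a
rough sketch only, a macro of that kind could be built with the
same technique as above. The names \myeolgrab and \myeolgrabinner
are hypothetical and chosen so that they cannot be confused with
the real \eolgrab, whose definition may differ.

```tex
% Hypothetical sketch: \myeolgrab<command> passes the rest of the
% current line to <command> as one braced argument.
\begingroup
  \def\firstofone#1{#1}%
  \catcode`\^^M = \active %
\firstofone{\endgroup %
  \def\myeolgrabinner#1#2^^M%
}{%
  #1{#2}%
}
\def\myeolgrab{%
  \begingroup
    \catcode `\^^M = \active
  \expandafter\endgroup
  \myeolgrabinner
}

% Usage with a stand-in \section for plain TeX (an assumption
% made only so that the sketch can be tried on its own):
\def\section#1{\message{[#1]}}
\def\eolsection{\myeolgrab\section}
\eolsection AAA
```

Here #1 is the command to call and #2 is everything up to the
end of the line, so the end-of-line handling stays in one place
and does not depend on \section.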

Yours sincerely
  Heiko Oberdiek

