[texhax] determining LICR base char?

Tue Jul 19 23:37:30 CEST 2016

Given a control sequence (or argument, or literal character, or
whatever) that holds a possibly-accented character, written either
literally or using accent control sequences, is it possible to determine
the "base" character in LaTeX?  For example, a hypothetical \getbasechar
macro might work like this:

\def\foo{\'a}
\getbasechar\foo -> a

\def\bar{^^c3^^a1} % UTF-8 for aacute
\getbasechar\bar -> a

The same should hold, and this is probably the most common case, for the
actual binary bytes of UTF-8 (or whatever other encoding, assuming
inputenc is set up properly).
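
For the control-sequence case alone, one naive approach (a sketch only,
not something taken from the LaTeX sources) would be to expand the
argument inside a group in which the accent commands are redefined to
return just their argument.  The temporary name \basechar@tmp and the
choice of accents are assumptions for illustration: only the five
accents listed are covered, and 8-bit or UTF-8 input is not addressed
at all.

```latex
\documentclass{article}
\makeatletter
% Hypothetical \getbasechar: fully expand the argument while the listed
% accent commands are locally redefined to drop the accent.
\newcommand*\getbasechar[1]{%
  \begingroup
    \def\'##1{##1}% acute
    \def\`##1{##1}% grave
    \def\^##1{##1}% circumflex
    \def\"##1{##1}% umlaut/dieresis
    \def\~##1{##1}% tilde
    \edef\basechar@tmp{#1}% expand with the accents stripped
  % expand the result once, then close the group so the
  % redefinitions do not leak out
  \expandafter\endgroup\basechar@tmp
}
\makeatother
\begin{document}
\def\foo{\'a}
\getbasechar\foo
\end{document}
```

This handles \foo above, but it only typesets the result rather than
returning it in a reusable form, and it would have to be extended by
hand for every other accent command (\c, \v, and so on).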

Any ideas?  I looked through the LaTeX documentation and sources,
general web searches, etc., but confess I failed to find the answer.  It
must be there somewhere ...  (There were references to section 7.11.2 in
the LaTeX Companion, but unfortunately I don't have that handy.)

This comes up in tex4ht, where lettrine.4ht wants to insert some CSS to
kern against a big dropped A (for example), which should also apply to
Aacute, Agrave, whatever.

It's not a big deal, just thought I'd ask.  --thanks, karl.