Print the codepoint of a unicode glyph represented by a latex command.

William F Hammond hmwlfsr at yahoo.com
Mon Jan 17 22:39:02 CET 2022


William F Hammond via texhax <texhax at tug.org> writes:

> Hongyi Zhao <hongyi.zhao at gmail.com> writes:
>
>> . . . 
>> \textminus, \textnineoldstyle, \textohm, \textonehalf, \textoneoldstyle.
>
> . . .
>
> Tex4ht, which runs using latex, will generate
>
>      <p class="noindent" ><span 
>         class="tcrm-1000">&#x2212;</span>, <span 
>         class="tcrm-1000">&#xF739;</span>, <span 
>         class="tcrm-1000">&#x2126;</span>, <span 
>         class="tcrm-1000">&#x00BD;</span>, <span 
>         class="tcrm-1000">&#xF731;</span>.</p>
>

Now, after looking at this in a browser where the
\text?oldstyle characters failed to render, I remember that
the codepoints from U+E000 to U+F8FF are a private use area
and, therefore, that the values above for \textoneoldstyle
(U+F731) and \textnineoldstyle (U+F739) can work only with
special arrangements, such as a font that places those
glyphs at those codepoints.  I don't know what the values
for these should be.  TeX4ht's use of the private use area
suggests to me that the \text?oldstyle characters do not
have public codepoints.
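
For anyone who wants to check such values quickly, here is a
minimal Python sketch (my own illustration, not part of tex4ht)
that tests whether a codepoint lies in the U+E000..U+F8FF private
use area and looks up its standard Unicode name, if it has one.
The function names is_private_use and describe are made up for
this example.

```python
import unicodedata

# Private Use Area of the Basic Multilingual Plane.
PUA_START, PUA_END = 0xE000, 0xF8FF


def is_private_use(codepoint: int) -> bool:
    """Return True if the codepoint lies in U+E000..U+F8FF."""
    return PUA_START <= codepoint <= PUA_END


def describe(codepoint: int) -> str:
    """Format a codepoint with its Unicode name, if it has one."""
    # Private use codepoints have no standard name, so supply a default.
    name = unicodedata.name(chr(codepoint), "<no standard name>")
    tag = "private use" if is_private_use(codepoint) else "public"
    return f"U+{codepoint:04X} {tag}: {name}"


# The five values from the tex4ht output quoted above.
for cp in (0x2212, 0xF739, 0x2126, 0x00BD, 0xF731):
    print(describe(cp))
```

Only the two \text?oldstyle values should be reported as private
use; the other three have ordinary Unicode names.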

I decided to run your example through xelatex (using
textcomp and not fontspec).  It used the Unicode font
LMRoman.  Examining that font, I found \textoneoldstyle in
the private use area at U+F644 and \textnineoldstyle at
U+F64C.  I then modified TeX4ht's codes to use those values
and modified TeX4ht's HTML output to invoke my webfont
version of Latin Modern Roman.  (I made the webfont from
lmroman12-regular.otf, as found in texlive, using Jonathan
Kew's sfnt2woff, and I serve it through my private web
server.)  It works.
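
I changed the codes inside tex4ht itself.  A different route,
shown here only as a rough sketch, would be to leave tex4ht alone
and rewrite the numeric character references in its HTML output
afterwards.  The mapping below uses the values mentioned in this
thread; the names REMAP and remap_entities are hypothetical, and
the approach assumes the page is displayed with a font, such as
Latin Modern Roman, that has the old-style digits at the target
codepoints.

```python
import re

# tex4ht value -> Latin Modern Roman private use value (from this thread).
REMAP = {
    0xF731: 0xF644,  # \textoneoldstyle
    0xF739: 0xF64C,  # \textnineoldstyle
}

# Hexadecimal numeric character references such as &#xF739;
ENTITY = re.compile(r"&#x([0-9A-Fa-f]+);")


def remap_entities(html: str) -> str:
    """Rewrite the references listed in REMAP; leave all others untouched."""

    def swap(match: re.Match) -> str:
        codepoint = int(match.group(1), 16)
        if codepoint not in REMAP:
            return match.group(0)
        return f"&#x{REMAP[codepoint]:04X};"

    return ENTITY.sub(swap, html)


sample = (
    '<span class="tcrm-1000">&#xF739;</span>, '
    '<span class="tcrm-1000">&#x2126;</span>'
)
print(remap_entities(sample))
```

The page would still need a stylesheet rule that selects the
matching webfont, since private use values mean nothing without
the font that defines them.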

                              -- Bill


Email: hmwlfsr at yahoo.com
       gellmu at gmail.com
https://www.facebook.com/william.f.hammond
http://www.albany.edu/~hammond/

𝑹𝒆𝒎𝒆𝒎𝒃𝒆𝒓, 𝒓𝒆𝒎𝒆𝒎𝒃𝒆𝒓 𝒕𝒉𝒆 𝒔𝒊𝒙𝒕𝒉 𝒐𝒇 𝒕𝒉𝒆 𝒚𝒆𝒂𝒓.




