Print the codepoint of a Unicode glyph represented by a LaTeX command.

Michal Hoftich michal.h21 at gmail.com
Mon Jan 17 23:49:00 CET 2022


Hi Bill

> Now, after looking at this in a browser where the
> \text?oldstyle characters failed, I am remembering that the
> codepoints from U+E000 to U+F8FF are a private use area,
> and, therefore, the values above for \textoneoldstyle and
> \textnineoldstyle can only work with special arrangements.
> I don't know what the values for these should be.  Tex4ht's
> invocation of the private use area suggests to me that the
> \text?oldstyle characters do not have public codepoints.

I would say that this is an error in TeX4ht's support for the particular
font used in this document. We have tables that map character codes to
Unicode for the fonts supported by TeX4ht. Originally, these mappings
were created by hand, but we can now generate them automatically using
Htfgen:

https://github.com/michal-h21/htfgen

The Unicode characters are based on the glyph names used in the fonts.
We use glyph lists from various sources, and some of them map glyphs to
the Unicode Private Use Area (PUA). That is bad, because such characters
cannot work reliably in browsers. The current version of Htfgen ignores
such mappings, but this font mapping was created by an older version
that didn't ignore them.
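
To illustrate the idea of ignoring such mappings, here is a minimal
Python sketch. It is not Htfgen's actual code: the function names and
the sample glyph table are hypothetical, and the PUA codepoints in the
sample are only illustrative. Only the private-use ranges themselves
come from the Unicode standard.

```python
# Hypothetical sketch, not taken from Htfgen: drop glyph-to-Unicode
# mappings whose target lies in a Unicode private-use range.

# Private-use ranges defined by the Unicode standard (inclusive bounds).
PUA_RANGES = (
    (0xE000, 0xF8FF),      # Private Use Area in the Basic Multilingual Plane
    (0xF0000, 0xFFFFD),    # Supplementary Private Use Area-A
    (0x100000, 0x10FFFD),  # Supplementary Private Use Area-B
)


def is_private_use(codepoint):
    """Return True if the codepoint falls in any private-use range."""
    return any(low <= codepoint <= high for low, high in PUA_RANGES)


def drop_pua_mappings(glyph_to_codepoint):
    """Return a copy of the mapping without private-use targets."""
    return {
        name: cp
        for name, cp in glyph_to_codepoint.items()
        if not is_private_use(cp)
    }


# Illustrative glyph table (assumed values, not read from a real font).
sample = {
    "one": 0x0031,
    "oneoldstyle": 0xF731,
    "nine": 0x0039,
    "nineoldstyle": 0xF739,
}

for name, cp in sorted(drop_pua_mappings(sample).items()):
    print(f"{name}: U+{cp:04X}")
```

With the sample table, only the entries for "one" and "nine" survive.
The old-style glyphs are left without a Unicode value, so they would
still need a fallback such as the regular digits.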

This is why PUA characters are used. I will fix the mapping in TeX4ht
to use regular digits instead. Old-style numerals can then be requested
using CSS (for example, font-variant-numeric: oldstyle-nums).

Best regards,
Michal

