[tex4ht] [bug #241] grave accent letter ` (hex 60) changes to left single quotation mark (hex 0xE2 0x80 0x98)

Karl Berry karl at freefriends.org
Sun Jan 18 00:45:22 CET 2015


    btw, I think Nasser has found many errors in .htf files in the last
    two weeks, and for many fonts the .htf files are missing.

I don't doubt it.  No .htf has been created (in the distribution anyway)
since Eitan died.  It would be great to cover some of the new fonts.

    my idea is the following: we can take the property list of a tfm file

I doubt the encoding info in the TFM file is especially reliable even in
the few cases where it's present.  (Ditto afm2pl.)
    
    and find the PostScript name of the character in the corresponding
    .enc file. we can get the Unicode code point for the PostScript name
    from the glyphlist.txt and texglyphlist.txt files included in the TeX
    distribution.

Wow, quite a project.
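
For what it's worth, the lookup chain described in the quote (slot ->
PostScript name from the .enc file -> code point from glyphlist.txt)
could be sketched roughly as follows.  This is only an illustration,
not anything tex4ht does today: the function names and the inline
sample data are made up for the example, and taking the first
alternative when an entry lists several comma-separated values is an
assumption on my part.

```python
import re

def parse_glyphlist(text):
    """Map glyph name -> Unicode string from glyphlist.txt-style lines
    of the form 'name;XXXX' or 'name;XXXX YYYY' ('#' starts a comment)."""
    table = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, sep, codes = line.partition(";")
        if not sep:
            continue
        # Assumption: if alternatives are comma-separated, use the first.
        first = codes.split(",")[0]
        try:
            table[name] = "".join(chr(int(h, 16)) for h in first.split())
        except ValueError:
            continue  # skip lines whose code points are not valid hex
    return table

def parse_enc(text):
    """Return the glyph names, in slot order, of the first PostScript
    encoding vector ('/Name [ /a /b ... ] def') found in the text."""
    text = re.sub(r"%.*", "", text)  # drop PostScript comments
    m = re.search(r"\[(.*?)\]", text, re.S)
    if m is None:
        return []
    return re.findall(r"/([^\s/\[\]]+)", m.group(1))

def slot_to_unicode(enc_names, glyphs):
    """Map slot number -> Unicode string; None marks an unknown name."""
    return {slot: glyphs.get(name)
            for slot, name in enumerate(enc_names)
            if name != ".notdef"}

# Made-up sample data, standing in for real files.
SAMPLE_GLYPHLIST = "# comment\ngrave;0060\nquoteleft;2018\nfi;FB01\n"
SAMPLE_ENC = "% comment\n/SampleEncoding [\n/grave /quoteleft /.notdef /fi\n] def\n"

mapping = slot_to_unicode(parse_enc(SAMPLE_ENC),
                          parse_glyphlist(SAMPLE_GLYPHLIST))
```

Slots that come back as None are the ones that would still need a
human to look at them.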

    for these FONTSPECIFIC fonts I have to use
    Google to find out the encoding actually used

For fonts created through the otftotfm process, i.e., nearly everything
done by Michael Sharpe and Bob Tennent, who have contributed many of
the new fonts (Sharpe did newtx), there should be an opaquely-named
(a bunch of hex chars) .enc file in the font package corresponding to
every tfm.  As I understand it.

Anyway, in general, I expect that talking to the package developer or
looking at the sources would be more fruitful than random web searches.
(Not to say it'll be easy, no matter what.)

    but sometimes two or more glyphs are used to create a character
    (mainly accents), so we can't get the PostScript name of such a
    character even if we knew the encoding of the referenced glyphs

All I can think of is to have heuristics or a table saying that a
composition of character X + character Y in font F means Unicode point
U.  Since it's generally about accents, the combinations should be
finite, and repeated through many different fonts.

Thanks,
K

