[Fontinst] Mapping all diacritics to actual glyphs rather than composites

Christopher Adams chris at raysend.com
Tue Jan 19 15:27:17 CET 2010


Lars,

Thanks again for all of your patience and help. I have found a workable
solution for my needs: getting the ogoneks and commaaccents to output
correctly using pdflatex.

To summarize for those just tuning in, I overcame the limitations of both
TeXBase1Encoding (8r) and T1 (8t) by writing my own encoding vector, along
with a few customized hacks to access the new glyphs.

I began with a copy of 8r.enc, and simply swapped out glyphs that are not in
Palatino/Palladio (dotlessj, some f-ligatures, infinity, etc.) for the ones I
needed (Aogonek, tcommaaccent, etc.).

I then made a copy of t1.etx, and swapped out the slots that are not in
Palladio (tcedilla, etc.) with slots for my new glyphs, and used this when I
called installfont in my fontinst script.
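
As a rough sketch of those two steps (untested as written here; the names
myt1.etx and up9r8t and the family name up9 are placeholders, and the slot
numbers are the ones that appear in the \cb hack below), the changed entries
in the copy of t1.etx look something like this:

\nextslot{26}
\setslot{scommaaccent}
   \comment{takes over the slot of dotlessj, which Palladio lacks}
\endsetslot

\nextslot{181}
\setslot{tcommaaccent}
   \comment{takes over the slot of tcedilla}
\endsetslot

and the fontinst script then names that file where it would otherwise say t1:

\input fontinst.sty
\needsfontinstversion{1.926}
\installfonts
  \installfamily{T1}{up9}{}
  % reencode the raw font with the custom vector (up9.etx / up9.enc)
  \transformfont{up9r8r}{\reencodefont{up9}{\fromafm{up9r8a}}}
  % build the TeX font from the modified copy of t1.etx
  \installfont{up9r8t}{up9r8r,newlatin}{myt1}{T1}{up9}{m}{n}{}
\endinstallfonts
\bye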

Lars suggested that I write another encoding in order to declare text
composite commands to access any new commaaccent glyphs. That looked a bit
intimidating to me, and so I resorted to a quick hack.

I'm somewhat embarrassed to say that I simply dropped this command into my
style file:

\def\cb#1{%comma below
\ifx#1S\char30\else\ifx#1T\char149\else%
\ifx#1s\char26\else\ifx#1t\char181\else#1%
\fi\fi\fi\fi}
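
For comparison, the text-composite route Lars suggested might look roughly
like the following in the same style file. This is an untested sketch: L01 is
the private encoding name from Lars's example, the slot numbers are the ones
hard-wired above, and matching font definition (.fd) files for L01 are
assumed to exist.

\DeclareFontEncoding{L01}{}{}
% Fallback: with no composite declared, \cb just typesets its argument.
\DeclareTextCommand{\cb}{L01}[1]{#1}
% Composites: \cb{S} etc. select a single slot in the font.
\DeclareTextComposite{\cb}{L01}{S}{30}
\DeclareTextComposite{\cb}{L01}{T}{149}
\DeclareTextComposite{\cb}{L01}{s}{26}
\DeclareTextComposite{\cb}{L01}{t}{181}

Declaring the same composites for T1 itself would change the behaviour of
every other T1 font in the document, which is the argument for a separate
encoding name.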

I did notice that Palladio has a spare "commaaccent" glyph that could surely
be brought in to draw all the other commaaccents on the fly. I'll leave that
as an exercise for another day.
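
For the record, a fontinst metrics file for that exercise might start out
like this. It is only a sketch: the file name commabelow.mtx is hypothetical,
ncommaaccent is just one example, and it assumes the commaaccent glyph is
already drawn at the right height below the baseline. Note that this yields a
virtual-font composite, not a single outline.

% commabelow.mtx (hypothetical name)
\relax
\metrics
% \setglyph does nothing if the font already supplies the glyph
\setglyph{ncommaaccent}
   \glyph{n}{1000}
   \push
      % step back so the accent is centred under the base letter
      \movert{\neg{\half{\add{\width{n}}{\width{commaaccent}}}}}
      \glyph{commaaccent}{1000}
   \pop
\endsetglyph
\endmetrics

It would be listed in the \installfont file list after the AFM-derived
metrics, so that real glyphs take precedence over the constructed ones.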

The upshot seems to be that, yes, each font really should have its own
encoding vector. Although the above methodology is exceedingly clumsy, I've
certainly profited from the effort.

Two more things I should mention. In all this I'm not actually setting
extended text in Turkish, Romanian or Polish; merely the occasional word in
each language.

As for the slots, it's trivial to cross-reference the AFM, 8r.enc and t1.etx
and see where room can be made.
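
Given the \lccode point Lars raises below, one more check seems worth making
when choosing slots: what lowercase code LaTeX assigns to each candidate. A
minimal sketch, using the slot numbers from the \cb hack above (substitute
whichever slots are under consideration); the values are written to the log:

\documentclass{article}
\usepackage[T1]{fontenc}
\begin{document}
% a result of 0 means TeX does not treat the slot as a letter for hyphenation
\typeout{slot 149: lccode \the\lccode149}
\typeout{slot 181: lccode \the\lccode181}
\typeout{slot 30: lccode \the\lccode30}
\typeout{slot 26: lccode \the\lccode26}
\end{document}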

One caveat: when I ran my test file through, I noticed that I had to have my
new encoding vector (.enc) in the same directory to get it to work.

The relevant line in the map file looks like this:

up9r8r URWPalladioL-Roma <up9.enc <up9r8a.pfb " fontinst-autoenc-up9 ReEncodeFont "

Should I somewhere give this a more proper name or indicate where it can
exist in my TeX directory?

—Christopher

2010/1/18 Lars Hellström <Lars.Hellstrom at residenset.net>

> Christopher Adams skrev:
>
>  I've finally pieced together that the solution is to write my own encoding
>> vector.
>>
>> Is there a good tutorial about how to do this?
>>
>> As a test I simply made a copy of 8r.enc and 8r.etx. In the .enc file I
>> replaced one of the /.notdef's with /eogonek, and then added a \setslot
>> for
>> this glyph in the .etx file. Then I changed \reencodefont{8r} to refer to
>> my
>> renamed, modified copy, ran the files through TeX and updated all the map
>> files.
>>
>> Sure enough, the correct *eogonek* comes out in the PDF (it's even
>> searchable). Perfect!
>>
>> I finally understand the reason that *eogonek* wasn't working is that it
>> isn't defined by 8r.
>>
>> I still have some questions.
>>
>> 1) What's the best way to do what I need?
>>
>
> Depends on your priorities. If they are to get what you need done with as
> little effort as possible, then the "substitute a few glyphs" approach you
> later seemed to decide on is probably optimal. If they instead are best
> quality -- to produce the full font support for Eastern/Central European
> languages using the latin script -- then it might rather be to embark on the
> project of producing a T1A (or whatever) encoding, that can be used for
> those languages where T1 isn't quite sufficient.
>
>
>  2) If I need a glyph that is not in T1, like *iogonek*, in addition to the
>> above, do I then simply have to modify t1.etx and add a slot for that
>> glyph?
>>
>
> As far as fontinst is concerned, yes, and if using the new multislot.sty
> package (which I'm not sure I've ever announced properly), you can write an
> ETX which *only* sets the slots that are different from T1, and then goes
> \inputetx{t1}, thus significantly reducing the amount of editing you need to
> do.
>
> When we get to LaTeX however, things are quite different.
>
>
>  3) If I need a glyph that doesn't even have a TeX command, such as
>> scommaaccent, what do I have to do to get access to it in my latex
>> document?
>> I know someone has written a \cb{} command that fakes comma below. Can I
>> write my own \cb{} command? What would it look like? I really only need
>> access to scommaaccent and tcommaaccent.
>>
>
> This is a matter of the encoding-specific "text commands", which are
> described in the "Encodings" section of fntguide.tex (part of LaTeX distro)
> and Section 7.11 of The LaTeX Companion (2nd edition). You would probably
> want to do something like
>
>  \DeclareTextComposite{\k}{L01}{s}{...}
>
> if following Hilmar's suggestion of using \k also for the comma accent; the
> above would (if ... is replaced by the right slot number) make \k{s} typeset
> an scommaaccent glyph.
>
> What you can't get around is however the need to declare a new LaTeX
> encoding for your fonts, since you'll need some text commands to end up
> doing something different than they would under T1. In the example above I
> used L01 (local encoding 1) for this new encoding, as you'll (at least
> initially) probably only be using it privately, but in the long run I think
> a lot of people would benefit if someone stepped up to designing an encoding
> which fully covers the Eastern European languages T1 only provides a partial
> solution for. Hilmar?
>
>
>  Fortunately, because I'm doing book work, I can make room in the encoding
>> vector by discarding some math symbols and analphabetics.
>>
>
> Are you talking about 8r here? There isn't very much nonalphabetic material
> in T1 to get rid of. Getting something new in pretty much necessitates
> losing some of the precomposed letters...
>
>
>
> Christopher Adams skrev:
>
> > Hi Hilmar,
> >
> > Thank you again for your thoughtful replies.
> >
> > 2010/1/17 Hilmar Schlegel
> [who likes to keep his e-mail secret, wrote]
>
> >
>
>> Well, another question would be: What do you want to do?
>>> Seriously, all depends on what you really need (and not
>>> necessarily what you  believe now you might need).
>>>
>> >
> > At this point I can define my goal quite narrowly: I need an
> > *a/eogonek* rendered as single outlines, as well as t/scommaaccent
> > (without losing scedilla). I see now this should be quite easy to achieve.
> >
>
>> 2) If I need a glyph that is not in T1, like /iogonek/, in
>>>>
>>>> addition to the above, do I then simply have to modify t1.etx and
>>>> add a slot for  that glyph?
>>>>
>>> >>
>
>> That is a practicable solution for some special glyphs. Since T1
>>>
>>> is  full, those are replacements of other glyphs you don't need.
>>>
>> >
>
>> Since my target glyphs are few in number, this would appear to be
>> the best solution.
> >
>
>> If I write a \setslot{scommaaccent} in my modified t1.etx file, I'm
>> still not confident I could typeset it. My low-level TeX is not so
>> good, and so I'm trying to puzzle out the code you sent. At some
>> point in the code  do you
>> have to refer to the glyph by its char number? Or is there another
>> way to refer to whichever glyph occupies the "scommaaccent" slot by
>> name?
>>
> >
>
>>>> 3) If I need a glyph that doesn't even have a TeX command, such
>>>> as scommaaccent, what do I have to do to get access to it in my
>>>> latex document?
>>>> I know someone has written a \cb{} command that fakes comma
>>>> below. Can I write my own \cb{} command? What would it look
>>>> like? I really only need access to scommaaccent and
>>>> tcommaaccent.
>>>>
>>> >>
> >>
>
>> Well, you can simply use the ogonek accent command: \k{a} and
>>>
>>> \k{e} are defined as standard Latex commands. Ogoneks are applied
>>> to vowels (like iogonek) and for consonants you simply provide
>>> commaaccents (g, k, n, r, s,
>>> t). See the sample for using \k as ogonek/commaaccent for
>>> vowels/consonants.
>>>
>> >>
> >> Here the commaaccent is defined as a specific char code.
> >> You can place it as it suits your needs.
> >
> >
>
>> Am I correct that this code is drawing the glyphs by composing a base
>> letter with a diacritic? But it first checks to see if there is
>>  a real eogonek? I apologize that I'm having trouble following the code.
>> Would it be feasible to have a command that gets \cb{s} and \cb{t} to print
>> the right glyphs, assuming that the .etx has slots for s/tcommaaccent?
>>
> >
>
>> s/tcommaaccent are a trivial correction for T1: replace scedilla (which
>>> will mean there is no longer Turkish language support) and
>>> use a T1' with a commaaccent and redefine (extend) the \k{} command
>>> as in the sample code.
>>>
>> >>
> >
>
>> It seems I can't quite go this route, as I can't lose Turkish as a
>>
> > result.
> >
> >
> >>
>
>> Fortunately, because I'm doing book work, I can make room in the
>>>>
>>>> encoding vector by discarding some math symbols and analphabetics.
>>>>
>>>
>>> Actually you are not free to assign the glyphs in Latex to
>>> arbitrary character codes! All depends on the used hyphenation
>>> patterns for  the needed
>>> languages (Eastern European, Baltic, old Prussian &c). Therefore
>>> T1 can be a reasonable template to start with since there is a high
>>> probability of language support for T1 encoded fonts.
>>>
>> >
>
>> Ok, this is good to know. I'm very curious to know where this "you
>> are not free to assign..." mandate is written.
>>
>
> I've tried to document it in fontinst/doc/encspecs/encspecs.tex. The main
> problem is that TeX's \lccode table establishes a correspondence between
> upper and lower case letters, and if you don't respect these correspondences
> then the hyphenation can get screwed up.
>
>
> >> From the pure typographic view I'd strongly suggest to consider
> >> i) the font Aldus [...]
> >> ii)  Palatino and Aldus Nova OT from Linotype. [...]
> >>
> >
>
>> These are both considered choices. I am very aware of both Aldus and
>> Palatino Nova, and under different circumstances your suggestions
>> would be entirely correct. But in this case I have already
>> determined that the regular and bold weights of Palatino are
>> precisely what I need (as a display face I'm using Sistina). Plus,
>> the fact that I require embedding permissions means that I need
>> Adobe. The fact that I have to tweak some glyphs means I need
>> Palladio.
>>
> >
> > If circumstances were different I probably wouldn't be using any of
>
>> these. It just so happens that the work I am setting deals with
>> literary and cultural history centered in Frankfurt from the 1950s onward.
>> It is as if the choice has been made for me!
>>
> >
> > In point of fact this has all given me a much finer appreciation of
> > Zapf's accomplishment.
> >
> >
> > >> iii) for the less strict user of existing implementations it is also
> > >> worth having a look at Palatino Linotype, which is a TT
> >> (TrueType) font with a larger glyph set. In case you can take the
> >> trouble and make use of large TT fonts for Latex (extract the
> >> metrics, map with fontinst and embed them via Distiller) this can
> >> enrich the glyph set usable by Latex considerably.
> >
> > This is something I'm certainly interested in learning how to do.
> > Considering my modest skill-set I'm determined to stick with Type 1
> > fonts for now. Are there any good resources to learn more about this?
> >
> > My reasons for using fontinst stem from my reliance on the microtype
> > package in pdflatex. But now I'm really hooked on the flexibility it
> > offers for mashing up fonts. I'm really interested in learning how
> > OpenType is going to change the fontinst landscape. Any pointers?
>
> Well, since you ask about the future... But bear in mind that this is
> probably double-danger material, and very much work in progress:
>
>  http://abel.math.umu.se/~lars/fontinst/bigbase.dvi
>
> Again, be warned: I had the spurt of fontinst development resulting in this
> back in September--October, but from November and on I've been doing other
> stuff. There's no reason to believe these mechanisms will be in anything
> resembling working condition within the foreseeable future.
>
> Lars Hellström
>
>