`limitations' of OzTeX (was: fontinst with 8y.etx)

Berthold Horn bkph@ai.mit.edu
Wed, 17 Jun 1998 16:16:47 -0400

At 12:44 PM 6/17/98 -0700, Melissa O'Neill wrote:

>8r and 8y *don't* have the same glyph complement: 8y includes `cwm',
>`nbspace' and `sfthyphen', which are not in 8r. Also, I suppose I
>shouldn't have said ``the number of glyphs that map to empty slots''
>and instead said ``the number of empty slots into which glyphs are placed''.

OK, I see.   Those are there in case a font has them, rather than having
space and hyphen each appear in two places.

>Perhaps my tools were overly zealous in their reporting here. This
>problem again stems from those three glyphs above. There seems to be
>disagreement on what the three glyphs that 8y calls `cwm', `nbspace'
>and `sfthyphen' should be called.  Adobe refers to `nbspace' as
>`nobreakspace' (e.g. in Adobe's chsttabl.pdf) but usually replaces it
>with space in encodings, `sfthyphen' is sometimes called `softhyphen'
>(e.g.  on the HP LaserJet 4000) 

I believe nbspace and sfthyphen are by far the most widely used spellings
of the names for these mythical beasts :-)

>and `cwm'(*) is sometimes called
>`compwordmark' (although I can't remember where right now).

>	  * In fact, I've never seen a `compwordmark'/`cwm' glyph in any 
>	    font, so I have no idea what it might look like, or whether it
>	    really needs to be in 8y.

Well, T1 / EC has one (char code 23).  It is also referred to as `bom'
(byte order mark) in Unicode.  Why is it in LY1?  Not because you'd want
to print this `glyph'
but because it is a convenient pseudo glyph for constructing `boundary
character' kern pairs in TeX TFMs.  Similarly, sfthyphen is useful if
you want to construct TFMs with hanging hyphens.

Notice that cwm is placed in a totally useless position (10) --- which some
software and some OS's have hard-wired to line separator.  This is
because it never appears in any actual text output.  So this is a good
use for a slot that is otherwise useless.

>Another issue is that I don't have a definitive source for the Windows
>ANSI Encoding. One source is WinLatin1Encoding present in my HP LaserJet
>4000, and another is the WinAnsiEncoding described in the PDF 1.1
>specification -- and neither has sfthyphen, nbspace or cwm.

Right, they repeat space in 160 and hyphen in 173.  And no cwm.
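
For illustration, here is a minimal Python sketch of the kind of slot
comparison being discussed.  It is not the comparison tool mentioned in
this thread.  The two tables are hypothetical partial encodings holding
only the slots named above, with space at 32 and hyphen at 45 assumed to
sit at their usual ASCII positions, and the function names are made up
for this example.

```python
# Partial encoding vectors: only the slots named in this thread.
# Slots 32 and 45 are assumed to be the usual ASCII positions.
LY1_PARTIAL = {
    10: "cwm",
    32: "space",
    45: "hyphen",
    160: "nbspace",
    173: "sfthyphen",
}
WINANSI_PARTIAL = {
    32: "space",
    45: "hyphen",
    160: "space",    # repeats space
    173: "hyphen",   # repeats hyphen
}


def repeated_glyphs(encoding):
    """Return {glyph name: [slots]} for glyphs that occupy more than one slot."""
    slots_by_name = {}
    for slot, name in sorted(encoding.items()):
        slots_by_name.setdefault(name, []).append(slot)
    return {name: slots for name, slots in slots_by_name.items() if len(slots) > 1}


def slot_conflicts(a, b):
    """Return {slot: (name in a, name in b)} where both fill the slot differently."""
    shared = sorted(a.keys() & b.keys())
    return {slot: (a[slot], b[slot]) for slot in shared if a[slot] != b[slot]}


def only_in(a, b):
    """Return the glyph names that appear in encoding a but nowhere in b."""
    return set(a.values()) - set(b.values())


if __name__ == "__main__":
    print("repeated in WinAnsi:", repeated_glyphs(WINANSI_PARTIAL))
    print("slot conflicts:", slot_conflicts(LY1_PARTIAL, WINANSI_PARTIAL))
    print("only in LY1:", sorted(only_in(LY1_PARTIAL, WINANSI_PARTIAL)))
```

On these partial tables the only differences reported are the three
glyphs under discussion: slots 160 and 173 conflict, and cwm has no
counterpart.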

>Given the amount of naming disagreement, I wonder if there is any good
>reason to include these glyphs in 8y.

See above.  Given all the complaining about the `waste' due to repetition
of other glyphs, isn't it nice that 160 and 173 are not used for space and
hyphen?

I don't think sfthyphen, nbspace and cwm are a big issue though (other than in
counting slots and conflicts).

>P.S. I've enclosed the output of my encoding comparison tool, so you can
>see the kind of data tool I was working with.

Thanks, that explains it.

Regards, Berthold.