
More on 5+3+3 font name abbreviations



In my posting of Tue, 30 Dec 1997 17:53:42 -0700 (MST) to the
tex-fonts list, I reported collision statistics for a 5+3+3 font name
abbreviation algorithm which, according to Rebecca and Rowland
<rebecca@astrid.u-net.com>, is in use by Adobe Systems.

On two sets of font files, the collision rates were 28% (of 1036 font
names) and 38% (of 97 font names).

Rebecca and Rowland subsequently reported that the algorithm should
consider all words in a font name, so it should be 5+3+3+...., or 53ff
for short.

With this updated awk program, augmented to list the colliding names,
and to trim trailing blanks, which fooled the previous version in some
cases:

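/^FullName / {
	sub(/ +$/, "", $0)		# trim trailing blanks (re-splits the fields)
	name53ff = substr($2, 1, 5)	# 5 characters of the first word
	for (k = 3; k <= NF; ++k)	# 3 characters of each later word
	    name53ff = name53ff substr($k, 1, 3)
	full = "[" substr($0, 10) "]"	# name text after "FullName "
	if (name53ff in used) {
	    collisions++
	    # remember every full name that maps to this abbreviation
	    FullName[name53ff] = FullName[name53ff] "\n\t\t\t\t" full
	} else
	    FullName[name53ff] = full
	used[name53ff]++
}

END {
	# collisions + 0 prints 0 rather than an empty string when there are none
	print collisions + 0, "collisions,", FNR, "font names:\n"
	for (name53ff in used)
	    if (used[name53ff] > 1)
		printf("%2d %-21s\t%s\n", \
		       used[name53ff], name53ff, FullName[name53ff])
}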

and now using all the name parts, I find many fewer collisions:

2 collisions, 1023 font names:

 2 HelveObl                     [Helvetica Oblique]
                                [Helvetica.Black Oblique]
 2 Lucia                        [Lucia]
                                [Lucian]

6 collisions, 97 font names:

 4 CoppeGotThiBC                [Copperplate Gothic Thirty BC]
                                [Copperplate Gothic Thirty-One BC]
                                [Copperplate Gothic Thirty-Three BC]
                                [Copperplate Gothic Thirty-Two BC]
 2 BodonBol                     [Bodoni Bold]
                                [Bodoni BoldItalic]
 3 CoppeGotThiAB                [Copperplate Gothic Thirty AB]
                                [Copperplate Gothic Thirty-One AB]
                                [Copperplate Gothic Thirty-Two AB]

Clearly, if `-' and `.' in font names are replaced by blanks before
applying the 53ff algorithm, then only two collisions will remain, one
in each collection: Lucia and BodonBol.
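
The effect of that replacement can be checked with a small sketch,
written here in Python rather than awk.  The function name abbrev53ff
and its seps parameter are my own inventions for illustration, not part
of the program above; the font names are a few of those in the
collision listings.

```python
def abbrev53ff(full_name, seps=""):
    """5 chars of the first word, then 3 chars of each later word.

    Characters listed in seps are replaced by blanks first (an assumption
    of this sketch, following the suggestion in the text).
    """
    for ch in seps:
        full_name = full_name.replace(ch, " ")
    words = full_name.split()
    if not words:
        return ""
    return words[0][:5] + "".join(w[:3] for w in words[1:])


names = [
    "Helvetica Oblique",
    "Helvetica.Black Oblique",
    "Copperplate Gothic Thirty BC",
    "Copperplate Gothic Thirty-One BC",
    "Lucia",
    "Lucian",
    "Bodoni Bold",
    "Bodoni BoldItalic",
]

# Count collisions without and with `-' and `.' treated as blanks.
for seps in ("", "-."):
    abbrevs = [abbrev53ff(n, seps) for n in names]
    collisions = len(abbrevs) - len(set(abbrevs))
    print(repr(seps), collisions)
```

On these eight names the collision count drops from 4 to 2, with only
Lucia and BodonBol still colliding.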

By adding one more statement in the END action,

  for (name53ff in FullName)
	if (used[name53ff] == 1)
	    printf("%2d %-21s\t%s\n", \
		   length(name53ff), name53ff, FullName[name53ff]) | \
	      "sort -k1,1nr -k2,2 | head -10"

I also found the longest 10 abbreviations in each collection:

1023 fonts:
24 MinioItaDisSmaCap&OldFig     [Minion Italic Display Small Caps & Oldstyle Figures]
24 MinioRegDisSmaCap&OldFig     [Minion Regular Display Small Caps & Oldstyle Figures]
24 MinioSemItaSmaCap&OldFig     [Minion Semibold Italic Small Caps & Oldstyle Figures]
23 AvantGarGotITCP.SBooObl      [Avant Garde Gothic ITC P.S. Book  Oblique]
23 AvantGarGotITCP.SDemIta      [Avant Garde Gothic ITC P.S. Demi Italic]
21 MinioItaSmaCap&OldFig        [Minion Italic Small Caps & Oldstyle Figures]
21 MinioRegSmaCap&OldFig        [Minion Regular Small Caps & Oldstyle Figures]
21 MinioSemSmaCap&OldFig        [Minion Semibold Small Caps & Oldstyle Figures]
20 AvantGarGotITCP.SBoo         [Avant Garde Gothic ITC P.S. Book]
20 AvantGarGotITCP.SDem         [Avant Garde Gothic ITC P.S. Demi]

97 fonts:
18 ITCAvaGarGotBooObl           [ITC Avant Garde Gothic Book Oblique]
18 ITCAvaGarGotDemObl           [ITC Avant Garde Gothic Demi Oblique]
17 12BauBodBolIta003            [12 Bauer Bodoni** Bold Italic   00392]
17 AdobeGarExpBolIta            [Adobe Garamond Expert Bold Italic]
17 AdobeGarExpSemIta            [Adobe Garamond Expert Semibold Italic]
17 TimesNewRomBolIta            [Times New Roman Bold Italic]
15 ITCAvaGarGotBoo              [ITC Avant Garde Gothic Book]
15 ITCAvaGarGotDem              [ITC Avant Garde Gothic Demi]
15 ITCLubGraBooObl              [ITC Lubalin Graph Book Oblique]
15 ITCZapChaMedIta              [ITC Zapf Chancery Medium Italic]

Obviously, non-alphanumeric characters should also be removed from the
full name before applying the 53ff algorithm, and it may be worthwhile
to further split words at transitions between lowercase and uppercase
letters, but it is noteworthy that the longest abbreviation (24
characters) is still within the 31-character Macintosh file name limit.
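
A sketch of that cleanup, again in Python rather than awk, might look
like the following.  The names clean_name, abbrev53ff, and
MAC_NAME_LIMIT are hypothetical, and replacing punctuation by a blank
(rather than deleting it outright) is only one possible reading of
"removed"; the pattern also handles ASCII letters and digits only.

```python
import re

MAC_NAME_LIMIT = 31  # Macintosh file name limit cited in the text


def clean_name(full_name):
    """Blank out non-alphanumerics, then split words at lower->upper changes."""
    # Assumption: each run of non-alphanumeric ASCII becomes one blank.
    name = re.sub(r"[^A-Za-z0-9]+", " ", full_name)
    # Insert a blank wherever a lowercase letter is followed by an uppercase one.
    name = re.sub(r"([a-z])([A-Z])", r"\1 \2", name)
    return " ".join(name.split())


def abbrev53ff(full_name):
    """5 chars of the first word, then 3 chars of each later word."""
    words = clean_name(full_name).split()
    if not words:
        return ""
    return words[0][:5] + "".join(w[:3] for w in words[1:])


for full in ("Bodoni BoldItalic",
             "Minion Italic Display Small Caps & Oldstyle Figures",
             "12 Bauer Bodoni** Bold Italic   00392"):
    short = abbrev53ff(full)
    # Print the length, the abbreviation, and whether it fits the limit.
    print(len(short), short, len(short) <= MAC_NAME_LIMIT)
```

With the case split, Bodoni BoldItalic becomes BodonBolIta and no longer
collides with Bodoni Bold.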

Finally, I took a list of 7638 unique font names (see
http://www.math.utah.edu/~beebe/fonts.html) from multiple vendors,
inserted spaces at lowercase->uppercase changes (the result is not
necessarily the same as the FullName value from the .afm files, but I
don't have the latter to do the experiment), and ran the 53ff
algorithm.  This time, it reports:

37 collisions, 7638 font names:

 2 CasloBTBolIta        	[Caslon BT Bold Italic]
				[Caslon224ITCby BT Bold Italic]
 2 FormaReg             	[Formal436BT Regular]
				[Formata Regular]
 6 CheltOldStyNo2       	[Cheltenham Old Style No2T]
				[Cheltenham Old Style No2TIn1]
				[Cheltenham Old Style No2TOu1]
				[Cheltenham Old Style No2TRe1]
				[Cheltenham Old Style No2TRo1]
				[Cheltenham Old Style No2TSh1]
 2 PerpeExp             	[Perpetua Exp]
				[Perpetua Expert]
 2 DfDivPla             	[Df Diversions Plain]
				[Df Diversities Plain]
 2 CasloBTBol           	[Caslon BT Bold]
				[Caslon224ITCby BT Bold]
 6 GroteNo9             	[Grotesque No9T]
				[Grotesque No9TIn1]
				[Grotesque No9TOu1]
				[Grotesque No9TRe1]
				[Grotesque No9TRo1]
				[Grotesque No9TSh1]
 2 GoudyTexMTLomCap     	[Goudy Text MT Lombardic Capitals]
				[Goudy Text MT Lombardic Caps]
 3 MICR1BTReg           	[MICR10by BT Regular]
				[MICR12by BT Regular]
				[MICR13by BT Regular]
 2 WitteFraMT           	[Wittenberger Frakt MT]
				[Wittenberger Fraktur MT]
 2 ScripPla             	[Scriptease Plain]
				[Scriptek Plain]
 2 CentuBolCon          	[Century Bold Condensed]
				[Century725BT Bold Condensed]
 2 CentuBol             	[Century Bold]
				[Century725BT Bold]
 6 AlterGotNo1          	[Alternate Gothic No1D]
				[Alternate Gothic No1DIn1]
				[Alternate Gothic No1DOu1]
				[Alternate Gothic No1DRe1]
				[Alternate Gothic No1DRo1]
				[Alternate Gothic No1DSh1]
 6 AlterGotNo2          	[Alternate Gothic No2D]
				[Alternate Gothic No2DIn1]
				[Alternate Gothic No2DOu1]
				[Alternate Gothic No2DRe1]
				[Alternate Gothic No2DRo1]
				[Alternate Gothic No2DSh1]
 6 AlterGotNo3          	[Alternate Gothic No3D]
				[Alternate Gothic No3DIn1]
				[Alternate Gothic No3DOu1]
				[Alternate Gothic No3DRe1]
				[Alternate Gothic No3DRo1]
				[Alternate Gothic No3DSh1]

The longest 10 abbreviations in this collection:

24 BodonAntTDemBolConItaIn1	[Bodoni Antiqua T Demi Bold Condensed Italic In1]
24 BodonAntTDemBolConItaOu1	[Bodoni Antiqua T Demi Bold Condensed Italic Ou1]
24 BodonAntTDemBolConItaRe1	[Bodoni Antiqua T Demi Bold Condensed Italic Re1]
24 BodonAntTDemBolConItaRo1	[Bodoni Antiqua T Demi Bold Condensed Italic Ro1]
24 BodonAntTDemBolConItaSh1	[Bodoni Antiqua T Demi Bold Condensed Italic Sh1]
24 FrankGotItcDBooExtComIn1	[Franklin Gothic Itc D Book Extra Compressed In1]
24 FrankGotItcDBooExtComOu1	[Franklin Gothic Itc D Book Extra Compressed Ou1]
24 FrankGotItcDBooExtComRe1	[Franklin Gothic Itc D Book Extra Compressed Re1]
24 FrankGotItcDBooExtComRo1	[Franklin Gothic Itc D Book Extra Compressed Ro1]
24 FrankGotItcDBooExtComSh1	[Franklin Gothic Itc D Book Extra Compressed Sh1]

and the longest abbreviated name, at 24 characters, still fits within
the 31-character Macintosh file name limit.

----------------------------------------------------------------------------
- Nelson H. F. Beebe                  Tel: +1 801 581 5254                 -
- Center for Scientific Computing     FAX: +1 801 581 4148                 -
- University of Utah                  Internet e-mail: beebe@math.utah.edu -
- Department of Mathematics, 105 JWB                   beebe@acm.org       -
- 155 S 1400 E RM 233                                  beebe@ieee.org      -
- Salt Lake City, UT 84112-0090, USA  URL: http://www.math.utah.edu/~beebe - 
----------------------------------------------------------------------------