[metapost] Glyph names in METATYPE1

Boguslaw Jackowski B_Jackowski at GUST.org.pl
Sat Mar 2 22:01:18 CET 2013

Hello, Everybody,

>   (c) there is a (known) trouble with the syntax of numbers: a digit
>       followed by the letter 'e'... -- this is easy to circumvent
>       (but still annoying; we have a few formulas containing
>       <number>eps or <number>epsilon expressions in our mpost programs)

> Not to derail your discussion,

You're welcome!

As we wrote some time ago, we'd be more than happy if you could
derail a few more discussions personally at a BachoTeX meeting...

> In METATYPE1, the name specified as parameter to beginglyph() is scanned
> into tokens and used as a suffix, even if specified as a string, and what
> gets written into the Postscript file is those tokens translated back into
> a string rather than the originally-specified string.  The most
> significant consequence is that hexadecimal numbers don't survive the
> round trip through the scanner.


Needless to say, we're aware of this problem. Perhaps it was our
somewhat unfortunate decision to use both suffixes and strings to
denote glyphs, but in some situations suffixes are more convenient
(more handy in notation) while in some -- strings are better
(more universal).
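
The round trip described in the quoted paragraph can be imitated with a deliberately crude Python model. It is only a sketch under stated assumptions, not the actual MetaPost scanner: `round_trip` is a hypothetical helper, it looks only at letters, underscores, digits and periods, and it ignores decimal fractions, very large numbers and the digit-followed-by-'e' issue mentioned above.

```python
import re

def round_trip(name: str) -> str:
    """Crude model of 'scan as a suffix, then turn back into a string'.

    Assumptions (not taken from the MetaPost sources): a run of digits
    becomes one numeric token, a run of letters/underscores becomes one
    name token, periods only separate tokens, and a period is printed
    back only between two adjacent name tokens.
    """
    tokens = re.findall(r"[0-9]+|[A-Za-z_]+", name)
    out = []
    prev_is_name = False
    for tok in tokens:
        is_name = not tok[0].isdigit()
        if is_name:
            if prev_is_name:
                out.append(".")       # two names in a row need a separator
            out.append(tok)
        else:
            out.append(str(int(tok)))  # 0400 is read as the number 400
        prev_is_name = is_name
    return "".join(out)

print(round_trip("uni0400"))   # uni400
print(round_trip("mu1.alt"))   # mu1alt
print(round_trip("uni0_400"))  # uni0_400
```

Under this model a single underscore between the leading zero and the remaining digits is already enough for the name to come back unchanged.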

> I think it might work to insert an underscore before every character,
> turning "uni4E00" into "_u_n_i_4_E_0_0".

We came to nearly the same conclusion, although slightly different
in details.
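
For comparison, the quoted suggestion (an underscore before every character) amounts to something like the following Python sketch; `underscore_every_char` is a name made up for this illustration.

```python
def underscore_every_char(name: str) -> str:
    """Put an underscore in front of every character of a glyph name."""
    return "".join("_" + ch for ch in name)

print(underscore_every_char("uni4E00"))  # _u_n_i_4_E_0_0
```

With this scheme no two digits are ever adjacent, so each digit is a separate token; the cost is a suffix twice as long as the original name.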

Metatype1 provides a macro `assign_name' that makes life a little bit
easier, although one may wonder if it is the best solution.

Assume that you have to declare (in Metatype1 lingo -- "to introduce")
a glyph named `uni0400', whatever its meaning. You may use the following:

    standard_introduce("uni0_400"); assign_name _uni0_400("uni0400");

i.e., for the purposes of processing within Metapost, the name
(the suffix) uni0_400 can be used, while the name uni0400 will be
exported to a Type1 font.

In our fonts, we prefer to use meaningful glyph names (suffixes)
rather than names stuffed with underscores.

Actually, the converter from Type1 to Metatype1, pf2mt1, generates
statements of this form, although it inserts more underscores.
The algorithm for "sanitizing" the names (written, of course,
in AWK) is fairly simple:

   function sanitize_name(n,  n0,r) {
     if (n in SANITIZED) return SANITIZED[n]
     else {
       n0=n # remember the original name; it is the key into SANITIZED
       gsub(/[\-\+]/,"!", n)
       gsub(/;/,"?", n) # weird: XP Arial (2001) contains something like this!
       # insert "_" between every pair of adjacent digits
       while (match(n,/[0-9][0-9]/)) {
         r=r substr(n,1,RSTART) "_"; n=substr(n,RSTART+RLENGTH-1)
       }
       n=r n
       # append "~" until the sanitized name is unique
       while (n in XANITIZED) n=n "~"
       SANITIZED[n0]=n; XANITIZED[n]
       return n
     }
   }

where SANITIZED and XANITIZED are auxiliary global tables (vocabularies).
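
To see what the procedure produces, here is a Python transcription of the AWK function above. It is a sketch for experimenting, not part of pf2mt1. It assumes that the original name is remembered before it is modified, uses a dict and a set in place of the two AWK tables, and spells the second table XANITIZED as in the listing.

```python
import re

SANITIZED = {}     # original glyph name -> sanitized name
XANITIZED = set()  # sanitized names that are already taken

def sanitize_name(n: str) -> str:
    if n in SANITIZED:
        return SANITIZED[n]
    n0 = n                       # key under which the result is stored
    n = re.sub(r"[-+]", "!", n)
    n = n.replace(";", "?")
    r = ""
    while True:
        m = re.search(r"[0-9][0-9]", n)
        if m is None:
            break
        # keep the text up to and including the first digit, add "_",
        # and continue from the second digit of the pair
        r += n[: m.start() + 1] + "_"
        n = n[m.start() + 1 :]
    n = r + n
    while n in XANITIZED:        # resolve clashes by appending "~"
        n += "~"
    SANITIZED[n0] = n
    XANITIZED.add(n)
    return n

print(sanitize_name("uni0400"))  # uni0_4_0_0
print(sanitize_name("a-b"))      # a!b
print(sanitize_name("a!b"))      # a!b~
print(sanitize_name("mu1.alt"))  # mu1.alt
```

The first result shows the extra underscores compared with the hand-written `uni0_400'; the last one shows that a name containing a period passes through unchanged.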

As you rightly noted, periods should actually be treated like digits.
We fell into this trap quite recently, because of names like `mu1.alt'
(str mu1.alt yields, obviously, "mu1alt"). We will try to fix this as
soon as possible (but not sooner).

So, the main difference between your approach and ours is that we
use fewer underscores (even fewer could be used, but at the price
of complicating the sanitizing procedure).

Is our comment to your comment comprehensible or at least sufficient? :-)

Cheers -- Jacko & Piotr

  Bogus\l{}aw Jackowski: B_Jackowski at GUST.ORG.PL
  Hofstadter's Law: It always takes longer than you expect, even
                    when you take into account Hofstadter's Law.
