
Re: 8r fonts




   From: Piet Tutelaers <P.T.H.Tutelaers@urc.tue.nl>

   Checksums for CMR fonts have always worked properly. A mismatch
   is an indication that your METAFONT source differs from the one used
   for generating the TFM file. 

I wouldn't know since I don't use PK fonts.

   The current CS algorithm for TFM fonts derived from PS fonts is:

      unsigned long s1 = 0, s2 = 0;                /* unsigned long! */
      for (i=0; i<256; i++) {
         if (ev[i] == NULL) continue;
         s1 = ((s1<<1) ^ (s1>>31)) ^ width[i];     /* cyclic left shift */
         for (p=ev[i]; *p; p++)
            s2 = s2 * 3 + *p ;
      }
      return (s1<<1) ^ s2 ;

   Here ev[i] contains a pointer to the character name of the current
   encoding and width[i] its corresponding WX value in the AFM file.
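
For readers who want to try the quoted fragment, here is a minimal
self-contained sketch of it. The function name afm_checksum and the
parameter types are assumptions made for illustration, not part of any
released tool. It also assumes that 32-bit arithmetic is intended and
that the WX values have already been rounded to integers; uint32_t is
used so that the shift stays cyclic on systems where unsigned long is
wider than 32 bits.

```c
#include <stddef.h>
#include <stdint.h>

/* Sketch of the quoted checksum (hypothetical name and signature).
 * ev[i]    : character name in slot i of the encoding, or NULL if unused
 * width[i] : integer WX value from the AFM file for that character */
uint32_t afm_checksum(const char *ev[256], const int width[256])
{
    uint32_t s1 = 0, s2 = 0;

    for (int i = 0; i < 256; i++) {
        if (ev[i] == NULL)
            continue;                       /* slot not used by the encoding */
        /* rotate s1 left by one bit (32-bit wide), then mix in the width */
        s1 = ((s1 << 1) ^ (s1 >> 31)) ^ (uint32_t)width[i];
        /* fold every byte of the character name into s2 */
        for (const char *p = ev[i]; *p; p++)
            s2 = s2 * 3 + (unsigned char)*p;
    }
    return (s1 << 1) ^ s2;
}
```

Any change to a character name or to a width in a used slot feeds into
the result, which is why a different AFM file or encoding vector can
produce a mismatch.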

Thank you. I see.

(1) What happens if WX is non-integer?  It is fractional in all CM, AMS
    etc fonts (or at least, if it isn't, then you do not have good fonts!).

   So it is clear that a checksum depends upon (1) the encoding vector (2)
   the WX values from the AFM file for the characters selected through this
   encoding vector. People who use a different AFM file than the one used to
   generate the TFM file on CTAN can get `checksum mismatch' warnings. They
   have either to use the AFM files from CTAN or regenerate the TFM files with
   their AFM files.

OK, but it seems that the AFM file (apparently already reencoded as
defined by you above) itself is fully determined by the font and the
encoding.  Hence actually you only need the name of the encoding
in the checksum to achieve the same effect.  See below.

> Your question makes one thing very clear. There is no way the TeX
> community can guarantee that a new standard gets implemented by all PD
> software and vendors. In the beginning we had something like TRIP and
> TRAP tests. But we don't have anything similar for DVI drivers and
> fonts.  Perhaps we should make the above checksum algorithm part of the
> DVI driver standard? 

Please don't.  
Why freeze in methods appropriate mostly for older technology like PK fonts?

   > Which is why AFMtoTFM hides the name of the encoding vector used to
   > generate the TFM in the checksum.  Not only can you later decode this
   > to find out what the encoding vector was, but since this is passed
   > into the DVI file by TeX, it can be checked at the driver/previewer
   > level to see whether it matches what encoding *it* is set up for.
   > Solves many nightmarish debugging problems!  Just as important as
   > having TeX announce on screen that there are `missing characters'.

   I don't understand what you mean by `hiding the encoding vector'.  In
   the current implementation in TFM/VF fonts the checksum is a 32-bit
   number. So there is not much to hide.

Using a mod 40 (base 40) representation you can `hide' 6 characters
(a-z, 0-9 and some others) in 32 bits, since 40^6 = 4,096,000,000 is
less than 2^32.  My idea is not so much to make the checksum unique
(which is not important if you don't use PK files and don't use
TFMs in the driver, instead using the font itself to supply metric
information if it is needed).

*	The idea is to provide some information beyond the single bit:
*	checksums don't match - which means nothing to most people
*	(like debugging PS errors without ehandler.ps :=)

*	With the above encoding-name-hiding scheme, you can recover
*	the encoding vector and announce some meaningful message like:

ERROR: encoding mismatch, your TFM file was made for `8r'
encoding, but your printer driver is set up for `tex256' encoding.

You can't do that with a complex hashing scheme for the checksum.
And a complex hashing scheme should not be needed, since the AFM
file is fixed (it should always have the same collection of char names
and advance widths).  True, you can get only 6 characters, but then
32 bits is all TeX transfers from the TFM file into the DVI file...
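
The encoding-name-hiding scheme can be sketched roughly as follows.
This is an illustration only: the 40-symbol alphabet, the digit order,
the padding convention and the names pack_name, unpack_name and
describe_mismatch are all assumptions, and the table actually used by
AFMtoTFM may well differ.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical 40-symbol alphabet: index 0 is padding, then a-z, 0-9
 * and three extra symbols.  Not the actual AFMtoTFM table. */
static const char ALPHABET[] = " abcdefghijklmnopqrstuvwxyz0123456789-_.";

/* Pack an encoding name of up to 6 characters into 32 bits as six
 * base-40 digits.  Returns 0 on success, -1 if the name does not fit. */
int pack_name(const char *name, uint32_t *out)
{
    size_t n = strlen(name);
    uint32_t v = 0;

    if (n > 6 || strchr(name, ' ') != NULL)
        return -1;                          /* too long, or uses the pad symbol */
    for (size_t i = 0; i < 6; i++) {
        char c = (i < n) ? name[i] : ' ';   /* pad short names on the right */
        const char *pos = strchr(ALPHABET, c);
        if (pos == NULL)
            return -1;                      /* character not in the alphabet */
        v = v * 40 + (uint32_t)(pos - ALPHABET);
    }
    *out = v;                               /* at most 40^6 - 1, fits in 32 bits */
    return 0;
}

/* Recover the name from a 32-bit value; out must hold 7 bytes. */
void unpack_name(uint32_t v, char out[7])
{
    for (int i = 5; i >= 0; i--) {
        out[i] = ALPHABET[v % 40];
        v /= 40;
    }
    out[6] = '\0';
    for (int i = 5; i >= 0 && out[i] == ' '; i--)
        out[i] = '\0';                      /* strip trailing padding */
}

/* Driver-side check: returns 0 if the encodings agree, otherwise 1
 * with a readable message written to buf. */
int describe_mismatch(uint32_t tfm_checksum, const char *driver_encoding,
                      char *buf, size_t buflen)
{
    char tfm_encoding[7];

    unpack_name(tfm_checksum, tfm_encoding);
    if (strcmp(tfm_encoding, driver_encoding) == 0)
        return 0;
    snprintf(buf, buflen,
             "ERROR: encoding mismatch, your TFM file was made for `%s' "
             "encoding, but your printer driver is set up for `%s' encoding.",
             tfm_encoding, driver_encoding);
    return 1;
}
```

A driver built along these lines would call describe_mismatch with the
checksum TeX copied into the DVI file and the name of the encoding it
is configured for, and print the message only when the two differ.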

Regards, Berthold.