[XeTeX] Finding out if a font supports a particular Unicode character and using it

Chris Jones cjns1989 at gmail.com
Tue Feb 2 11:19:32 CET 2010


On Mon, Feb 01, 2010 at 08:11:25AM EST, Peter Baker wrote:

Hello Peter,

I ran some tests on my Debian squeeze system and did not see much
difference.


Here are the versions I tested with:

--------------------------------------------------------------
[04:02:18][/usr/share/fonts/truetype/unifont]$ fontforge -version
Copyright (c) 2000-2009 by George Williams.
 Executable based on sources from 23:48 GMT 23-Sep-2009.
 Library based on sources from 17:32 GMT 14-Sep-2009.
fontforge 20090923
libfontforge 20090914
[04:07:06][/usr/share/fonts/truetype/unifont]$ dpkg -l | grep fontforge
ii  fontforge          0.0.20090923-1    font editor
ii  libfontforge1      0.0.20090923-1    font editor - runtime library
ii  python-fontforge   0.0.20090923-1+b1 font editor - Python bindings
--------------------------------------------------------------

Test index build with GNU/unifont:

--------------------------------------------------------------
[03:22:05][/usr/share/fonts/truetype/unifont]$ free
             total       used       free     shared    buffers     cached
Mem:        385172     379952       5220          0        252       9760
-/+ buffers/cache:     369940      15232
Swap:      1172704     600960     571744

[03:24:29][/usr/share/fonts/truetype/unifont]$ fontswith -b -n -V .
Building or revising index /home/user/.fontswith/fontswith.index
Building index for 2 fonts
Examining font ./unifont.ttf
Examining font
Couldn't find a font file named /usr/share/fonts/truetype/unifont/
 is not in a known format (or is so badly corrupted as to be unreadable)
Could not open
Elapsed time: 430.119107962

[03:33:00][/usr/share/fonts/truetype/unifont]$ free
             total       used       free     shared    buffers     cached
Mem:        385172      54372     330800          0        336      14424
-/+ buffers/cache:      39612     345560
Swap:      1172704     159940    1012764
--------------------------------------------------------------

It takes about seven minutes to run on a PIII 650MHz with a 5400rpm HDD,
which is a bit less than I reported earlier.

Test character search:

--------------------------------------------------------------
[03:40:07][/usr/share/fonts/truetype/unifont]$ fontswith aleph

These fonts contain the glyph aleph:
DejaVuSans: /usr/share/fonts/truetype/ttf-dejavu/DejaVuSans.ttf
DejaVuSans-Bold: /usr/share/fonts/truetype/ttf-dejavu/DejaVuSans-Bold.ttf
DejaVuSans-BoldOblique: /usr/share/fonts/truetype/ttf-dejavu/DejaVuSans-BoldOblique.ttf
DejaVuSans-Oblique: /usr/share/fonts/truetype/ttf-dejavu/DejaVuSans-Oblique.ttf
DejaVuSansCondensed: /usr/share/fonts/truetype/ttf-dejavu/DejaVuSansCondensed.ttf
DejaVuSansCondensed-Bold: /usr/share/fonts/truetype/ttf-dejavu/DejaVuSansCondensed-Bold.ttf
DejaVuSansCondensed-BoldOblique: /usr/share/fonts/truetype/ttf-dejavu/DejaVuSansCondensed-BoldOblique.ttf
DejaVuSansCondensed-Oblique: /usr/share/fonts/truetype/ttf-dejavu/DejaVuSansCondensed-Oblique.ttf
FreeMono: /usr/share/fonts/truetype/freefont/FreeMono.ttf
FreeMonoBold: /usr/share/fonts/truetype/freefont/FreeMonoBold.ttf
FreeMonoBoldOblique: /usr/share/fonts/truetype/freefont/FreeMonoBoldOblique.ttf
FreeSerif: /usr/share/fonts/truetype/freefont/FreeSerif.ttf
FreeSerifBold: /usr/share/fonts/truetype/freefont/FreeSerifBold.ttf
Mathematica1Mono: /usr/share/fonts/truetype/mathematica/Mathematica1m.ttf
Mathematica1Mono-Bold: /usr/share/fonts/truetype/mathematica/Mathematica1mb.ttf

[03:46:07][/usr/share/fonts/truetype/unifont]$ free
--------------------------------------------------------------

This took about the same time and presumably gives correct results. I was
expecting the index to be used by default, but apparently it was not and the
default font tree at /usr/share/fonts was searched. 

Interestingly, fontforge did not even look at /usr/share/fonts/truetype/unifont.

Test character search with the supposedly corrupt index specified:

--------------------------------------------------------------
[04:02:00][/usr/share/fonts/truetype/unifont]$ fontswith ~/.fontswith/fontswith.index U+05D0

These fonts contain the glyph U+05D0:
unifont: ./unifont.ttf

[04:02:18][/usr/share/fonts/truetype/unifont]$
--------------------------------------------------------------

This only took about 18 seconds and, despite the messages to the effect
that the index was corrupt, the aleph glyph U+05D0 was found.
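
As a side note, the two searches above use different spellings for the
character: a glyph name ("aleph") and a code point ("U+05D0"). The sketch
below is not part of fontswith; it only uses Python's standard unicodedata
module to show which character a U+XXXX string refers to. The helper names
parse_codepoint and describe are made up for this illustration.

```python
import unicodedata


def parse_codepoint(spec):
    """Turn a string such as 'U+05D0' into the integer code point 0x05D0."""
    text = spec.strip().upper()
    if not text.startswith("U+"):
        raise ValueError("expected U+XXXX notation, got %r" % spec)
    return int(text[2:], 16)


def describe(spec):
    """Return (character, official Unicode name) for a U+XXXX string."""
    char = chr(parse_codepoint(spec))
    return char, unicodedata.name(char)


# Print only the name, which is plain ASCII and safe on any terminal.
print(describe("U+05D0")[1])  # prints: HEBREW LETTER ALEF
```

Note that the official Unicode name is spelled ALEF, so the glyph name a
font uses ("aleph" here) need not match the Unicode character name.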

Every attempt to cap memory at 360 Meg or less caused fontforge to abort:

--------------------------------------------------------------
[03:22:05][~]$ ulimit -v 360000
[04:25:17][~]$ fontforge /usr/share/fonts/truetype/
arphic/          freefont/        latex-xft-fonts/ mathematica/     ttf-dejavu/      unifont/
[04:25:17][~]$ fontforge /usr/share/fonts/truetype/unifont/unifont.ttf
Copyright (c) 2000-2009 by George Williams.
 Executable based on sources from 23:48 GMT 23-Sep-2009.
 Library based on sources from 17:32 GMT 14-Sep-2009.
Attempt to allocate memory failed.
Aborted
--------------------------------------------------------------

I would assume fontswith would fail in the same manner.
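
One way to check that assumption without lowering the limit of the
interactive shell would be to apply the cap to a single child process. The
following is only a sketch, assuming Linux and Python 3; the function names
are my own. It relies on bash's "ulimit -v" counting in units of 1024
bytes, so 360000 corresponds to roughly 352 MiB of address space.

```python
import resource
import subprocess

KIB = 1024  # bash's "ulimit -v" takes its argument in 1024-byte units


def kib_to_bytes(kib):
    """Convert a 'ulimit -v' style value (KiB) to bytes for setrlimit()."""
    return kib * KIB


def run_with_vm_limit(cmd, limit_kib):
    """Run cmd (a list of strings) with its address space capped.

    Only the soft limit of the child is lowered; the calling process
    keeps whatever limits it already had.
    """
    wanted = kib_to_bytes(limit_kib)

    def apply_limit():
        # Executed in the child, after fork() and before exec().
        _soft, hard = resource.getrlimit(resource.RLIMIT_AS)
        limit = wanted
        if hard != resource.RLIM_INFINITY:
            limit = min(limit, hard)  # the soft limit cannot exceed the hard one
        resource.setrlimit(resource.RLIMIT_AS, (limit, hard))

    return subprocess.run(cmd, preexec_fn=apply_limit)


# Hypothetical usage, mirroring the transcript above:
#   result = run_with_vm_limit(["fontswith", "aleph"], 360000)
#   print(result.returncode)
```

A nonzero or negative return code would then indicate that the command
failed or was killed under the cap, without affecting the parent shell.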

This suggests that fontforge somehow needs to allocate a large chunk of
memory in certain cases: I also tested with the arialuni.ttf font file
from the msttfcorefont package, which is larger than GNU/unifont, and
ran into the same problem. 

It would appear that for large font files, fontforge needs at least half
a gig of memory. There may be good reasons for that but it does strike
me as a bit odd that you would need that much to process somewhere in
the neighborhood of 20 Meg of input data.

In any case, it looks like there is not much that can be done where
fontswith is concerned. It may be worth checking with the fontforge
developers whether anything could be done to make it work on legacy
hardware.

Lastly, I would like to mention that I install fonts that are not part
of my distribution in /usr/local/share/fonts.

Is it possible to build a global index that includes all fonts on the
system?
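
To make the question concrete: I do not know whether fontswith accepts more
than one directory, so the sketch below says nothing about its options. It
only shows, in plain Python, how the font files under both trees could be
collected into one list that an indexer might then walk. The function name
and the list of extensions are assumptions of mine.

```python
import os

# Extensions treated as font files here; this list is an assumption.
FONT_EXTENSIONS = (".ttf", ".otf", ".ttc")


def find_font_files(roots, extensions=FONT_EXTENSIONS):
    """Return the paths of all font files found under the given roots.

    Roots that do not exist are skipped silently, because os.walk()
    yields nothing for them.
    """
    found = []
    for root in roots:
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in sorted(filenames):
                if name.lower().endswith(extensions):
                    found.append(os.path.join(dirpath, name))
    return found


# Hypothetical usage with the two trees mentioned in this message:
#   fonts = find_font_files(["/usr/share/fonts", "/usr/local/share/fonts"])
```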

Thanks,

CJ

