[XeTeX] Please help me with \XeTeXinterchartoks

Jonathan Kew jonathan at jfkew.plus.com
Tue Dec 2 12:37:30 CET 2008


On 2 Dec 2008, at 15:08, VAFA KHALIGHI wrote:

> Dear Ross and Ulrike, thanks for your wonderful and helpful answers.
> It seems that I need to wait to see what Jonathan says.
>
> I just want to know whether what I want to achieve is possible or not.
> So, yes or no?

Your approach didn't really make sense to me. You don't want to
give every letter a different class; if you're doing that, you might as
well make each one a distinct macro. The point of *classes* is that
you can handle a whole collection of similarly-behaved characters as a
unit.

Note that there are some default class assignments already preloaded
in the format files; see unicode-letters.tex. (We really should have a
\newclass allocator; it just hasn't gotten done yet.) So you may want
to avoid clashing with those.
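
One way to see which class a character currently has is to write it to
the log. This is only a sketch: it assumes \XeTeXcharclass can be read
back with \the like other integer parameters, and the two sample
characters are arbitrary choices, not taken from your document.

% print the current class of Latin "A" and of U+4E00 to the log
\message{class of A: \the\XeTeXcharclass`\A}
\message{class of U+4E00: \the\XeTeXcharclass"4E00}

Any class number that shows up here for characters you care about is
one to stay away from when picking your own.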

Anyhow, here is a small example:

\documentclass{article}
\usepackage{bidi}
\usepackage[cm-default]{fontspec}
\newfontfamily{\ar}[Script=Arabic]{Scheherazade}
% classes 1-3 are used in unicode-letters.tex, so we'll put the Latin
% letters in 4
\newcount\n
\n=`\A \loop \XeTeXcharclass \n=4 \ifnum\n<`\Z \advance\n by 1 \repeat
\n=`\a \loop \XeTeXcharclass \n=4 \ifnum\n<`\z \advance\n by 1 \repeat
% when we encounter class 4 (after class 0, or after a boundary,
% which is class 255), we'll do \startlatin
\XeTeXinterchartoks 0 4 {\startlatin}
\XeTeXinterchartoks 255 4 {\startlatin}
% and when we encounter class 0 (after a boundary or after class 4),
% we'll do \finishlatin
\XeTeXinterchartoks 255 0 {\finishlatin}
\XeTeXinterchartoks 4 0 {\finishlatin}
\newif\iflatin
\newcommand{\startlatin}{\iflatin\else\bgroup\beginL\rm\latintrue\fi}
\newcommand{\finishlatin}{\iflatin\unskip\endL\egroup{ }\fi}
\XeTeXinterchartokenstate=1
\begin{document}
\setRL\ar
السلام عليكم
hello world
وعليكم السلام
\end{document}

However, I suspect you're not really going to be able to do this on a
large scale, because it will be too difficult to handle things like  
punctuation and spacing at direction changes. In unidirectional text,  
it may not matter whether the "language switch" happens before or  
after the space (or punctuation mark), but with bidi it does matter. I  
think in the end you're still going to need markup if you want to  
reliably mix LR and RL scripts.
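
For comparison, here is roughly what the markup route could look like.
It is a sketch, not a tested recommendation: \latin is a hypothetical
helper name, built only from the \beginL, \endL and \rm already used
above, and the author decides explicitly which spaces and punctuation
go inside the left-to-right run.

\documentclass{article}
\usepackage{bidi}
\usepackage[cm-default]{fontspec}
\newfontfamily{\ar}[Script=Arabic]{Scheherazade}
% \latin is a hypothetical helper, not something bidi or fontspec provides;
% it sets its argument left-to-right in the roman font, inside a group
\newcommand{\latin}[1]{{\beginL\rm #1\endL}}
\begin{document}
\setRL\ar
السلام عليكم \latin{hello world} وعليكم السلام
\end{document}

Here the spaces on either side of \latin{...} stay in the
right-to-left text, which is exactly the kind of decision the
automatic approach cannot make for you.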

JK


