[tex-k] makeindex breaks up index group on a capitalized entry

geolsoft at mail.ru geolsoft at mail.ru
Sat Aug 21 16:05:50 CEST 2004


Thanks to the excellent comments by Thomas Esser and Olaf
Weber, my problem (see thread "[tex-k] \write writes out
TeX's `^^' quartets instead of 8-bit chars") is now fully
resolved.

Now I have another problem, but I am not sure this is the
right place to post it.  If it is not, could someone please
recommend a more appropriate mailing list?

I am posting this message to tex-k at tug.org,
tex-eplain at tug.org, and texhax at tug.org (people who are
subscribed to several of these, please don't be too hard on
me---I don't know where else to seek help).

I've already tried posting to the CyrTeX-ru mailing list,
which deals with Russian/Cyrillic TeXing, but nothing came
of it (mailing list <CyrTeX-ru.vsu.ru>, message at
https://info.vsu.ru/Lists/CyrTeX-ru/Message/2199.html, but
it's all in Russian; below I give a translation into
English, with more details).



The problem I have is related to generating Cyrillic indices
with makeindex(1) by Pehong Chen, version 2.14 [02-Oct-2002]
(kpathsea + Thai support).

I use tetex-bin 2.0.2-10, Web2C 7.4.5 on Debian GNU/Linux
2004-04-02 Sid.

I use the plain TeX format and read in the eplain macro
package, but the problem seems to be local to makeindex.

I have these for my locale settings:

$ echo $LC_CTYPE; echo $LC_COLLATE
LC_COLLATE=ru_RU.KOI8-R
LC_CTYPE=ru_RU.KOI8-R


where ru_RU.KOI8-R is the locale for Russian with KOI8-R
encoding (my TeX sources and intermediate files are all in
this encoding).

The source file is:

--------------------start testidx.tex--------------------
\input eplain

\idx{Meksika}
\idx{martyshka}
\idx{motylyok}
\idx{moloko}
\idx{masshtab}
\idx{marazm}

\readindexfile{i}

\bye
--------------------end testidx.tex--------------------

(I have transliterated the index entries into Latin; in the
actual testidx.tex they are the equivalent Cyrillic
characters in the 8-bit KOI8-R encoding.)

Now I run

$ tex -translate-file=cp8bit testidx
This is TeX, Version 3.14159 (Web2C 7.4.5)
 (/usr/share/texmf/web2c/cp8bit.tcx)
(./testidx.tex (/usr/share/texmf/tex/eplain/eplain.tex)
No index file testidx.ind. [1] )
(see the transcript file for additional information)
Output written on testidx.dvi (1 page, 200 bytes).
Transcript written on testidx.log.


and get testidx.idx:

--------------------start testidx.idx--------------------
\indexentry{Meksika}{1}
\indexentry{martyshka}{1}
\indexentry{motylyok}{1}
\indexentry{moloko}{1}
\indexentry{masshtab}{1}
\indexentry{marazm}{1}
--------------------end testidx.idx--------------------

(again, I transliterated Cyrillic chars into Latin).  So far
so good.  Now, when I run

$ makeindex -L testidx.idx
This is makeindex, version 2.14 [02-Oct-2002] (kpathsea +
Thai support).
Scanning input file testidx.idx....done (6 entries accepted,
0 rejected).
Sorting entries....done (17 comparisons).
Generating output file testidx.ind....done (16 lines
written, 0 warnings).
Output written in testidx.ind.
Transcript written in testidx.ilg.


I get the following in my testidx.ind:

--------------------start testidx.ind--------------------
\begin{theindex}

  \item marazm, 1
  \item martyshka, 1
  \item masshtab, 1

  \indexspace

  \item Meksika, 1

  \indexspace

  \item moloko, 1
  \item motylyok, 1

\end{theindex}
--------------------end testidx.ind--------------------

(again, I transliterated Cyrillic chars into Latin).  As you
can see, the index entries are sorted correctly, and even
the capitalized `Meksika' landed in the right place, but
makeindex broke up the group for the letter `m' at the
capitalized word.  makeindex does not do this for index
entries in Latin chars.

What puzzles me most is that makeindex sorted the entries
correctly, treating Cyrillic `M' as the capital of Cyrillic
`m'.  This indicates that makeindex applied my locale
settings (it does correct _sorting_ even on more complex
inputs, with entries starting with many different letters),
yet it still _broke up the group for the letter `m'_ at the
capitalized word.
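
To make the symptom concrete, here is a minimal Python sketch
of one way it could arise.  It is only my guess at the
mechanism and NOT makeindex's actual code: the names
split_into_groups and ascii_only_fold are hypothetical, and I
assume that sorting is case-aware for Cyrillic while the
"group letter" is lowercased by a rule that only knows ASCII.

```python
# Hypothetical illustration only; this is NOT makeindex's actual code.
# Assumption: sorting is case-aware for Cyrillic, but the letter used to
# decide where a new index group starts is folded by an ASCII-only rule.

ENTRIES = ["Мексика", "мартышка", "мотылёк", "молоко", "масштаб", "маразм"]


def ascii_only_fold(ch):
    """Lowercase ASCII letters only; leave every other character untouched."""
    return ch.lower() if ch.isascii() else ch


def split_into_groups(entries, fold):
    """Sort case-insensitively, then start a new group whenever the
    folded first letter differs from that of the previous entry."""
    groups = []
    previous = None
    for entry in sorted(entries, key=str.lower):
        letter = fold(entry[0])
        if letter != previous:
            groups.append([])  # corresponds to an \indexspace in the .ind file
            previous = letter
        groups[-1].append(entry)
    return groups


if __name__ == "__main__":
    for name, fold in [("ASCII-only fold", ascii_only_fold),
                       ("Cyrillic-aware fold", str.lower)]:
        print(name)
        for group in split_into_groups(ENTRIES, fold):
            print("  ", ", ".join(group))
```

With the ASCII-only fold the six entries fall into three
groups, split exactly as in my testidx.ind; with a fold that
knows Cyrillic case they stay in one group.  Whether makeindex
really works this way I do not know.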

I must have skipped a step or two, or maybe I misunderstand
how makeindex works with locale settings?  Or maybe I need a
special style file for makeindex to stop breaking Cyrillic
index groups on capitalized entries?

Thanks in advance for any clues on this.


Best regards,
Oleg Katsitadze


P.S.  The contents of testidx.ilg and testidx.log, in case
they help:

--------------------start testidx.ilg--------------------
This is makeindex, version 2.14 [02-Oct-2002] (kpathsea + Thai support).
Scanning input file testidx.idx....done (6 entries accepted, 0 rejected).
Sorting entries....done (17 comparisons).
Generating output file testidx.ind....done (16 lines written, 0 warnings).
Output written in testidx.ind.
Transcript written in testidx.ilg.
--------------------end testidx.ilg--------------------


--------------------start testidx.log--------------------
This is TeX, Version 3.14159 (Web2C 7.4.5) (format=tex 2004.8.19)  21 AUG 2004 15:53
 (/usr/share/texmf/web2c/cp8bit.tcx)
**testidx
(./testidx.tex (/usr/share/texmf/tex/eplain/eplain.tex)
Missing character: There is no М in font cmr10!
Missing character: There is no е in font cmr10!
Missing character: There is no к in font cmr10!
Missing character: There is no с in font cmr10!
Missing character: There is no и in font cmr10!
Missing character: There is no к in font cmr10!
Missing character: There is no а in font cmr10!
\openout2 = `testidx.idx'.

Missing character: There is no м in font cmr10!
Missing character: There is no а in font cmr10!
Missing character: There is no р in font cmr10!
Missing character: There is no т in font cmr10!
Missing character: There is no ы in font cmr10!
Missing character: There is no ш in font cmr10!
Missing character: There is no к in font cmr10!
Missing character: There is no а in font cmr10!
Missing character: There is no м in font cmr10!
Missing character: There is no о in font cmr10!
Missing character: There is no т in font cmr10!
Missing character: There is no ы in font cmr10!
Missing character: There is no л in font cmr10!
Missing character: There is no е in font cmr10!
Missing character: There is no к in font cmr10!
Missing character: There is no м in font cmr10!
Missing character: There is no о in font cmr10!
Missing character: There is no л in font cmr10!
Missing character: There is no о in font cmr10!
Missing character: There is no к in font cmr10!
Missing character: There is no о in font cmr10!
Missing character: There is no м in font cmr10!
Missing character: There is no а in font cmr10!
Missing character: There is no с in font cmr10!
Missing character: There is no ш in font cmr10!
Missing character: There is no т in font cmr10!
Missing character: There is no а in font cmr10!
Missing character: There is no б in font cmr10!
Missing character: There is no м in font cmr10!
Missing character: There is no а in font cmr10!
Missing character: There is no р in font cmr10!
Missing character: There is no а in font cmr10!
Missing character: There is no з in font cmr10!
Missing character: There is no м in font cmr10!

No index file testidx.ind. [1] )
Output written on testidx.dvi (1 page, 200 bytes).

--------------------end testidx.log--------------------


