[tex-live] clash between babel (french), hyperref and \cite on keys with a colon character

Reinhard Kotucha reinhard.kotucha at web.de
Tue Sep 23 23:48:54 CEST 2014

On 2014-09-22 at 22:07:42 +0200, Zdenek Wagner wrote:

 > 2014-09-22 20:36 GMT+02:00 Robin Fairbairns <Robin.Fairbairns at cl.cam.ac.uk>:
 > > i've never done anything non-trivial with luatex.  getting the
 > > encodings wrong, like that, sounds like something missing
 > > somewhere.  i would poke around with heiko's documentation (or
 > > possibly his huge collection of small packages in
 > > macros/latex/contrib/oberdiek on ctan).
 > >
 > The problem lies in Adobe. The PDF documentation says that the
 > bookmarks have AdobeStandardEncoding. In newer versions (today's
 > versions fall into this category) you can override this default by
 > starting a unicode string with a BOM. The only thing which is needed
 > is to supply "unicode" as a hyperref option and it will create unicode
 > bookmarks with BOM.

Bookmarks can be either in PDFDocEncoding or in UTF-16.
PDFDocEncoding was the preferred encoding in the Early Middle Ages. 

If you want to use Unicode for bookmarks and/or the pdfinfo dictionary,
the strings must be in UTF-16 and must begin with a BOM.

Neither pdfTeX nor LuaTeX supports UTF-16 natively.  But a string is a
sequence of bytes.  Hence you can create any string, regardless of its
encoding, if you are able to create a byte with any value between 0
and 255.

What hyperref does when invoked with the "unicode" option is to write
each byte of the UTF-16 string as an octal escape sequence.

The letter "X" is "\000\130" in big-endian UTF-16, for example.  These
octal escape sequences are supported by the PDF standard.
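
To illustrate the byte-by-byte idea outside of TeX, here is a minimal
Python sketch.  It is not hyperref's code; the function name
pdf_octal_utf16 is made up for this example, and it assumes the
big-endian byte order with a leading BOM (bytes 254 and 255).

```python
import codecs


def pdf_octal_utf16(text: str) -> str:
    """Encode text as BOM + UTF-16BE and write every byte as \\ooo.

    Hypothetical helper for illustration only; it mimics the kind of
    output described above, not hyperref's actual implementation.
    """
    data = codecs.BOM_UTF16_BE + text.encode("utf-16-be")
    # Each byte becomes a backslash followed by three octal digits,
    # so the result is pure 7-bit ASCII.
    return "".join(f"\\{byte:03o}" for byte in data)


if __name__ == "__main__":
    print(pdf_octal_utf16("X"))   # \376\377\000\130
```

The first two escapes are the BOM, the last two are the "X" from the
example above.  Since the result contains only ASCII characters, an
8-bit engine can write it without knowing anything about Unicode.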

This means that you don't need an engine which is Unicode-aware.
8-bit engines like Knuth's TeX or pdfTeX are sufficient.


Reinhard Kotucha                            Phone: +49-511-3373112
Marschnerstr. 25
D-30167 Hannover                    mailto:reinhard.kotucha at web.de
