different hyphenation between pdftex and luatex

David Carlisle d.p.carlisle at gmail.com
Sat Sep 11 11:03:25 CEST 2021


(I think this is more texhax at tug.org than tex-live list but as it's here...)

The hyphenation in luatex differs in many ways from that of classic
tex, even when they use the same \patterns data, so it's not that
surprising that you get different results for some constructs.

Here the main issue is that, apart from some legacy compatibility,
luatex has moved away from the (frankly weird) reliance on lowercase
codes for determining which characters take part in hyphenation. So
luatex sees the whole construct as a single word and looks it up using
the patterns, whereas pdftex sees the * (which has lccode 0 by default)
as a word boundary and doesn't start the next word until after it sees
some white space, so it skips most of this.

If you set the lccode of * to itself then pdftex gives the same result as luatex:

$ pdftex '\lccode`\*=`\* \showhyphens{nutzer*innengerechte}\bye' | egrep 'This|nutzer'
This is pdfTeX, Version 3.141592653-2.6-1.40.23 (TeX Live 2022/dev) (preloaded format=pdftex)
[] \tenrm nutzer*in-nen-gerechte
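
The same setting can also go in a file, which avoids the shell quoting.
This is only a minimal sketch: the file name hyph-star.tex is made up for
illustration, and it assumes the plain pdftex format used in the runs in
this thread. Other formats may handle hyphenation codes differently.

```tex
% hyph-star.tex (hypothetical file name); run with: pdftex hyph-star.tex
% Assumes the plain pdftex format, as in the command-line runs in this thread.

% Give * a nonzero lccode (itself) so that pdftex treats it as part of
% the word when looking for hyphenation points.  What \lowercase
% produces is unchanged, since * still maps to *.
\lccode`\*=`\*

% Report the hyphenation points found for the word on the terminal
% and in the log file.
\showhyphens{nutzer*innengerechte}

\bye
```

In a real document the assignment would normally go in the preamble or
macro file, so that it is in force wherever such words are typeset.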


David



On Fri, 10 Sept 2021 at 20:05, Harald Koenig via tex-live
<tex-live at tug.org> wrote:
>
> Hi (Lua)TeX experts,
>
> Marei Peischl noticed that using "*" for gender forms in the German language
> gives different hyphenation when using luatex and pdftex -- pdftex does not hyphenate, luatex does:
>
>         $ pdftex '\hsize=5mm \showhyphens{nutzer*innengerechte}\bye' | egrep 'This|nutzer'
>         This is pdfTeX, Version 3.141592653-2.6-1.40.23 (TeX Live 2021) (preloaded format=pdftex)
>         [] \tenrm nutzer*innengerechte
>
>         $ luatex '\hsize=5mm \showhyphens{nutzer*innengerechte}\bye' | egrep 'This|nutzer'
>         This is LuaTeX, Version 1.13.2 (TeX Live 2021)
>         [] \tenrm nutzer*in-nen-gerechte
>
> luatex gives the same hyphenation as pdftex without that gender "*":
>
>         $ pdftex '\hsize=5mm \showhyphens{nutzerinnengerechte}\bye' | egrep 'This|nutzer'
>         This is pdfTeX, Version 3.141592653-2.6-1.40.23 (TeX Live 2021) (preloaded format=pdftex)
>         [] \tenrm nutzerin-nen-gerechte
>
>
> thanks for any insights;)
>
> Harald

