Patgen

Mojca Miklavec mojca.miklavec.lists at gmail.com
Tue May 14 23:30:09 CEST 2019


Dear Keno,

On Tue, 14 May 2019 at 23:03, Keno Wehr wrote:
>
> Patgen 2.4 (TeX Live 2019) has some problems with huge input lists.
> I have an input list with over 11.000.000 entries, which I need to
> prepare new hyphenation patterns for classical Latin, a heavily
> inflected language.
> The percentages output by patgen during the run are erroneous. The sum
> of good and missed hyphens should be 100 %, but this is not the case.
> Even negative percentages occur.
> Furthermore, patgen aborts in the seventh run with a "PATGEN capacity
> exceeded" message.
> The patgen logs of the first and the seventh run are attached.
>
> Is it possible to adapt patgen for such huge lists?

There was a nice talk two weeks ago:
    http://www.gust.org.pl/bachotex/2019-en/program#section-55
but the author might need a push to publish the slides sooner :)

The talk listed all the patgen "rewrites" that would probably get the
task done. One example is https://github.com/hyphenation/hydra, and
Ryszard wrote his own implementation in Python as well. (Further
options were listed in the talk.)

Mojca


More information about the tex-live mailing list