Introduction
This is the central place for hyphenation patterns in TeX. They are all bundled in a single package called hyph-utf8.
For pattern authors
If you are a pattern author and wish to update your patterns, please contact the hyph-utf8 package maintainers through the tex-hyphen mailing list.
Documentation
Algorithm
Papers
- Documentation (needs improvement)
- Documentation for the Lua(La)TeX part of the package
- TUG 2008 paper
- Barbara Beeton: Hyphenation Exception Log, TUGboat, Volume 31 (2010), No. 3
Slides
- The TeX hyphenation applied to HTML (Mathias Nater, BachoTeX 2010)
Related Packages
- Babel (pdf; 1659 kb) – for pdfTeX and other 8-bit TeX engines
- Polyglossia (pdf; 169 kb) – for XeTeX
Links
Collaboration
- Mozilla
- FOP XML Hyphenation Patterns (Simon Pepping)
- TeX-Hyphen-Pattern (Perl implementation on CPAN) (Roland van Ipenburg)
- Hyphenator.js (Client-side implementation of hyphenation in HTML documents) (Mathias Nater)
OpenOffice.org
- Test TeX/OpenOffice hyphenation algorithm online (based on hunspell)
- Using TeX hyphenation patterns in OpenOffice.org (explains how to properly convert TeX patterns into OpenOffice-friendly form)
- Hunspell (library)
- Open Office language extensions
- text-hyphen (rubyforge); (source code repository)
- TeX Hyphenator in Java
- Knuth-Liang Hyphenation for Haskell
- Indic languages:
- An article about soft hyphen
- TeX line breaking algorithm in JavaScript
Other external links
Languages
The package contains patterns for the following languages:
(If patterns for any other language exist but are missing below, please let us know.)
name | synonyms | code (link to file) | (left,right)-hyphenmin | 8-bit encoding | licence | authors |
---|---|---|---|---|---|---|
Afrikaans | afrikaans | af | (1,2) | EC | LPPL | Tilla Fick, Chris Swanepoel |
Ancientgreek | ancientgreek | grc | (1,1) | | LPPL | Dimitrios Filippou |
Ancientgreek | ibycus | grc-x-ibycus | (2,2) | | | |
Arabic | arabic | ar | (,) | | | |
Armenian | armenian | hy | (1,2) | | LGPL | Sahak Petrosyan |
Assamese | assamese | as | (1,1) | | MIT | Santhosh Thottingal |
Basque | basque | eu | (2,2) | EC | other-free | Juan M. Aguirregabiria |
Bengali | bengali | bn | (1,1) | | | |
Bulgarian | bulgarian | bg | (2,2) | T2A | LPPL | Georgi Boshnakov |
Catalan | catalan | ca | (2,2) | EC | LPPL | Gonçal Badenes |
Chinese | pinyin | zh-latn-pinyin | (1,1) | EC | GPL | Werner Lemberg |
Coptic | coptic | cop | (1,1) | | LPPL | Claudio Beccari |
Croatian | croatian | hr | (2,2) | EC | LPPL | Igor Marinović |
Czech | czech | cs | (2,3) | EC | GPL | Pavel Ševeček |
Danish | danish | da | (2,2) | EC | LPPL | Frank Jensen |
Dutch | dutch | nl | (2,2) | EC | LPPL | Piet Tutelaers |
English | english, usenglish, USenglish, american | (default) | (2,3) | ASCII | | Donald Knuth |
English | ukenglish, british, UKenglish | en-gb | (2,3) | ASCII | other-free | Dominik Wujastyk, Graham Toal |
English | usenglishmax | en-us | (2,3) | ASCII | other-free | Donald E. Knuth, Gerard D.C. Kuiken |
Esperanto | esperanto | eo | (2,2) | IL3 | LPPL | Sergei B. Pokrovsky |
Estonian | estonian | et | (2,3) | EC | LPPL\|MIT | Enn Saar |
Ethiopic | ethiopic, amharic, geez | mul-ethi | (1,1) | | public-ask | Arthur Reutenauer, Mojca Miklavec |
Farsi | farsi, persian | fa | (,) | | | |
Finnish | finnish | fi | (2,2) | EC | other-free | Kauko Saarinen, Fred Karlsson |
French | french, patois, francais | fr | (2,3) | EC | other-free | René Bastian, Daniel Flipo, Bernard Gaulle |
Friulan | friulan | fur | (2,2) | EC | LPPL | Claudio Beccari |
Galician | galician | gl | (2,2) | EC | LPPL | Javier Múgica |
Georgian | georgian | ka | (1,2) | T8M | LPPL | Levan Shoshiashvili |
German | german | de-1901 | (2,2) | EC | LPPL | Werner Lemberg |
German | ngerman | de-1996 | (2,2) | EC | LPPL | Werner Lemberg |
German | swissgerman | de-ch-1901 | (2,2) | EC | LPPL | Werner Lemberg |
Greek | monogreek | el-monoton | (1,1) | | LPPL | Dimitrios Filippou |
Greek | greek, polygreek | el-polyton | (1,1) | | LPPL | Dimitrios Filippou |
Gujarati | gujarati | gu | (1,1) | | | |
Hindi | hindi | hi | (1,1) | | | |
Hungarian | hungarian | hu | (2,2) | EC | MPL 1.1/GPL 2.0/LGPL 2.1 | Bence Nagy |
Icelandic | icelandic | is | (2,2) | EC | LPPL | Jorgen Pind, Marteinn Sverrisson, Kristinn Gylfason |
Indonesian | indonesian | id | (2,2) | ASCII | GPL | Jörg Knappen, Terry Mart |
Interlingua | interlingua | ia | (2,2) | ASCII | LPPL | Peter Kleiweg |
Irish | irish | ga | (2,3) | EC | GPL | Kevin P. Scannell |
Italian | italian | it | (2,2) | ASCII | LPPL | Claudio Beccari |
Kannada | kannada | kn | (1,1) | | | |
Kurmanji | kurmanji | kmr | (2,2) | EC | LPPL | Jörg Knappen, Medeni Shemdê |
Latin | latin | la | (2,2) | EC | LPPL | Claudio Beccari |
Latin | classiclatin | la-x-classic | (2,2) | ASCII | LPPL | Claudio Beccari |
Latvian | latvian | lv | (2,2) | L7X | LGPL | Janis Vilims |
Lithuanian | lithuanian | lt | (2,2) | L7X | | Vytas Statulevičius, Yannis Haralambous, Sigitas Tolušis |
Malayalam | malayalam | ml | (1,1) | | | |
Marathi | marathi | mr | (1,1) | | | |
Mongolian | mongolian | mn-cyrl | (2,2) | T2A | LPPL | Dorjgotov Batmunkh |
Mongolian | mongolianlmc | mn-cyrl-x-lmc | (2,2) | LMC | | |
Norwegian | bokmal, norwegian, norsk | nb | (2,2) | EC | free | Rune Kleveland, Ole Michael Selberg |
Norwegian | nynorsk | nn | (2,2) | EC | | |
Oriya | oriya | or | (1,1) | | | |
Panjabi | panjabi | pa | (1,1) | | | |
Piedmontese | piedmontese | pms | (2,2) | ASCII | LPPL | Claudio Beccari |
Polish | polish | pl | (2,2) | QX | public | Hanna Kołodziejska, Bogusław Jackowski, Marek Ryćko |
Portuguese | portuguese, portuges | pt | (2,3) | EC | BSD-3 | Pedro J. de Rezende, J. Joao Dias Almeida |
Romanian | romanian | ro | (2,2) | EC | | Adrian Rezus |
Romansh | romansh | rm | (2,2) | ASCII | LPPL | Claudio Beccari |
Russian | russian | ru | (2,2) | T2A | LPPL | Alexander I. Lebedev, Werner Lemberg, Vladimir Volovich |
Sanskrit | sanskrit | sa | (1,3) | | free | Yves Codet |
Serbian | serbian | sr-latn | (2,2) | EC | LPPL | Dejan Muhamedagić |
Serbian | serbianc | sh-cyrl | (2,2) | T2A | | |
Slovak | slovak | sk | (2,3) | EC | GPL | Jana Chlebíková |
Slovenian | slovenian, slovene | sl | (2,2) | EC | LPPL | Matjaž Vrečko |
Spanish | spanish, espanol | es | (2,2) | EC | MIT/X11 | Javier Bezos |
Swedish | swedish | sv | (2,2) | EC | LPPL | Jan Michael Rynning |
Tamil | tamil | ta | (1,1) | | | |
Telugu | telugu | te | (1,1) | | | |
Thai | thai | th | (2,3) | LTH | LPPL | Theppitak Karoonboonyanan |
Turkish | turkish | tr | (2,2) | EC | LPPL | P. A. MacKay, H. Turgut Uyar, S. Ekin Kocabas, Mojca Miklavec |
Turkmen | turkmen | tk | (2,2) | EC | public | Nazar Annagurban |
Ukrainian | ukrainian | uk | (2,2) | T2A | LPPL | Maksym Polyakov, Werner Lemberg, Vladimir Volovich |
Uppersorbian | uppersorbian | hsb | (2,2) | EC | LPPL | Eduard Werner |
Welsh | welsh | cy | (2,3) | EC | LPPL | Yannis Haralambous |
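
The columns of the table above correspond to settings that are visible from within a document. The following is only a minimal sketch, assuming pdfLaTeX with Babel and a TeX installation whose format already has the `ngerman` and `british` patterns loaded. The choice of these two languages and the sample word are illustrative assumptions, and Babel's option names need not coincide with the synonyms column for every language. The (left,right)-hyphenmin column corresponds to TeX's `\lefthyphenmin` and `\righthyphenmin` parameters.

```latex
% Minimal sketch, not a definitive setup: pdfLaTeX + Babel.
\documentclass{article}
\usepackage[T1]{fontenc}            % T1 (EC) fonts, cf. "EC" in the 8-bit encoding column
\usepackage[ngerman,british]{babel} % language names; the last option is the main language

\begin{document}

% Print the hyphenmin values in force for the current language.
British: (\the\lefthyphenmin,\the\righthyphenmin)

\selectlanguage{ngerman}
German: (\the\lefthyphenmin,\the\righthyphenmin)

% Write the hyphenation points TeX finds for this word to the log file.
\showhyphens{Silbentrennung}

% Local override of the minimums (hypothetical stricter values), limited to this group.
{\lefthyphenmin=3 \righthyphenmin=3 \showhyphens{Silbentrennung}}

\end{document}
```

The output of `\showhyphens` appears in the log rather than in the typeset document, so the effect of different hyphenmin values can be compared there.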