[tex-live] encTeX, 8bit encoding, ...

Petr Olsak petr at olsak.net
Wed Mar 24 09:44:12 CET 2004



On Tue, 23 Mar 2004, Olaf Weber wrote:

> At the risk of restarting one of the recurrent flamewars: For the work
> on web2c 7.5.3 I've (finally) merged encTeX in my source tree (for

Thank you.

It would be useful to make some encTeX tables for various encodings.
Examples are in the files: utf8-t1.tex, utf8-csf.tex,
1250-t1.tex etc. in ftp://math.feld.cvut.cz/pub/olsak/enctex/enctex.tar.gz

The files mentioned above would be the basic part of the encTeX
accessories.

> various reasons I didn't simply apply Petr's patch) and now I'm pretty
> close to the point where final changes can be made or left for the
> next round.
>
> In the upcoming web2c 7.5.3 the input encoding issue is handled
> as follows:
>
> - If a TCX file is specified it is read early in the startup sequence.
>
> - An encTeX-enabled format contains xord/xchr/xprn data.  If a TCX
>   file was specified, this data will be ignored (with a message) in
>   favour of the already-read contents of the TCX file.
>
> - On output using one of the \write operators, the xprn array is used
>   to determine whether the ^^-notation is to be used.

OK. The encTeX <--> TCX co-operation would then follow the encTeX
documentation exactly:

-----------------------------
|
| 1.3.
|
| If your web2c distribution implements enc\TeX{} then you can
| initialize it by the "-enc" option in command line.
| You have to use this option during ini\TeX{} because enc\TeX{} stores
| its primitives and its data to the format file. When the format is
| used, the enc\TeX{} is initialized from format file automatically and
| you need not use the "-enc" option again. If you are using a format
| without enc\TeX{} initialized in it and you write "-enc" option then
| the warning is printed and this option is ignored.
|
| The TCX tables ("-translate-file" option) are working with the same
| xord and xchr vector as enc\TeX{} in web2c distribution. This implies
| the following little conflicts: If enc\TeX{} is used together with TCX
| table then TCX table may re-write the initial values of "\xordcode",
| "\xchrcode" and "\xprncode". These initial values are documented in
| section~2.2. If these values are stored in format by enc\TeX{} and TCX
| table is used together with such format then the values from format
| can be re-written by TCX table too. On the other hand, you can use the
| "\xordcode", "\xchrcode" and "\xprncode" primitives for reading or
| saving of these values after TCX table initialization without
| problems.
|
--------------------------------

> - On output to the log or terminal, the locale-dependent C isprint()
>   function is used.

It would be better to disable the locale dependence if an encTeX format
is used. The encTeX documentation says:

--------------------------------
|
| 2.1.
|
| All text outputs from \TeX{} to terminal, log file and files managed
| by "\write" primitive are filtered by xchr vector and by
| ``printability'' feature of the character. ...
|
---------------------------------

This means there is no difference between log+terminal output and
\write files.
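
As a rough illustration of that filtering (a sketch only, not the web2c
code): each character is either written through xchr or replaced by the
usual ^^-notation, depending on its xprn entry. The names
caret_notation, filter_output, XCHR and XPRN are hypothetical, and the
default tables below (identity xchr, only visible ASCII printable) are
an assumption.

```python
def caret_notation(code: int) -> bytes:
    """Return TeX's ^^ form for a character code in 0..255."""
    if code < 64:
        return b"^^" + bytes([code + 64])   # e.g. 13 -> ^^M
    if code < 128:
        return b"^^" + bytes([code - 64])   # e.g. 127 -> ^^?
    return b"^^" + format(code, "02x").encode("ascii")  # e.g. 233 -> ^^e9


def filter_output(codes: bytes, xchr: list, xprn: list) -> bytes:
    """Filter output bytes through xchr and the printability table xprn.

    The same routine is applied to every output stream here, so the log,
    the terminal and \\write files cannot disagree.
    """
    out = bytearray()
    for c in codes:
        if xprn[c]:
            out.append(xchr[c])       # printable: emit the external code
        else:
            out += caret_notation(c)  # not printable: emit ^^-notation
    return bytes(out)


# Assumed default tables: identity xchr, only visible ASCII printable.
XCHR = list(range(256))
XPRN = [32 <= c <= 126 for c in range(256)]
```

Because the decision depends only on the tables, the result is the same
on every terminal, which is not true of a locale-dependent test.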

Moreover, there is the encTeX primitive "\mubytelog", which controls the
one-byte to multi-byte conversion (to a UTF-8 terminal, for example)
of log and terminal output. The isprint() function can conflict
with this encTeX setting.

> Variations that could be implemented instead:
>
> - Disable the locale-dependent code, and always use xprn.

I vote to disable it. For example, as a user of TeX Live I was very
confused about why the CSTRIP test fails only on some terminals (the
same tex binary was used on all of them). I would like to spare other
users this confusion.

> - Always store xord/xchr/xprn in the format, not just for encTeX.

I think this is not needed. This feature would change the behaviour of
TCX tables without encTeX, and that could confuse TCX users.

> - A "genuine" -8bit switch that installs the identity xord/xchr
>   mapping and makes everything printable.
>
> The last is effectively a shorthand for --translate-file=8bit.tcx
> where 8bit.tcx implements this mapping (cp8bit.tcx does less
> than its name promises).

Maybe.
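
For reference, the tables such a switch would install can be sketched
as follows. This is only an illustration of "identity xord/xchr mapping,
everything printable"; the function names are hypothetical and it says
nothing about how web2c or 8bit.tcx would actually set them up.

```python
def identity_8bit_tables():
    """Build identity xord/xchr tables with every code marked printable."""
    xord = list(range(256))   # external byte -> internal code, unchanged
    xchr = list(range(256))   # internal code -> external byte, unchanged
    xprn = [True] * 256       # no code is ever written in ^^-notation
    return xord, xchr, xprn


def round_trip(data: bytes, xord: list, xchr: list) -> bytes:
    """Map bytes to internal codes on input and back again on output."""
    return bytes(xchr[xord[b]] for b in data)
```

With these tables every input byte reaches the output unchanged.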

Best regards

Petr Olsak


