[tex-live] encTeX, 8bit encoding, ...

Olaf Weber olaf at infovore.xs4all.nl
Wed Mar 24 17:41:32 CET 2004


Petr Olsak writes:
> On Tue, 23 Mar 2004, Olaf Weber wrote:

>> At the risk of restarting one of the recurrent flamewars: For the work
>> on web2c 7.5.3 I've (finally) merged encTeX in my source tree (for

> Thank you.

> It would be useful to make some encTeX tables for various encodings.
> Examples are in the files: utf8-t1.tex, utf8-csf.tex,
> 1250-t1.tex etc. in ftp://math.feld.cvut.cz/pub/olsak/enctex/enctex.tar.gz

> The files mentioned above would form the base of the encTeX accessory files.

That part of the packaging I'm happy to delegate to Sebastian and
Thomas.

>> various reasons I didn't simply apply Petr's patch) and now I'm pretty
>> close to the point where final changes can be made or left for the
>> next round.

>> The upcoming web2c 7.5.3 handles the input encoding issue
>> as follows:

>> - If a TCX file is specified it is read early in the startup sequence.

>> - An encTeX-enabled format contains xord/xchr/xprn data.  If a TCX
>>   file was specified, this data will be ignored (with a message) in
>>   favour of the already-read contents of the TCX file.
>> 
>> - On output using one of the \write operators, the xprn array is used
>>   to determine whether the ^^-notation is to be used.

> OK. The encTeX <--> TCX co-operation would then follow the encTeX
> documentation exactly:

[...]

The present code does effectively work like that.  (The actual
implementation is that the TCX is read first, and if a TCX was read
then the xord/xchr/xprn vectors from the format are dropped).
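
As a rough illustration of that precedence, here is a minimal sketch in
Python rather than anything taken from the web2c sources. The names
`Tables` and `select_tables` and the message text are hypothetical,
made up for this example.

```python
from typing import List, NamedTuple, Optional, Tuple


class Tables(NamedTuple):
    """Hypothetical container for the three 256-entry vectors."""
    xord: List[int]   # external byte -> internal code
    xchr: List[int]   # internal code -> external byte
    xprn: List[bool]  # True if the code may be written without ^^-notation


def select_tables(tcx: Optional[Tables],
                  fmt: Optional[Tables]) -> Tuple[Optional[Tables], Optional[str]]:
    """Pick the active tables: a TCX file, if one was read, wins over the format."""
    if tcx is not None:
        # The TCX was read early in startup; vectors from the format are dropped.
        note = "format xord/xchr/xprn ignored, TCX in use" if fmt is not None else None
        return tcx, note
    # No TCX given: use whatever the (encTeX-enabled) format carries, if anything.
    return fmt, None
```

The returned note stands in for the message printed when the format's
data is ignored.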

>> - On output to the log or terminal, the locale-dependent C isprint()
>>   function is used.

> It would be better to disable locale-dependence if an encTeX format is
> used. The encTeX documentation says:

[...]

> This means there is no difference between log+terminal and \write files.

At present the encTeX documentation is wrong as far as web2c and
log/term writes are concerned.
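
To make the difference concrete, here is a minimal Python sketch (not
web2c code). The names `caret_form`, `emit`, `write_rule`, and the
stand-in predicate `locale_isprint` are hypothetical, and the table
contents are assumptions chosen for the example. It assumes the
^^-notation rule of TeX's own print routine: ^^ plus a shifted
character below 128, and ^^ plus two lowercase hex digits from 128 up.

```python
from typing import Callable, List


def caret_form(c: int) -> str:
    """^^-notation for byte c, following the rule used by TeX's print routine."""
    if c < 64:
        return "^^" + chr(c + 64)
    if c < 128:
        return "^^" + chr(c - 64)
    return "^^%02x" % c


def emit(c: int, printable: Callable[[int], bool]) -> str:
    """Return byte c as-is if `printable` accepts it, else in ^^-notation."""
    return chr(c) if printable(c) else caret_form(c)


# \write path: the decision comes from the xprn vector
# (assumed here to mark only printable ASCII).
xprn: List[bool] = [32 <= c < 127 for c in range(256)]


def write_rule(c: int) -> bool:
    return xprn[c]


# log/terminal path: a stand-in for the locale-dependent C isprint();
# this pretend locale also treats 0xA0..0xFF as printable.
def locale_isprint(c: int) -> bool:
    return 32 <= c < 127 or c >= 0xA0
```

With these assumed tables, `emit(0xE9, write_rule)` gives `^^e9` while
`emit(0xE9, locale_isprint)` passes the byte through unchanged, so the
same byte can look different in a \write file and on the terminal.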

> Moreover, there is the encTeX primitive "\mubytelog", which controls the
> one-byte to multi-byte conversions (to a utf8 terminal, for example)
> of log and terminal output. The isprint() function can conflict
> with this encTeX setting.

Now that's an interesting case.

>> Variations that could be implemented instead:

>> - Disable the locale-dependent code, and always use xprn.

> I vote to disable it. For example, I was very confused (as a user of
> TeXlive) about why the CSTRIP test fails only on some terminals (the same
> tex binary was used on all terminals). I wish to spare other users this
> confusion.

Vote noted.  I'd like to collect more votes before making the final
decision.

>> - Always store xord/xchr/xprn in the format, not just for encTeX.

> I think this is not needed. This feature changes the behaviour of TCX
> tables without encTeX, and this can confuse TCX users.

It would allow (say) Hans Hagen to put an 8bit-passthrough xprn vector
in his context formats.

I agree that this kind of thing does provide people with plenty of rope.

>> - A "genuine" -8bit switch that installs the identity xord/xchr
>>   mapping and makes everything printable.

>> The last is effectively a shorthand for --translate-file=8bit.tcx
>> where 8bit.tcx implements this mapping (cp8bit.tcx does less
>> than its name promises).

Note that one reason I dislike this (somewhat) is that it is yet
another switch where the interaction with TCX and encTeX has to be
determined and spelled out.
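
For reference, the mapping such a switch would install is small. A
minimal Python sketch, with the function name `eight_bit_tables`
invented for this example:

```python
from typing import List, Tuple


def eight_bit_tables() -> Tuple[List[int], List[int], List[bool]]:
    """Identity xord/xchr plus an xprn vector that marks every byte printable."""
    xord = list(range(256))   # external byte -> same internal code
    xchr = list(range(256))   # internal code -> same external byte
    xprn = [True] * 256       # nothing is forced into ^^-notation
    return xord, xchr, xprn
```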

> Maybe.

> Best regards

> Petr Olsak

-- 
Olaf Weber

               (This space left blank for technical reasons.)


