Bug in 3.141592653-2.6-0.999995 (TeX Live 2023) with fontspec and tabularray?

Ross Alexander evilross at yahoo.co.uk
Tue Apr 2 18:14:35 CEST 2024

 Hello Karl et al,
I've moved to the texlive-20240312 source tarball and have taken another look.  I was wrong about the rounding function itself, so I started looking in xetex for a place where sum(round()) != round(sum()).
I found a place where I think this effectively happens (or can happen): the function measure_native_node (in XeTeX_ext.c), around line 2100.  Below is the code in question and my (very hacky) changes.

            uint32_t advances_fixed = 0; // Sum rounded advances

            if (totalGlyphCount > 0) {
                int i;

                glyph_info = xcalloc(totalGlyphCount, native_glyph_info_size);
                locations = (FixedPoint*)glyph_info;
                glyphIDs = (uint16_t*)(locations + totalGlyphCount);
                glyphAdvances = (Fixed*) xcalloc(totalGlyphCount, sizeof(Fixed));
                for (i = 0; i < totalGlyphCount; ++i) {
                    glyphIDs[i] = glyphs[i];
                    glyphAdvances[i] = D2Fix(advances[i]);
                    locations[i].x = D2Fix(positions[i].x);
                    locations[i].y = D2Fix(positions[i].y);
                    advances_fixed += D2Fix(advances[i]); // Sum the rounded advances to get the width
                }
                width = positions[totalGlyphCount].x;
            }
            printf("%u %d\n", advances_fixed, D2Fix(width));

            //            node_width(node) = D2Fix(width);

            node_width(node) = advances_fixed; // Use summed advances rather than rounded sum

The printf does show that differences between rounding the final position (positions[totalGlyphCount].x) and sum(glyphAdvances) do occur.  With this change, your example (see below), run with xetex (This is XeTeX, Version 3.141592653-2.6-0.999996 (TeX Live 2024) (preloaded format=xetex)), no longer breaks the boxes.
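
To make the sum(round()) versus round(sum()) point concrete, here is a minimal standalone sketch. It is not the XeTeX code: d2fix is a hypothetical stand-in that assumes the conversion scales by 65536, adds 0.5 and truncates, and the two helper functions and their names are made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>

typedef int32_t Fixed; /* 16.16 fixed point: 65536 units per point */

/* Assumed conversion (not copied from XeTeX): scale, add 0.5, truncate. */
static Fixed d2fix(double d)
{
    return (Fixed)(d * 65536.0 + 0.5);
}

/* Round each advance first, then add up the rounded values. */
static Fixed sum_of_rounded(const double *adv, size_t n)
{
    Fixed total = 0;
    for (size_t i = 0; i < n; ++i)
        total += d2fix(adv[i]);
    return total;
}

/* Add up the unrounded advances, then round once at the end. */
static Fixed rounded_sum(const double *adv, size_t n)
{
    double total = 0.0;
    for (size_t i = 0; i < n; ++i)
        total += adv[i];
    return d2fix(total);
}
```

With three made-up advances of 0.4/65536 pt each, sum_of_rounded returns 0 while rounded_sum returns 1, so the two ways of computing a width can disagree by one unit even for tiny inputs.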

It would be good if somebody could sanity check my logic and confirm (or deny) that this is on the right track.  If so, the BIDI_MIXED case will also need to be reviewed.

  \vbox{\hsize=\wd0 \leftskip0pt plus 1fil  #1} % emulate \raggedleft etc.
  \vbox{\hsize=\wd0 \rightskip0pt plus 1fil #1}
  \vbox{\hsize=\wd0 \leftskip0pt plus 1fil \rightskip\leftskip #1}

\font\1="[lmroman10-regular]" at 10pt \1
\test{Dest-Addr} % ok

\font\1="[lmroman10-regular]" at 10.95pt \1

\test{Dest-Addr} % unwanted line breaks right after hyphen char
% none of 10.94pt nor 10.96pt can reproduce the problem

\font\1="[lmroman10-regular]" at 11pt \1
\test{Dest-Addr} % ok


Regards and thank you for your help,
Ross

    On Monday, 12 February 2024 at 17:04:16 GMT, Karl Berry <karl at freefriends.org> wrote:  
 Ross and all - regarding the rounding vs. truncation question in XeTeX

The answer does not seem so simple to me. The bottom line is that I'd
like to understand specifically why those minutely different values
cause different breaks. I think something is different "downstream" from
the D2Fix computation. But I don't have the time or energy to follow up,
sorry; hope you do, or someone does.

Details ...

Agreed that changing XeTeX's D2Fix to do truncation fixes this
particular line break difference. I could reproduce that.

However, I don't think the rounding is an error, per se, because the
other engines do rounding. Citing Thanh about pdfTeX:

thanh> pdftex uses ext_xn_over_d (defined as extxnoverd in utils.c) for
thanh> calculations that involve real numbers. It does rounding, not 
thanh> truncating.

LuaTeX also rounds (just) like XeTeX, at least in some places:
scarso> #define lua_roundnumber(a,b)  (int)floor((double)lua_tonumber(a,b)+0.5)

So just forcing truncation does not feel right. It fixes this particular
xetex+10.95pt case, but who knows what other changes will ensue?  We
could add something like a \XeTeXfloatconversion={1,0} switch so that a
document can choose, but ... before we blindly make such a broad change,
I think the underlying question is what happens with the line breaking
after truncation and why the different outcomes are happening in the
first place.


Additionally: as Nelson pointed out, there is an error in XeTeX's and
LuaTeX's rounding of simply adding 0.5; with negative numbers, that will
not round to the nearest integer (it's necessary to subtract 0.5, not
add, when the argument is negative). pdfTeX does not have this error;
the pdftex code looks like (essentially):
    if (r > DBL_EPSILON) r += 0.5;
    else r -= 0.5;
    return (scaled) r;

Unfortunately, when I changed D2Fix to round negative numbers correctly,
it made no difference in the Dest-Addr example. Not surprisingly, since
almost all numbers coming from fonts are positive. So that is not the
answer in practice.
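
For anyone who wants to see the negative-number issue in isolation, here is a small sketch. The function names are hypothetical, and the first function only assumes the "add 0.5, then truncate" pattern described above; it is not copied from any engine. The second follows the shape of the pdftex snippet, but compares against 0.0 for simplicity.

```c
#include <stdint.h>

/* Adds 0.5 and truncates toward zero; can be off by one for negative input. */
static int32_t round_add_half(double r)
{
    return (int32_t)(r + 0.5);
}

/* Moves 0.5 away from zero before truncating, so both signs round to nearest. */
static int32_t round_sign_aware(double r)
{
    if (r > 0.0)
        r += 0.5;
    else
        r -= 0.5;
    return (int32_t)r;
}
```

For example, round_add_half(-1.4) gives 0 where round_sign_aware(-1.4) gives -1, while both return 1 for 1.4. That matches the observation above: positive values are unaffected, so the fix cannot change the Dest-Addr result.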


Additionally: the engine comparisons are problematic.

1) pdftex is using cmr10.tfm(+pfb) for the font. With traditional TeX
line breaking, floating point is not involved at all.

2) With XeTeX, on the other hand, the test document is reading
lmroman10-regular.otf, a completely different font rendered in a
completely different way. One immediate question is whether the metrics
of lmroman10-regular.otf match *exactly* the metrics of cmr10.tfm.
And whether otf vs. tfm makes a difference wrt hyphenation.

3) With LuaTeX, lmroman10-regular.otf is also being read, so XeTeX and
LuaTeX at least have that basis of comparison. However, there are many
differences in how luatex and xetex operate.  E.g., for one thing,
LuaTeX gets the "bad" Dest-<linebreak>Addr output with many font sizes;
I tried from 10.90 to 10.99 and they all broke after the hyphen, unlike
XeTeX where (as you noted) it has to be 10.95 exactly.

3b) With LuaTeX, in principle there is also the question of whether the
"native" rendering is used (luatex) or harfbuzz (luahbtex). However, I
found no difference in the test document either way.

Another possible difference is that XeTeX might "snap" to the limited
number of heights/depths/widths available in tfm format, even when using
otf? Not sure. LuaTeX does not do this.

Although I can't find it now, I seem to recall that Ulrike(?) wrote a
TUGboat article about differences between LuaTeX and XeTeX, e.g., LuaTeX
computing a lot more in floating point before converting to fixed, and
plenty of computations happen in Lua.

Finally, in case anyone does want to take this further: I tweaked the
test document to run under -ini, since it's much easier to experiment
with engine changes without having to bother building a .fmt. Also added
a conditional so it would run with both xetex and luatex. Appended below
and also posted to the xetex bug report.


% https://sourceforge.net/p/xetex/bugs/185/

% ini doesn't work without setting more line breaking params.
\catcode`\{=1 \catcode`\}=2 \catcode`\#=6 
\hsize=6.5in \vsize=9in
\parfillskip=0pt plus1fil
\lefthyphenmin=2 \righthyphenmin=3 % disallow x- or -xx breaks


% maxdimen errors  \showboxbreadth\maxdimen\showboxdepth\maxdimen}

  \vbox{\hsize=\wd0 \leftskip0pt plus 1fil  #1} % emulate \raggedleft etc.
  \vbox{\hsize=\wd0 \rightskip0pt plus 1fil #1}
  \vbox{\hsize=\wd0 \leftskip0pt plus 1fil \rightskip\leftskip #1}

%\font\1="[lmroman10-regular]" at 10pt \1
%\test{Dest-Addr} % ok

  \let\dump\relax \input luatex.ini % so we can:
  \input luaotfload.sty
  at 10.90pt
\test{Dest-Addr} % unwanted line breaks right after hyphen char
% none of 10.94pt nor 10.96pt can reproduce the problem with xetex.

%\font\1="[lmroman10-regular]" at 11pt \1
%\test{Dest-Addr} % ok
