Re: MathML

• To: Multiple recipients of list LATEX-L <LATEX-L@RELAY.URZ.UNI-HEIDELBERG.DE>
• Subject: Re: MathML
• From: Hans Aberg <haberg@MATEMATIK.SU.SE>
• Date: Wed, 8 Oct 1997 14:02:08 +0200
• Reply-To: Mailing list for the LaTeX3 project <LATEX-L@RELAY.URZ.UNI-HEIDELBERG.DE>
• Sender: Mailing list for the LaTeX3 project <LATEX-L@RELAY.URZ.UNI-HEIDELBERG.DE>

At 11:47 +0200 97/10/08, Thierry Bouche wrote:
>Regarding "Re: MathML", Ulrik Vieth writes:
>> Careless coding tends to be mostly
>> visual, but there's at least a potential to make it more semantic
>> by using high-level macros to encode symbols by their function
>> and having them translated to low-level macros in the background.
..
>yes yes yes !

This is the other aspect: making LaTeX prepare input that is more semantic,
an effort which I think may be related to MathML. Eventually, in the
future, one should be able to have all mathematical semantic information in
the input formulas, but I do not think TeX or LaTeX is suitable for
preparing this. Computer technology has not yet developed sufficiently to
support it.

...
>tex (as a program) is not always coherent with tex (as a
>coding scheme) though... {,} making the comma mathord (there should be
>at least 2 commas in maths, one mathpunct, one mathord); in
>\Sum_{i=1}^n, {i=1} is _not_ an index nor n an exponent,
>mathematically speaking. My feeling is that the Knuthian macros play
>like a virtuoso with tex's font oddities and abilities, at the expense
>of the genericity of the markup (too much cleverness, too many special
>cases are not good for genericity...).

I would define a formula as "semantics expressed in symbols": So it is
possible to extract the semantic information from a formula, as opposed to
an illustration. In attempts to achieve the goal of having formulas as
input, the exact syntax makes little difference, because one can later
develop automated tools that translate to another syntax.

So, in the example, $\Sum_{i=1}^n$, one should be able to somehow extract
that $i$ is an index ranging from $1$ to $n$; additional rendering information
should be added independently of this semantic information. There are two
ways we could do this:

First, we could ignore the TeX syntax, saying that we have an automated
tool that knows how to extract the semantic information from expressions of
the type $\Sum_{i=a}^b$. The problem with this approach is that authors
would find ways to write formulas that break this scheme, for example
$\Sum^b_{i=a}$. We could try to cover that case as well, but then the
authors will invent something else, and so on.

The second approach is to make use of the TeX syntax, forcing the authors
to enter the semantic information. For example, we could define
$\def\Sum#1#2#3#4{...}$, with the usage #1 = i, #2 = a, #3 = b, #4 =
summand. The problem with this approach is that TeX's syntax is too
limited, so authors would probably feel they were in a straitjacket: We do
not have grammars.
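
To make this concrete, here is a minimal plain TeX sketch of such a
definition. The body of the macro is only an assumption for illustration
(the conventional rendering, with index and lower bound below the sign and
the upper bound above it); it is not meant as a fixed proposal.

```tex
% Hypothetical body for \Sum (an assumption, not a settled design):
%   #1 = index, #2 = lower bound, #3 = upper bound, #4 = summand.
\def\Sum#1#2#3#4{\sum_{#1=#2}^{#3} #4}

% Usage: the author supplies the four semantic parts explicitly.
$$ \Sum{i}{1}{n}{a_i} $$   % expands to \sum_{i=1}^{n} a_i
\bye
```

The point is that the four parts arrive as separate arguments, so a tool
could redefine \Sum to record them instead of typesetting them.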

But one could play with ideas of how to implement improved formula input:
In the example above, suppose we decide that an author should always input
sums using $\def\Sum#1#2#3#4{...}$ (ignoring for a moment the fact that
this is mathematically too restrictive). Then the typesetter would end up
with a formula

  $$
    % gory stuff
    \Sum{gory i}{gory a}{gory b}{gory summand}
    % more gory stuff
  $$

The typesetter needs to somehow add rendering information that does not
disturb the semantic information.

Then, the syntax should perhaps be

  \rendering{...}
  $$
    % gory stuff
    \Sum{gory i}{gory a}{gory b}{gory summand}
    % more gory stuff
  $$

putting in the rendering info separately.

Well, I did not say it is going to be easy. :-)

Hans Aberg
* AMS member: Listing <http://www.ams.org/cml/>
* Email: Hans Aberg <haberg@member.ams.org>