\documentclass[final]{ltugboat}
\def\beginhtml{\cs{begin}\tubbraced{html}}
%\def\tubmakecaptionbox#1#2{#2}% no figure numbers

\usepackage{microtype}
\usepackage{graphicx}
\usepackage{ifpdf}
\usepackage[hyphens]{url}
\usepackage[hidelinks,pdfa]{hyperref}
\usepackage{upquote} %Koch addition
% See https://stackoverflow.com/questions/1662037/how-to-write-programming-code-containing-the-character-in-latex

\renewcommand{\topfraction}{.9} % don't go to a float page so soon:
\renewcommand{\dbltopfraction}{.9}
\renewcommand{\bottomfraction}{.7}
\renewcommand{\textfraction}{.1}
\renewcommand{\floatpagefraction}{.8}
\renewcommand{\dblfloatpagefraction}{.8}

\def\macOS{mac\acro{OS}}
\def\MacTeX{Mac\TeX}
\def\BasicTeX{Basic\TeX}
%\def\MacTeXAdditions{\texttt{MacTeX-Additions}}

\begin{document}

\title{Interactive content using \TeX4ht}

% repeat info for each author.
\author{Richard Koch}
\EDITORnoaddress %{2740 Washington St \\ Eugene, Oregon USA}
\netaddress{koch (at) math dot uoregon dot edu}
\personalURL{http://pages.uoregon.edu/koch/}

\maketitle

% The abstract comes after \maketitle in ltugboat.
\begin{abstract}
\TeX4ht converts \LaTeX{} source into web pages. This article explains
how to add interactive content to these pages, using \TeX4ht and
straightforward copying from web sources. The techniques should work on
all computer platforms. Some refinements to the \TeX4ht methods are also
discussed.
\end{abstract}

\section{Introduction}

Let me begin with three vignettes.

I started attending \tug\ conferences in 2001, and along with expected
talks there were a few surprises. In 2005, an expert from England
predicted that \TeX\ would survive for four more years and then be
replaced. He was teaching in the Open University system where students
work remotely, and he wanted to include interactive content in his
lectures. I thought the talk was nonsense. Then \acro{COVID} hit.

The 2004 Practical \TeX\ conference was held at Fisherman's Wharf in San
Francisco, and included a talk by Ernest Prabhakar, an Apple engineer.
After that talk, Prabhakar met with Mac users and others including Hans
Hagen, all sitting around a large conference table. Hans was trying to
convince Apple to allow Java programs to run in their pdf viewer so
interactive elements could be added. I sat next to Prabhakar and got to
see how he operates. He was fully engaged in the conversation, but
simultaneously he was surfing the web\Dash the fastest surfer I have
ever seen. Eventually he said to Hans, ``It appears to me that you are
the only one in the world writing Java in pdf files.''

I'm one of those users who update \TeX{} Live daily while drinking my
morning coffee. Sometime in 2022 I noticed that \texttt{tex4ht} was on
every day's update list. So I wrote the \TeX{} Live mailing list asking
that this bug be fixed. To my surprise, I was told that the updates were
genuine; Michal Hoftich, who maintains \TeX4ht, makes updates almost
daily.

%Incidentally, while writing this article I looked back at the
%proceedings for this conference. Prabhakar's talk was immediately
%followed by a talk by Eitan Gurari, the inventor of \TeX4ht, and
%shortly after that there was a talk by Hans Hagen on {\em The pros and
%cons of PDF}. A little later Han TheThanh, who wrote pdftex, talked on
%micro-typographic extensions of the program. It has taken me 18 years
%to digest the talks in that conference!

\section{\PDF\ and \HTML\ eighteen years later}

The Fisherman's Wharf conference was 18 years ago, and some issues are
clearer with the passage of time. Today every computer platform has
excellent software to display pdf files, and every computer platform has
an up-to-date web browser. It seems clear that pdf is the right format
for static documents, and that html is the right format for documents
with interactive content.
Other formats may emerge, but that will only happen if an activity
cannot be supported by pdf or html. (Although pdf has facilities for
interactivity, they are rather infrequently used compared to interactive
html.)

\section{A TeXShop detour}

I wrote TeXShop, a front end for \TeX{} on the Macintosh. TeXShop is
relevant here only because it explains how I was led to reexamine
\TeX4ht.

%This August I realized that TeXShop should be able to preview html
%files, just as it can already preview pdf files. So I added a web window
%to the resources for each document. Typesetting in TeXShop is controlled
%by ``engine files'', small shell scripts that users can edit. The latest
%TeXShop has two extra commands for use in these scripts. One causes the
%script to search the source directory for a pdf file with the same name
%as the source and open it in a preview window if found. Another repeats
%this action for html files, opening a web window.

Typesetting in TeXShop is controlled by ``engine files'', small shell
scripts that call \TeX\ binaries and that users can edit. After
typesetting, an engine searches the source directory for a pdf file with
the same name as the source and opens it in a pdf preview window if
found. This August I added code which searches for an html file with the
same name as the source, and opens it in an active web window if found.
These windows are created using the Cocoa programming \API{}s, so they
are part of TeXShop rather than external Mac applications like Preview
and Safari.

With this change, it is easy to ``typeset'' html files. Similarly, it is
easy to support Pre\TeX{}t, a project where authors write xml source and
then convert the source to pdf, html, and other formats.

So it was natural to try \TeX4ht, which accepts a \LaTeX{} source file
and outputs html (among other things). TeXShop now has a typesetting
engine which typesets the source twice, once with \TeX4ht and once with
pdflatex.
The \TeX4ht output is opened in a web viewer and the pdf output is
opened in a pdf preview.

I had seen demonstrations of \TeX4ht given by Eitan Gurari, the original
author of \TeX4ht. Indeed at that 2004 conference at Fisherman's Wharf,
Prabhakar's talk was immediately followed by a talk by Gurari on
\TeX4ht. At the time, \TeX4ht was outputting mathematics using pictures,
and the results were a little crude. In the years since then, \MathML\
was invented, and then MathJax was created and provided beautiful
rendering of \MathML\ code. \TeX4ht adopted these technologies.

I selected a 20-page set of lecture notes, with extensive mathematical
equations and many illustrations. The document used \texttt{hyperref},
\texttt{amsmath}, and other packages. I typeset it with \TeX4ht,
producing html. Typesetting was fast\Dash and the html output was
amazing! The mathematical equations were crisp and clear, the
illustrations were fine; to tell the truth, I doubted that I was seeing
html. As a test, I resized both windows. The text in the pdf window
shrank since the pdf had been configured to ``fit in window''. The text
in the html window reflowed.

\section{Interactive content}

\TeX4ht therefore allows you to convert old and new \LaTeX\ static
documents into web documents. But can you add interactive content to
these documents? Yes, as this article will demonstrate.

Select an old document you have lying around the house. You'll be able
to add interaction to it by the end of the next two sections. Don't
typeset immediately because a couple of steps are needed. Both are given
in these two sections.

First we need a method to write source code which will only appear in
the html version of the document. The following code does the trick:
\begin{verbatim}
\ifx\HCode\undefined
% source for pdf document
\else
% source for html document
\fi
\end{verbatim}
The \cs{HCode} tested here is a command that appears only in \TeX4ht.
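
A minimal sketch of one way to avoid repeating this test throughout a
long document: record the result once in a switch. The name
\cs{ifhtmlout} is invented for this illustration and is not part of
\TeX4ht; only \cs{HCode} comes from the discussion above.
\begin{verbatim}
% hypothetical switch, false by default
\newif\ifhtmlout
% \HCode is defined only under TeX4ht
\ifx\HCode\undefined \else \htmlouttrue \fi

% later in the document:
\ifhtmlout
  % source for html document
\else
  % source for pdf document
\fi
\end{verbatim}
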
Some web documents recommend the \texttt{ifpdf} package, but that fails
when typesetting with \XeTeX.

Next, we need to switch from writing \LaTeX{} code to writing html code
which \TeX4ht will insert verbatim into the final document without
processing. The following code suffices:
\begin{verbatim}
\ifx\HCode\undefined
% source for pdf document
\else
Initial words for html document.
\begin{html}
\end{html}
\fi
\end{verbatim}

Finally we need something interactive. We'll use a piece of SageMath
code, which is explained in a later section.

Putting all this together, add the following lines to your document,
creating a new section in the web version.
\begin{verbatim}[\small]
\ifx\HCode\undefined \else
\section{An Experiment}
\begin{html}
This sentence has bold and italic text.
Also math: \(y = \sqrt{x^2 + 1}\) and $$\int_0^\infty e^{-x^2} \ dx = {{\sqrt{\pi}} \over 2}$$
\end{html}
\fi
\end{verbatim}

Typeset and you will see the output below (right margin has been
truncated).

\centerline{%
\includegraphics[width=\hsize]{Graphics/shot8}%
}

\noindent But how is this possible, since source inside an ``html pair''
is inserted directly in the output without processing?

\section{Calling \TeX4ht}

Originally \TeX4ht output small pictures for inline and displayed
mathematics. Eitan Gurari unexpectedly died in 2009, and \tug\ paid him
the ultimate compliment by keeping his program alive. Now it is actively
maintained by Michal Hoftich. Due to new developments in \MathML\ and
MathJax, there are many ways to call \TeX4ht when it is asked to
typeset. Let us concentrate on the three most important methods.

Calling \TeX4ht using the call
\begin{verbatim}
make4ht source.tex "mathml"
\end{verbatim}
causes \TeX4ht to insert \MathML\ code for inline and display equations.
This \MathML\ is then rendered by the browser.

Calling \TeX4ht using the call
\begin{verbatim}
make4ht source.tex "mathml,mathjax"
\end{verbatim}
causes \TeX4ht to insert \MathML\ code for inline and display equations,
but call MathJax to render the resulting code.

Calling \TeX4ht using the call
\begin{verbatim}
make4ht source.tex "mathjax"
\end{verbatim}
causes \TeX4ht to insert \LaTeX{} code for inline and display equations,
and call MathJax to render the resulting code. Note that MathJax can
render both \MathML\ and \LaTeX{} code when it discovers equations in an
html document.

On my computer, mathematical rendering using the first method is not as
clear as rendering with the other two methods. Integral signs are too
small and there are other minor flaws.
%This may depend on the browser and computer platform; I use Safari on a
%Macintosh.

The first and third methods understand \LaTeX{} input for interactive
content, but the second does not. These experiments suggest that the
third method is the most desirable for interactive code.
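
To try the three calls side by side, a small test file is enough. The
following is only a sketch: the file name \texttt{test.tex} is an
arbitrary choice, and the two equations are the ones from the experiment
above, with the display written in \cs{[} notation.
\begin{verbatim}
% test.tex: hypothetical file for comparing
% the three make4ht calls
\documentclass{article}
\begin{document}
Inline: \(y = \sqrt{x^2 + 1}\). Display:
\[ \int_0^\infty e^{-x^2} \, dx
   = \frac{\sqrt{\pi}}{2} \]
\end{document}
\end{verbatim}
Running \texttt{make4ht test.tex "mathml"}, then the
\texttt{"mathml,mathjax"} and \texttt{"mathjax"} variants, and reloading
\texttt{test.html} in a browser after each run should make the
differences in rendering easy to compare.
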
%The third method should work on all platforms because MathJax is
%platform independent.

My initial experiments did not go well with the third method. Inline
equations were fine, but displayed equations were rendered with static
images. Then one day I tried the alternate \cs{[} notation rather than
\texttt{\$\$} and everything worked. I reported this to Michal, and the
{\em very next day} he fixed \TeX4ht so both notations are rendered with
MathJax. (The \LaTeX\ developers do recommend \cs{[}\ldots\cs{]}, by the
way.)

Please update your \TeX{} Live distribution and typeset using the third
method.

\section{A MathJax perk}

By now, perhaps you have typeset your own document with \TeX4ht and
MathJax. Select an equation and right-click on it. A contextual menu
opens offering to copy the equation to the clipboard as either
``MathML'' or ``TeX Commands''. Here's a picture:

\smallskip
\centerline{%
\includegraphics[width=\hsize]{Graphics/shot9}%
}
\smallskip

Select ``TeX Commands'', copy, and paste somewhere else. You will obtain
the \LaTeX{} code for the equation. This code can be copied into any
other \LaTeX{} source document.

This remarkably useful feature comes from MathJax and is not available
if you call \TeX4ht using the first method. Moreover, the menu will
offer \MathML\ code, but not \LaTeX{} code, if you call \TeX4ht using
the second method.
% MathML code is wordy and not very useful.
But the third method of calling \TeX4ht gives \LaTeX{} code.

\section{Installing documents on the server}

Suppose you typeset a document named \texttt{Sample} with \TeX4ht and
produce \texttt{Sample.html}. How should this file be put on a server?
The answer is tricky because \texttt{Sample.html} itself will not
contain any images, so any needed image files must be provided
separately. Moreover, \TeX4ht generates a support file
\texttt{Sample.css}, which is also required.
Thus it is convenient to put all illustrations in a folder, named (say)
\texttt{Graphics}, and refer to these illustrations in the \LaTeX{}
source using the pattern \texttt{Graphics/plot1} (\LaTeX\ will
automatically look for usual image extensions, using whichever is
found). Then the web server should contain \texttt{Sample.html},
\texttt{Sample.css}, and the \texttt{Graphics} folder.

\section{Using the work of other people}

Nothing in this document comes from me. When I discovered that \TeX4ht
produces completely acceptable web pages, I wondered if it would accept
html code and send it unmodified to the html document. I asked Karl
Berry, who thought it was possible and asked Michal Hoftich. Michal sent
the method described here, but I didn't believe it was sufficiently
general. So I started writing a sample document showing that the method
could not display math, or handle YouTube videos, or accept Sage code.
My sample simply proved the opposite.

I do not know a single \MathML\ tag. I knew the American Mathematical
Society recommended MathJax, but didn't know why. I don't understand how
these technologies work.

Several years ago I downloaded Sage. But I didn't know that web pages
could access a server so students who had never installed Sage could
still read web pages with Sage content. When I realized that, I used
Sage to graph simple functions. When it displayed a \acro{3D} graph and
let me rotate it interactively, I almost fell off my chair.

It is strange that I had to learn these lessons over again, because
\LaTeX{} is a crucial tool for me and yet I have never read \TB; \TeX{}
macros are crucial for my life and yet I don't know how to write a
macro. We can do things in our lives because of the independent work of
thousands of people.

\section{\PDF\ and \HTML\ in mathematics}

When I was a college sophomore, I took an abstract algebra course from
W. Wistar Comfort.
His lectures were crystal clear; you could copy the board, read the
notes at home, and see every step in its proper logical order. Later I
took courses with a more rough and tumble atmosphere; the instructor
seemed to be inventing right in front of our eyes, and sections of the
board would be crossed out when a better idea presented itself. Both
lecture styles worked, showing the dual nature of mathematics.

To me, pdf is for the final crystalline form of mathematics, and html is
for the rough and tumble way it is invented. Euclid is pdf, but Legendre
is html, and Euler is both.

\advance\signaturewidth by 8pt
\makesignature

\newpage
\appendix
\section{Refinements for \TeX4ht}

(Everything in this section came from Michal Hoftich, whom we asked to
review the above.)

\subsection{\cs{ifdefined}\cs{HCode}}

The main article uses
\begin{verbatim}[\small]
\ifx\HCode\undefined\else
...
\fi
\end{verbatim}
to insert material only when processing under \TeX4ht. This is fine, and
is the general form. But when only the html output needs the extra
attention, it can be simplified to:
\begin{verbatim}[\small]
\ifdefined\HCode
...
\fi
\end{verbatim}
(By the way, \cs{ifdefined} is an \eTeX\ primitive; \LaTeX\ has required
\eTeX, and some primitives beyond \eTeX, for years now.)

\subsection{\cs{NewDocumentEnvironment}\tubbraced{html}}

The main article uses the \tubbraced{html} environment inside \cs{HCode}
conditionals, so that only \TeX4ht sees it. This is fine, but it is
arguably nicer to define the \tubbraced{html} environment in all cases,
and make it a no-op when being processed for pdf (or dvi, but we won't
keep mentioning that). Also, we may as well define an analogous
environment for material that should only be processed in the pdf case.

This can most easily be done using the relatively recent (2020)
\cs{NewDocumentEnvironment} command.
The following two definitions in the preamble define an \tubbraced{html}
environment to ignore its contents (since normally we are running
\LaTeX, not \TeX4ht), and the \tubbraced{pdfenv} environment to typeset
its contents (for the same reason):
\begin{verbatim}[\small]
\documentclass{article}
\NewDocumentEnvironment{html}{+b}{}{}
\NewDocumentEnvironment{pdfenv}{}{}{}
\end{verbatim}

Then, in a configuration file for \TeX4ht (see next section), we reverse
the definitions so that \tubbraced{html} is active and
\tubbraced{pdfenv} is a no-op:
\begin{verbatim}[\small\hfuzz=6.6pt]
% (in a configuration file, see below)
\ScriptEnv{html}
  {\ifvmode\IgnorePar\fi\EndP\NoFonts\hfill\break}
  {\EndNoFonts}
\RenewDocumentEnvironment{pdfenv}{+b}{}{}
\end{verbatim}

Then the environments can be used without any conditionals. As a side
benefit, the environments can be nested. For example:
\begin{verbatim}[\small]
\begin{document}
...
\begin{html}This is output only in HTML, but can
include LaTeX math: \( a=b^2 \).
\end{html}
\begin{pdfenv}
Nested \LaTeX\ not in the HTML output.
\end{pdfenv}
\begin{html}Then we can have more HTML.
\end{html}
\end{verbatim}

%\LaTeX{}nicalities.
\subsubsection{\cs{NewDocumentEnvironment} explanations}

You may be wondering what the \texttt{+b} means in the
\cs{NewDocumentEnvironment} call. If you're not wondering, skip this
section.

The environment name (e.g., \texttt{html}) is the first argument to
\cs{NewDocumentEnvironment}. The second argument, with the \texttt{+b},
defines how arguments should be handled. The third and fourth arguments,
empty for us, define the code which is run at the beginning and end of
the environment, respectively.

The \texttt{b} argument specification says to pass the body of the
environment as an argument to the code blocks; since \texttt{+b} is the
only specifier in our definitions, the body is \verb+#1+. The \texttt{+}
specifier allows multiple paragraphs within the environment body.

Since we don't specify any code to run, nothing is done with the
environment body, so it is effectively discarded. On the other hand,
when the argument specification is empty, the environment body is
processed normally.

Many powerful argument specifiers are available, and they can be used
when defining either environments or commands. See the \LaTeX\
\texttt{usrguide3} document for details.

\subsection{\TeX4ht configuration files}

\TeX4ht supports configuration files, which are a convenient way to
specify document-wide settings. The environment redefinitions shown
above are one example. Here is another example, moving the Sage
specifications to the html page header (via \texttt{@HEAD}):
\begin{verbatim}[\small]
\Configure{@HEAD}{%