[tex4ht] macro containing a Unicode character

Alexandre Roberts alexandre.roberts at gmail.com
Sun Jul 24 08:45:23 CEST 2016


Dear Michal,

Thank you very much for this solution! It sounds like exactly what I need.

I am eager to try it, but unfortunately when I tried to install helpers4ht
(on Mac OS 10.11.5) by following your instructions
(https://github.com/michal-h21/helpers4ht), the command
`git clone git@github.com:michal-h21/helpers4ht.git` returned an error
(without asking me for a password):

     Cloning into 'helpers4ht'...
     Permission denied (publickey).
     fatal: Could not read from remote repository.

Do you happen to know what I'm doing wrong? (Sorry for such a basic
question.)

Best,
Alex

On Sat, Jul 23, 2016 at 11:56 PM, Michal Hoftich <michal.h21 at gmail.com>
wrote:

> Dear Alex,
>
> >
> > I would like to produce an ODT document from my XeLaTeX document (using
> > MacTeX 2016).
> >
> > The necessary code to include Unicode characters (including in Greek and
> > Arabic script) was kindly provided by CV Radhakrishnan and Michal Hoftich
> > back in February 2013. But I am running into a new difficulty: converting
> > a document that defines LaTeX macros that have Unicode characters in them.
> > (The reason I want this is to enable me to use macros within a
> > Right-to-Left script, Arabic. Mixing up RTL and LTR scripts in a text
> > editor, especially when punctuation -- or braces {} -- is involved, tends
> > to make the source file unreadable.)
> >
> > I am attaching a MWE in two files:
> >
> > 1. `main.tex`: standalone file that includes macro definition
> > 2. `utf2ent.pl`: the Perl script devised by CVR to keep Unicode in the
> >    new document
> >
> > The script I run to compile this is:
> >
> >      # CVR's script to preserve Unicode characters
> >      perl utf2ent.pl main.tex > main-ent.tex
> >
> >      # tex4ht
> >      mk4ht oolatex main-ent "xhtml, charset=utf-8"  -utf8
> >
>
> There are three problems:
>
> 1. Macros with Unicode names are supported only by Unicode engines, i.e.
> XeTeX and LuaTeX. mk4ht oolatex runs 8-bit pdflatex, so it cannot support
> them.
>
> 2. utf2ent converts all Unicode characters to entities, including those in
> your command name, so you end up with something like '\\entity{1589}' in
> your code.
>
> 3. $\langle$ and $\rangle$ produce incorrect MathML code, see
>
> https://puszcza.gnu.org.ua/bugs/?278
>
> The ODT format uses MathML, so the resulting file may be invalid.
>
> Now what can be done:
>
> You need to use a Unicode engine. For now that means LuaTeX, as
> XeTeX support in tex4ht is currently broken. Fortunately, you can still
> use XeTeX to produce the PDF and only modify some macros for tex4ht.
>
> With LuaTeX, it is possible to keep Unicode characters without needing to
> call external scripts to convert them to Unicode entities. See
>
> http://michal-h21.github.io/samples/helpers4ht/fontspec.html
>
> for more details. I've modified your file to use alternative4ht and to
> fix the problem with angles. Two new macros are introduced: \textlangle
> and \textrangle, which are redefined in the config file to use XML
> entities directly, instead of math mode.
>
> I've also found that the angles are swapped in the ODT and HTML output,
> probably because the viewers apply the BIDI algorithm and so don't expect
> the user to have swapped them already (you use \rangle#1\langle). I've
> redefined the angle commands in the config file to use the opposite side
> from the one their names suggest, so they are rendered correctly.
>
> The last problem is that mk4ht doesn't support LuaTeX, so you need to
> compile the document in a different way. You can use:
>
> make4ht -ulm draft -c hello.cfg main.tex "xhtml,ooffice" "ooffice/! -cmozhtf -utf8" " -cooxtpipes -coo"
>
> (It might be best to save it as a script, as it is not a very
> human-friendly command line :)
>
> The modified main.tex and hello.cfg are attached. main.tex can be
> compiled with xelatex to PDF; all changes needed for tex4ht are in the
> hello.cfg file.
>
> Best regards,
> Michal
>
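
For reference, the make4ht call quoted above can be wrapped in a small
script, as suggested. This is only a minimal sketch: the function name
`build_odt` and the `DRY_RUN` switch are made up for illustration, and the
file names `main.tex` and `hello.cfg` are simply the ones used in this
thread. The make4ht arguments are copied unchanged from the message.

```shell
#!/bin/sh
# Wrapper around the make4ht call from the quoted message.
# build_odt and DRY_RUN are hypothetical names, not part of make4ht.

build_odt() {
    src="${1:-main.tex}"     # document to convert (default from the thread)
    cfg="${2:-hello.cfg}"    # tex4ht config file (default from the thread)

    # Assemble the command as separate arguments so the quoted option
    # groups reach make4ht exactly as in the original one-liner.
    set -- make4ht -ulm draft -c "$cfg" "$src" \
        'xhtml,ooffice' 'ooffice/! -cmozhtf -utf8' ' -cooxtpipes -coo'

    if [ -n "${DRY_RUN:-}" ]; then
        # Print one argument per line instead of running anything.
        printf '%s\n' "$@"
    else
        "$@"
    fi
}
```

After sourcing the file, `build_odt main.tex hello.cfg` runs the conversion,
and `DRY_RUN=1 build_odt` only prints the arguments that would be passed.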