[texhax] referred to TUG as a source of expertise on math-related character encoding/rendering
dan at fairness.com
Thu Jun 17 00:14:57 CEST 2010
Jim Hafner (IBM Almaden Research) suggested that someone at TUG would likely have the technical expertise to help us with a software problem we're having translating certain math symbols from Microsoft Word into HTML. Most symbols and all foreign character sets we've tried work fine, but certain Greek letters and oversize parentheses (e.g. for multi-line equations) and ellipses have been the culprits so far...
We aren't using TeX, as most of the documents we're doing are natural language texts... but I do have a semi-famous 1993 interview with Don Knuth to my credit <http://tex.loria.fr/historique/interviews/knuth-clb1993.html>, and I'm hoping that this good karma, plus Fairness.com being the beneficiary, may be enough for someone to take a quick look and maybe recognize some known issue we're running into.
Below is a write-up of the problem... I'm sending some of the relevant files and screenshots in a separate email to avoid spam filter problems.
Thanks in advance!
President, Fairness.com LLC
Email: dan at fairness.com
> Problem in a nutshell:
> We're having a character-encoding problem that I'm hoping someone there might have a quick idea or suggestion about.
> My developer is reasonably knowledgeable about Unicode, and until now NowComment has been able to render in HTML every character (including foreign-language character sets, both Romance and non-Romance) from any Word document... but some Greek letters used as math/engineering symbols, along with an ellipsis and multi-line parentheses, are not rendering properly in some technical documents.
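A quick diagnostic for the symptom described above, sketched in Python. It assumes the text extracted from the Word document is already available as a Unicode string. The function name `find_private_use_chars` is hypothetical and is not part of NowComment. It lists characters in Unicode's Private Use category, which have no standard meaning and display correctly only when one specific font is present.

```python
import unicodedata


def find_private_use_chars(text):
    """Return (index, 'U+XXXX') pairs for every private-use character in text.

    Private-use code points (Unicode general category "Co") carry no
    standard meaning, so a browser cannot render them without the
    original font.
    """
    return [
        (i, f"U+{ord(ch):04X}")
        for i, ch in enumerate(text)
        if unicodedata.category(ch) == "Co"
    ]


if __name__ == "__main__":
    # Hypothetical extracted text containing one private-use character.
    sample = "x = \uf061 + 1"
    for index, code_point in find_private_use_chars(sample):
        print(index, code_point)
```

If the problem characters show up in this list while the ones that render correctly do not, that would point toward a font-specific encoding rather than a general Unicode problem.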
> Before we invest precious time in an area we're not very familiar with, we thought we'd try to find some people who have experience rendering math/scientific symbols and see if they can point us in a good direction.
> A UVA professor we're collaborating with on the software sent us this article excerpt (and added comments below it):
>>>> On May 16, 2007, George wrote:
>>>>> I have a problem for a long time now and I would be grateful if
>>>>> someone knows any solution.
>>>>> I have installed the Greek fonts in Windows XP and I can see for
>>>>> example filenames & directories in Greek, write documents in Greek.
>>>>> But sometimes some programs do not display letters in Greek but in
>>>>> funny characters.
>>>>> One of them was winzip which could not keep the original Greek
>>>>> filenames & directories.
>>>>> More importantly Outlook has quite often the same problem. When I try
>>>>> to create a new contact for example, I cut and paste Greek names into
>>>>> the contact name & address field. The pasted text is not the original
>>>>> but again funny characters. I have found that if I paste the copied
>>>>> Greek text in the notes field of the contact area (big empty box where
>>>>> one could write comments) then the Greek text is pasted fine. If I
>>>>> then copy that pasted text from the notes field into the name field of
>>>>> the contact it is pasted OK without funny characters.
>>>>> What is the problem?
> UVA prof's comments:
>>>> Some of your applications may be Unicode-savvy, some not, and there is a
>>>> dependency on the font itself. For example, good-old Symbol, which has been
>>>> with us for years, ships in Type 1, TrueType, OpenType, and dfont. If I am
>>>> using Type 1 and you are using OpenType, and you send me a Word doc with,
>>>> say, a bullet, I will see a little infinity symbol. If I turn off Type 1
>>>> Symbol and enable, say, OpenType or dfont, I will see the proper bullet.
> My developer said re: the above, "I didn't know the same font had different incarnations."
> **** Is this likely the problem we're having, or might there be something else going on??
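One possibility worth checking, sketched in Python below, is that the failing characters are formatted in Word's Symbol font. This is an assumption, since the screenshots are not part of this message. Symbol uses its own encoding rather than Unicode: the byte for "a" is drawn as alpha, for example. Word commonly stores such characters in the Private Use Area as U+F000 plus the font byte. The table is a small, partial excerpt of the Adobe Symbol encoding and should be verified against a complete mapping. The names `SYMBOL_TO_UNICODE` and `symbol_run_to_unicode` are hypothetical.

```python
# Partial, illustrative map from Symbol-font byte codes to Unicode characters.
# Check against a full Symbol-to-Unicode table before relying on it.
SYMBOL_TO_UNICODE = {
    0x61: "\u03b1",  # alpha
    0x62: "\u03b2",  # beta
    0x70: "\u03c0",  # pi
    0x53: "\u03a3",  # capital sigma
    0xA5: "\u221e",  # infinity
    0xB7: "\u2022",  # bullet
    0xBC: "\u2026",  # horizontal ellipsis
    0xE6: "\u239b",  # left parenthesis, upper hook
    0xE7: "\u239c",  # left parenthesis, extension
    0xE8: "\u239d",  # left parenthesis, lower hook
    0xF6: "\u239e",  # right parenthesis, upper hook
    0xF7: "\u239f",  # right parenthesis, extension
    0xF8: "\u23a0",  # right parenthesis, lower hook
}


def symbol_run_to_unicode(text):
    """Convert a run of text known to be in the Symbol font to real Unicode.

    Apply this only to runs whose font is Symbol. Characters with no entry
    in the table are passed through unchanged.
    """
    out = []
    for ch in text:
        code = ord(ch)
        # Undo the private-use offset (U+F000 + font byte) if present.
        if 0xF000 <= code <= 0xF0FF:
            code -= 0xF000
        out.append(SYMBOL_TO_UNICODE.get(code, ch))
    return "".join(out)


if __name__ == "__main__":
    # A Symbol-font alpha followed by an ellipsis, as private-use code points.
    print(symbol_run_to_unicode("\uf061\uf0bc"))
```

The multi-line parentheses are the hardest case: Symbol builds them from separate top, extension, and bottom pieces. Mapping each piece to Unicode gives the correct characters, but the pieces will join up in HTML only if the display font and line spacing cooperate.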
> In a separate email I'm sending some sample screenshots of the problems that generated the advice above; first are two samples, each of which has "before and after" images showing both our software's rendering and the original text in Word.
> Again, thank you very much!!
Dan Doernberg, President
Email: dan at fairness.com
"Life isn't fair... but we're working on it."®
Turning Documents into Conversations(SM)