[tex-k] (fwd) Bug#337805: tetex-bin: dvips should include document's title in Postscript instead of DVI filename

Gernot Salzer salzer at logic.at
Fri Nov 18 12:00:12 CET 2005


> > > Reason: if I publish a paper (e.g., submit the paper), I don't want
> > > the paper to include any private information, like some funny document
> > > name.
>  
> Please run (la)tex --jobname=what_you_want_to_name foo.tex

This doesn't solve the problem.

First, it is wrong from the outset that I have to worry about unforeseen
and unintended information embedded in the final, distributed document.
Such a switch is just a hack to modify information that shouldn't
be there in the first place.

Second, embedding the information in the document shouldn't be the
default behaviour. Suppressing it should be the default. (The Windows OS
and Outlook could always have been much safer than they were if
MS had chosen other default settings.)

Third, this switch doesn't change other information embedded in the PS file,
like the path of the dvips command. Yes, of course, you can avoid
calling dvips with the complete path with appropriate env variable
settings. But should such considerations really be the normal procedure
for a TeX user who wants to make sure that his document contains only
the information he typeset?

> > I find it kinda weird when a PDF I open has a title like 'ms29paper.doc'
> > or 'godel.dvi'...
> 
> Hmmm, you mean the title bar? It always shows just foo.pdf...

You mean, the title bar shows just the actual file name? Maybe in your PDF viewer.
In general it depends
- on the program generating the ps or pdf file
- on the viewer.

Under Linux, PS/PDF documents created via dvips and ps2pdf and viewed with
"gv" show the name embedded by dvips, which is the name of the dvi file,
not the current file name.

pdflatex, on the other hand, doesn't seem to embed information about
the original file; its pdf-files always show the current file name
in the title bar, even with "gv".

So it seems that dvips is the "culprit". It embeds the name
of the original dvi file in three places:
%%Title: what_you_want_to_name.dvi
%DVIPSCommandLine: dvips what_you_want_to_name.dvi
TeXDict begin 39158280 55380996 1000 600 600 (what_you_want_to_name.dvi)

ps2pdf seems to take the last occurrence and embed it as the entry
/Title(what_you_want_to_name.dvi)>>endobj
in the pdf file.

None of these occurrences of "what_you_want_to_name.dvi" is vital
for displaying the document, so why include them?
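
Until dvips changes, the three occurrences listed above can be neutralised
after the fact. What follows is only a rough sketch in Python, not a tested
tool: the function name scrub_dvips_names and the placeholder name
"document.dvi" are my own inventions, and it assumes the three lines look
exactly like the samples shown above.

```python
import re


def scrub_dvips_names(ps_text: str, neutral: str = "document.dvi") -> str:
    """Replace the DVI file name in the three places dvips writes it.

    Hypothetical helper: 'neutral' is an arbitrary placeholder name.
    """
    # 1. The first %%Title: comment (later ones may belong to included figures).
    ps_text = re.sub(r"^%%Title:.*$",
                     lambda m: f"%%Title: {neutral}",
                     ps_text, count=1, flags=re.MULTILINE)
    # 2. The command-line comment, which may also carry the full dvips path.
    ps_text = re.sub(r"^%DVIPSCommandLine:.*$",
                     lambda m: f"%DVIPSCommandLine: dvips {neutral}",
                     ps_text, count=1, flags=re.MULTILINE)
    # 3. The string at the end of the "TeXDict begin" line. It is replaced,
    #    not deleted, so the PostScript code still receives a string there.
    ps_text = re.sub(r"^(TeXDict begin(?: \d+)+ )\([^()]*\)",
                     lambda m: f"{m.group(1)}({neutral})",
                     ps_text, flags=re.MULTILINE)
    return ps_text
```

One would read the .ps file into a string, pass it through this function and
write the result back before running ps2pdf. Whether every PostScript
consumer is happy with the modified file is something I have not checked.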

> > > But lines like
> > > %%Title: multlog.dvi
> > > %DVIPSCommandLine: /usr/TeX-live/bin/i386-linux/dvips multlog.dvi
> > > still constitute a tiny leak of privacy.
> > 
> > +1 too
> 
> See above. Ahh, what's the real secret!!! Do you know many people browsing
> .ps files? How could it hurt your privacy?

First, you don't have to browse the file, if your viewer displays the
information (see above).

Second, if the document appears in the right context, then it will be
taken apart and scrutinized. My scientific papers probably will not
(though I'm not even sure about that, see below).
But if you want to find out about the origin of a document containing
some "explosive" material, then a "specialist" able to use a
text editor or the "strings" command under Unix will have a look.
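
To show how little effort such a look takes, here is a rough Python
imitation of what "strings" does, followed by a filter for tell-tale
fragments. It is a sketch only: the function names and the list of search
terms are hypothetical choices of mine, and the real "strings" command
differs in details.

```python
import re


def printable_runs(data: bytes, min_len: int = 4) -> list[str]:
    """Return runs of printable ASCII characters, roughly like 'strings'."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [m.group().decode("ascii") for m in pattern.finditer(data)]


def suspicious_runs(data: bytes,
                    needles=("/home/", ".dvi", "DVIPSCommandLine")) -> list[str]:
    """Keep only the runs that contain one of the (hypothetical) search terms."""
    return [s for s in printable_runs(data) if any(n in s for n in needles)]
```

Feeding it the bytes of a PS or PDF file, e.g. via
open("paper.ps", "rb").read(), lists the candidate lines in a few seconds.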

And third, you don't need much imagination to construct examples where
these tiny leaks of privacy have unintended consequences.

Example 1: I once refereed a paper for a conference, and since I'm
one of the bad guys who sometimes look into the source (or maybe
because my viewer displayed the original title), I found out
the original name of the document (a TeX/DVI/PS paper).
It contained the name of another conference.
With this hint it was then quite easy to discover that the paper was
a case of self-plagiarism.
Was it really the intention of the authors to give away this information?

Example 2: Two years ago a company here in Austria conducted a survey
on Linux vs. Windows. They claimed to be unbiased and not connected
to any of the relevant companies involved in the OS business.
Unfortunately, the survey (a Word document) contained some hidden traces which
made clear that the author of the survey was an employee of MS Austria.
All it takes is a file path containing the author's home directory (which
usually resembles the author's name).

No matter what one thinks about these cases of cheating, I would still call
them leaks of privacy.

Gernot



