[texhax] 365 KB .tex file makes 5.95GB .pdf

Reinhard Kotucha reinhard.kotucha at web.de
Fri Oct 31 00:56:22 CET 2014


On 2014-10-30 at 21:08:41 +0000, Philip Taylor wrote:

 > Dan Luecking wrote:
 > 
 > > Pdfinfo is part of the wintools package in my TeXLive
 > > distribution, so I assume it must be a standard tool for
 > > Unix-based OSs.
 > 
 > It is also included in my TeX Live 2014 installation, which has
 > nothing whatsoever to do with anything Unix-based, so it may well
 > be a standard tool for the whole of TeX Live.

pdfinfo is part of Derek Noonburg's Xpdf viewer.  As the name implies,
the viewer itself requires X11 and thus cannot be compiled on Windows.
But Xpdf also provides a few command-line tools which don't need X11.
These can be compiled on Windows and are part of TeX Live:

  pdffonts
  pdfimages
  pdfinfo
  pdftops
  pdftotext

It may be interesting to know that Derek's PDF parser is used by
TeXworks and pdftex.  You might wonder why pdftex needs a PDF parser
when it's supposed to _create_ PDF.  Consider \pdfximage, which has to
read existing PDF files in order to include them.  ;)


Regarding pdfinfo:

 > It seems to return useful information :
 >
 > > Creator:        XeTeX output 2014.10.28:2122
 > > Producer:       xdvipdfmx (20140317)
 > > CreationDate:   10/28/14 21:23:41
 > > Tagged:         no
 > > Form:           none
 > > [...]

You can make it much more useful if you add the following to your
TeX file:

 \pdfinfo {
   /Title      (An Example Document)
   /Author     (Philip Taylor)
   /Subject    (Just an example)
   /Keywords   (foo, bar, baz)
 }

Why?  Because if these entries exist (/Title is the most important
one) it's very easy to throw all PDF files into a directory and to
write a script which creates an HTML file with links to all the PDFs,
for example.
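
A minimal sketch of such a script in Python, assuming pdfinfo is in
the search path and prints one "Key: value" pair per line as in the
output quoted above.  The function names and the output file name
index.html are made up for this example; files without a /Title fall
back to their file name.

```python
import html
import subprocess
from pathlib import Path
from urllib.parse import quote


def parse_pdfinfo(text):
    """Turn pdfinfo's 'Key:   value' lines into a dict."""
    info = {}
    for line in text.splitlines():
        # Split at the first colon only; values may contain colons too.
        key, sep, value = line.partition(":")
        if sep:
            info[key.strip()] = value.strip()
    return info


def pdf_title(path):
    """Return the Title reported by pdfinfo, or the file name if there is none."""
    result = subprocess.run(
        ["pdfinfo", str(path)],
        capture_output=True, text=True, errors="replace", check=False,
    )
    return parse_pdfinfo(result.stdout).get("Title") or Path(path).name


def build_index(entries):
    """Build an HTML page from (file name, title) pairs."""
    items = "\n".join(
        f'<li><a href="{quote(name)}">{html.escape(title)}</a></li>'
        for name, title in entries
    )
    return f"<html><body>\n<ul>\n{items}\n</ul>\n</body></html>\n"


if __name__ == "__main__":
    # Index all PDF files in the current directory.
    pdfs = sorted(Path(".").glob("*.pdf"))
    page = build_index((p.name, pdf_title(p)) for p in pdfs)
    Path("index.html").write_text(page, encoding="utf-8")
```

Run it in the directory which contains the PDF files and open
index.html in a browser.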

LaTeX users should use hyperref.  The nasty thing is that the strings
in the PDF info dictionary have to be encoded in either PDFDocEncoding
(a stupid 8-bit encoding) or UTF-16.  Hyperref takes this into
account.  Plain TeX users can't use non-ASCII characters by default.
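
For those who don't use hyperref, one possible workaround is to
compute the UTF-16 form outside of TeX and paste it into \pdfinfo as a
PDF hex string.  The following Python sketch shows the idea.  The
function name is made up for this example, and it relies on PDF
accepting text strings as big-endian UTF-16 with a leading byte order
mark (FEFF), written in hexadecimal between angle brackets.

```python
def pdf_hex_string(text):
    """Encode text as a PDF hex string: UTF-16BE preceded by the BOM FEFF."""
    return "<FEFF" + text.encode("utf-16-be").hex().upper() + ">"


if __name__ == "__main__":
    # Prints a line which can be pasted into \pdfinfo { ... }
    print("/Title " + pdf_hex_string("Käse"))
```

The result contains only ASCII characters, so it can be used in a
plain TeX file in place of the parenthesized string after /Title.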

All in all, the programs mentioned above are extremely useful.  It's
worth reading the documentation.  It's a pity that most users don't
know where these tools come from.  I appreciate Derek's work very much.

Regards,
  Reinhard

-- 
------------------------------------------------------------------
Reinhard Kotucha                            Phone: +49-511-3373112
Marschnerstr. 25
D-30167 Hannover                    mailto:reinhard.kotucha at web.de
------------------------------------------------------------------

