[pdftex] pdftex compression -- proposed addition to manual

Ben Crowell crowell01 at lightandmatter.com
Sat Aug 25 21:06:21 CEST 2001


Reinhard Kotucha wrote about JPEGs being further compressible,
and Alan Shutko suggested this was happening because GS
was downsampling. My own experience is based on scanning photos
into a lossless format, then converting to JPEG or PNG with
GraphicConverter, and then using pdftex, which does Flate compression.
When I did this, the PDF output was very close to being the same
size as the original image files, i.e. Flate was not achieving
much further compression. This could mean either that Reinhard
Kotucha's images originated from digital cameras that did less
than optimal JPEG compression, or, as Alan suggests, that the
images got downsampled. The latter interpretation, however,
would not explain the wide variation in compression ratios
observed by Reinhard.

Reinhard wrote:
>If I understand your mail correctly, you want to discourage people to
>apply further compression to jpeg files.  Theoretically, if a
>compression algorithm is ideal, then there shouldn't be any other
>algorithm that is able to further compress that file.  This is
>obviously not true for jpeg.

My intention was simply to explain to people how to get better
compression. There is nothing wrong with doing Flate compression,
since it is lossless and almost never /increases/ the size of the
file.
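
As a rough illustration of the two claims above (Flate gains little on
data that is already well compressed, and it is lossless), here is a
minimal Python sketch using the standard zlib module. The inputs are
assumptions made for the demonstration: pseudo-random bytes stand in
for the entropy-coded payload of a JPEG, and a repeating byte pattern
stands in for redundant, uncompressed image data. Real image files
will give different numbers.

```python
import random
import zlib


def flate_ratio(data: bytes, level: int = 9) -> float:
    """Return compressed size divided by original size for zlib (Flate)."""
    return len(zlib.compress(data, level)) / len(data)


rng = random.Random(0)  # fixed seed so the run is repeatable

# Assumption: random bytes approximate already-compressed JPEG data,
# which has little redundancy left for Flate to exploit.
already_compressed = bytes(rng.getrandbits(8) for _ in range(100_000))

# Assumption: a repeating pattern approximates redundant raw image data.
redundant = bytes(range(256)) * 400

print(f"high-entropy input: ratio {flate_ratio(already_compressed):.4f}")
print(f"redundant input:    ratio {flate_ratio(redundant):.4f}")

# Flate is lossless: decompressing returns the input byte for byte.
round_trip = zlib.decompress(zlib.compress(already_compressed, 9))
assert round_trip == already_compressed
```

On incompressible input the output can come out a few bytes larger
than the input because of the stream header and block overhead, so
"almost never increases" should be read as "never increases by more
than a negligible amount".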

If it's true that some digital cameras do a suboptimal job of
JPEG compression, then I think image-processing software is
probably the best tool for improving the compression. I wouldn't
just send it through Flate and hope it fixes the problem.
Also, people may want to be able to adjust the level of
JPEG compression. Typically, for a PDF destined for internet
distribution, the compression you want is /much/ more
aggressive than what you'd get from a digital camera.

On a different topic relating to compression, Otfried Cheong
e-mailed me off-list, saying that when pdftex reads a PNG
or TIFF image as an input, it unpacks it into an uncompressed
pixel array, and then applies no further compression except for
Flate. This would seem to conflict with the information
George N. White III gave in a previous post, or maybe their
statements are really not contradictory and there's something I
don't understand. (Otfried, I apologize if I'm misrepresenting
what you said, but when I tried to reply to your e-mail off-list
and ask you to post it to the list, my reply bounced.)
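
To make the pipeline described above concrete (unpack to a raw pixel
array, then apply Flate and nothing else), here is a small Python
sketch. It is only a model of that description, not pdftex's actual
code: the image is a synthetic 8-bit grayscale gradient, the
dimensions are made up, and zlib levels 0, 1 and 9 are used simply to
show how the Flate effort setting changes the stream size.

```python
import zlib

WIDTH, HEIGHT = 256, 256  # hypothetical image dimensions

# Synthetic "unpacked pixel array": one byte per pixel, each row a
# left-to-right gradient from 0 to 255.
pixels = bytes(x for _ in range(HEIGHT) for x in range(WIDTH))

# Flate-compress the raw array at several effort levels
# (0 = store only, 1 = fastest, 9 = best compression).
sizes = {level: len(zlib.compress(pixels, level)) for level in (0, 1, 9)}

print(f"raw pixel array: {len(pixels)} bytes")
for level, size in sizes.items():
    print(f"Flate level {level}: {size} bytes")

# Whatever the level, the original pixels are recovered exactly.
assert zlib.decompress(zlib.compress(pixels, 9)) == pixels
```

A gradient like this is far more regular than a scanned photo, so the
sizes printed here say nothing about what to expect from real images.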

George also wrote on the list:
>Unfortunately, this subject goes far beyond what can reasonably be
>included in a pdftex FAQ.

It seems to me that the page or so of text we've been working on
in this thread is about the right amount to go in the pdftex manual.
I think there ought to be at least /some/ information on compression
in the pdftex manual, even if it's just a pointer to where to
get more details.


