"Annual Report" Comparison Demystified

A report by James Rile, PlanetDjVu, June 30, 2002

The LizardTech website has long offered an "Annual Report" comparison between PDF and DjVu.  This comparison is still online at: http://www.lizardtech.com/solutions/document/samples/.

Here is the description of this comparison on the LizardTech web page:

Annual Report

Check out this 110 page, full color annual report. As a PDF file, it maxes out at a massive, portal-clogging 147Mb. As a DjVu document, it's a mere 2.87Mb. Download our free, Web Browser plug-in and compare.

DjVu 2.87Mb Click here and get ready to read.

PDF 147Mb Click here to download, then get a cup of coffee, walk the dog, watch Gone With The Wind . . .

Now, I have often wondered how such a dramatic reduction in size was possible when in all my tests, the best I can do is a 4:1 or 5:1 file size reduction with DjVu. I decided to download that large PDF file and find out.

Upon checking the PDF document properties, I can see that this PDF file was created with Acrobat Capture 3.0 from color scans, and was saved in the PDF Normal format with word-recognition suspects left as bitmaps.  Also, this PDF file was not optimized (that is, it was not saved with the "fast web view" option).  Because the file was not optimized for web delivery, when you click the PDF link on the LizardTech website, all 147 Mb must download before the first page is displayed.  If you are on a dial-up connection, you could indeed watch a movie before this PDF file is displayed in your web browser!

Optimizing a PDF file is as easy as selecting "File, Save As" in Acrobat, and when I did so, the file size dropped from 147 Mb to 110 Mb.  Now as soon as page 1 downloads, it will display while the remaining pages download in the background.  Barely enough time for a cup of coffee, even with a dial-up connection.
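As a rough check on these waiting times, here is a minimal sketch of the arithmetic.  It rests on assumptions that are not in the measurements above: a nominal 56 kbit/s dial-up link with no overhead, "Mb" read as decimal megabytes, and the 110 pages of the optimized file taken as equal in size (about 1 Mb each).

```python
# Assumptions (not from the original measurements): a nominal 56 kbit/s
# modem, 1 Mb = 1,000,000 bytes, and no protocol or line overhead.

def download_seconds(size_mb: float, link_kbps: float = 56.0) -> float:
    """Seconds needed to transfer size_mb megabytes over a link_kbps link."""
    bits = size_mb * 1_000_000 * 8
    return bits / (link_kbps * 1000)

# Unoptimized file: all 147 Mb must arrive before page 1 is shown.
whole_file_hours = download_seconds(147) / 3600

# Optimized file: only the first page is needed. Assuming equal page
# sizes, that is 110 Mb / 110 pages = 1 Mb.
first_page_minutes = download_seconds(110 / 110) / 60

print(f"147 Mb file, full download: {whole_file_hours:.1f} hours")
print(f"First page of optimized file: {first_page_minutes:.1f} minutes")
```

Under these assumptions the full 147 Mb download takes roughly 5.8 hours, while the first page of the optimized file arrives in about 2.4 minutes.  Real dial-up throughput is usually lower than the nominal rate, so both figures are optimistic.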

Experienced users of Acrobat Capture 3.0 know that this product generates PDF files using a custom version of PDFWriter, and the resulting files, particularly with color pages, are much larger than they need to be.  The remedy for this is to "re-fry" the PDF, that is, to reprocess the PDF file using Adobe Acrobat Distiller.  So I re-fried my 110 Mb PDF file, and as a result, the file size dropped to just 15 Mb.  There was no loss of quality in this transformation; it is simply a matter of removing unnecessary "meta-information" that Acrobat Capture 3.0 initially puts in the PDF file.
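A re-fry is a round trip from PDF to PostScript and back to PDF.  I did mine with Acrobat Distiller; for readers without Distiller, the sketch below shows the same round trip using Ghostscript's pdf2ps and ps2pdf scripts as a stand-in.  This substitution is an assumption on my part: the file names are hypothetical, Ghostscript must be installed and on the PATH, and the resulting file size and quality may well differ from what Distiller produces, so inspect the output before relying on it.

```python
import subprocess
import tempfile
from pathlib import Path

def refry_commands(src_pdf, ps_path, dst_pdf):
    """Return the two command lines for a PDF -> PostScript -> PDF round trip.

    Uses Ghostscript's pdf2ps and ps2pdf wrapper scripts (a stand-in for
    Acrobat Distiller, not the tool used for the figures in this report).
    """
    return [
        ["pdf2ps", str(src_pdf), str(ps_path)],
        ["ps2pdf", str(ps_path), str(dst_pdf)],
    ]

def refry(src_pdf, dst_pdf):
    """Run the round trip, keeping the intermediate PostScript in a temp dir."""
    with tempfile.TemporaryDirectory() as tmp:
        ps_path = Path(tmp) / "intermediate.ps"
        for cmd in refry_commands(src_pdf, ps_path, dst_pdf):
            # check=True raises CalledProcessError if a step fails.
            subprocess.run(cmd, check=True)

# Hypothetical usage (file names are placeholders):
# refry("annual_report_optimized.pdf", "annual_report_refried.pdf")
```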

A more direct remedy for this problem is to process the large PDF file in the new CVista PDFCompressor 2.0 from CVision.  I used this application to recompress all the bitonal objects of the file using JBIG2 compression, and to further compress the grayscale and color objects by another 50%.  This was a much more efficient method than the two-step approach of converting PDF to PostScript and then back to PDF again.  The result of this compression processing is that the file size dropped to 21.8 Mb, which is somewhat larger than the 15 Mb re-fried file.

Now we have completed the PDF conversion process which started with Acrobat Capture 3.0 and have prepared the PDF for web delivery.

The DjVu file from the LizardTech web site is 2.87 Mb in INDIRECT format, and 3.01 Mb in BUNDLED format.

Let's take a look at this comparison now, "demystified".

DjVu   2.87 Mb   Click here and get ready to read

PDF   14.96 Mb   (Distiller) Click here and get ready to read

PDF   21.80 Mb   (CVision) Click here and get ready to read

Don't walk away for either format!

The DjVu file is about 5 times smaller than the re-fried PDF (14.96 Mb versus 2.87 Mb), not 51 times smaller (147 Mb versus 2.87 Mb)!
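The ratios behind this claim can be checked with a few lines of arithmetic.  The sketch below uses only the file sizes reported in this article; the labels and the helper function are mine, and the DjVu size is the 2.87 Mb INDIRECT figure.

```python
# File sizes in Mb, as reported in this article.
DJVU_MB = 2.87  # INDIRECT format; the BUNDLED file is 3.01 Mb

PDF_SIZES_MB = {
    "original (Acrobat Capture 3.0)": 147.0,
    "optimized (fast web view)": 110.0,
    "re-fried (Distiller)": 14.96,
    "recompressed (CVision)": 21.80,
}

def size_ratio(pdf_mb: float, djvu_mb: float = DJVU_MB) -> float:
    """How many times larger the PDF is than the DjVu file."""
    return pdf_mb / djvu_mb

for label, size in PDF_SIZES_MB.items():
    print(f"{label:32s} {size:7.2f} Mb  {size_ratio(size):5.1f}x DjVu")
```

The original PDF comes out at about 51 times the DjVu size, the re-fried PDF at about 5.2 times, and the CVision PDF at about 7.6 times.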

This reduction in file size, together with DjVu's superior display speed, is still reason enough to seriously consider DjVu as a better alternative to PDF for color documents.  If not DjVu, then seriously consider PDF compression technology from CVision, or the "re-fry" method of file size reduction.

