Prepress PDF to DjVu for Newspaper Web Delivery
by James Rile, PlanetDjVu, January 26, 2002

Newspaper publishers are increasingly using PDF for composition.  Gone are the days of "whiteboarding" and "paste-up".  PDF files are a complete digital representation of the paper publication. Once the paper is printed, an opportunity exists to "re-purpose" the PDF files for digital publication on the web.

An Opportunity and a Problem
While PDF paves the way for web publishing, the PDF files used for print production are simply too large for effective web publishing, given current bandwidth constraints for web delivery.  The graphics in these files must be high-quality for printing, and this is what makes the files too heavy for web delivery.

Addressing the "Graphics Weight" Problem
PDF files can be re-processed using PostScript and Acrobat Distiller to reduce the graphics size, a process sometimes referred to as "re-frying" the PDF. Three methods can be used:

The graphics can be downsampled - 300 dpi images can be reduced to 72 dpi.
The graphics can be heavily compressed using PDF's JPEG compression.
The entire page size can be reduced from the tabloid size used for printing.
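
To give a feel for how much the first and third methods shrink the raw image data, here is a minimal Python sketch of the pixel-count arithmetic. It assumes a full-page image, uses the tabloid dimensions from this article's chart, and assumes US letter (8.5 x 11 inches) as the reduced page size. The helper names are hypothetical. Pixel count is only a rough proxy: actual file size also depends on the compression applied.

```python
# Rough pixel-count arithmetic for downsampling and page-size reduction.
# Assumption: one image covers the whole page; real pages mix text and images.
TABLOID_IN = (10.75, 16.5)  # tabloid page size cited in this article's chart
LETTER_IN = (8.5, 11.0)     # US letter, assumed target for page reduction


def pixel_count(width_in, height_in, dpi):
    """Pixels needed to cover a page of the given size at the given resolution."""
    return round(width_in * dpi) * round(height_in * dpi)


def fit_scale(src, dst):
    """Uniform scale factor that fits page `src` inside page `dst`."""
    return min(dst[0] / src[0], dst[1] / src[1])


full = pixel_count(*TABLOID_IN, 300)   # print-quality tabloid page
down = pixel_count(*TABLOID_IN, 72)    # same page downsampled to 72 dpi
scale = fit_scale(TABLOID_IN, LETTER_IN)
small = pixel_count(TABLOID_IN[0] * scale, TABLOID_IN[1] * scale, 72)

print(f"300 dpi tabloid: {full:,} pixels")
print(f"72 dpi tabloid:  {down:,} pixels ({down / full:.1%} of original)")
print(f"scale to letter: {scale:.3f}")
print(f"72 dpi letter:   {small:,} pixels ({small / full:.1%} of original)")
```

Downsampling alone cuts the pixel count by a factor of (72/300) squared, and shrinking the page multiplies in a further factor of the scale squared.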

Web-optimized PDFs are STILL too large...
Unfortunately for PDF, even after the above size-reduction methods are applied, the PDF files are still too large for effective web delivery, and the graphics quality becomes too degraded.

Conversion of PDF to DjVu "goes the final distance"
PDF files can be digitally converted to DjVu files, and the DjVu format can then bring the file size down to a "web-deliverable" level.  As the chart below demonstrates, a combination of the DjVu format and overall size reduction to letter-size will produce files that are small enough for web delivery.  A side benefit of letter-size reduction is that the pages will print easily on a home printer.

Sample newspapers - tabloid size (10.75 x 16.5 inches), printed in color:
  Didsbury - 20 pages
  Provost - 31 pages
  Red Deer - 20 pages
  Wainwright - 20 pages

Format                                                                     Avg. Page Size
Prepress PDF with 300 dpi graphics                                         387 KB
PDF with 72 dpi graphics using max. compression, reduced to letter-size    262 KB
DjVu - tabloid size using scan-300 segmentation                            110 KB
DjVu - letter size using scan-300 segmentation                             74 KB
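
The relative savings in the chart above can be worked out with a few lines of Python. This is a sketch only: the average page sizes are the chart's figures, the dictionary labels are shortened for illustration, and the per-issue totals are an estimate that applies the four-paper average to a 20-page issue (the page count of three of the four samples).

```python
# Average page sizes in KB, taken from the chart above (labels shortened).
AVG_PAGE_KB = {
    "Prepress PDF, 300 dpi": 387,
    "PDF, 72 dpi, max. compression, letter-size": 262,
    "DjVu, tabloid size": 110,
    "DjVu, letter size": 74,
}

PAGES = 20  # assumed issue length; three of the four sample issues have 20 pages


def relative_sizes(sizes, baseline):
    """Express every size as a multiple of the baseline format's size."""
    base = sizes[baseline]
    return {name: kb / base for name, kb in sizes.items()}


ratios = relative_sizes(AVG_PAGE_KB, "DjVu, letter size")
issue_kb = {name: kb * PAGES for name, kb in AVG_PAGE_KB.items()}

for name, kb in AVG_PAGE_KB.items():
    print(f"{name:44s} {kb:4d} KB/page  {issue_kb[name]:5d} KB/issue  "
          f"{ratios[name]:.1f}x")
```

On these figures the prepress PDF averages a little over five times the size of the letter-size DjVu, and the web-optimized PDF about three and a half times.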

Bonus: Scanned Color Pages are no bigger than Prepress Pages in DjVu
If you have a newspaper issue that is NOT in prepress PDF format, then you must scan the pages of the issue in color in order to produce a digital copy.  Rendering the scans as color "image + text" PDF will produce an enormous file.  HOWEVER (here's the good news), rendering them as color "image + text" DjVu will produce a file NO LARGER than the DjVu file produced from prepress PDF.  This means that in a newspaper archive, you can seamlessly merge modern digital issues produced from prepress PDF with legacy digital issues produced from scans.

