Treatise on Physiological Optics

A Case Study reported by PlanetDjVu, November 27, 2002

We are referencing here a very interesting conversion of this three-volume treatise by the University of Pennsylvania, located at:

The treatise is presented in three digital versions:

1. As searchable DjVu (color image with hidden text)
2. As PDF with OCR text replacement (suspects left as bitmap)
3. As PDF image-only in color.

The size difference of these three digital versions is remarkable:

Searchable DjVu: 35 MB
PDF with OCR Text Replacement: 1,005 MB
PDF Image-Only: 998 MB

The publisher apologises that the text in the DjVu cannot be copied.  This is no longer true.  For the past year, the DjVu web browser plugin has supported the copying of text.

The publisher also reports that the DjVu is presented "in pieces".  What is actually meant is that the DjVu is presented in INDIRECT format.  The INDIRECT format stores the volume as many single-page files, but it can be downloaded in either INDIRECT or BUNDLED (one file) formats, so it is not really "in pieces".
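For readers who want to switch between the two layouts themselves, the sketch below builds the command lines for the djvmcvt utility from the open-source DjVuLibre package. This is an assumption on our part: the publisher used DjVu Workgroup, not DjVuLibre, and the file and directory names shown are placeholders rather than the actual names used for this treatise.

```python
import subprocess

def bundle_command(index_djvu, bundled_djvu):
    """Argument list that merges an INDIRECT document into one BUNDLED file."""
    return ["djvmcvt", "-b", index_djvu, bundled_djvu]

def indirect_command(bundled_djvu, out_dir, index_name):
    """Argument list that splits a BUNDLED file into page files plus an index."""
    return ["djvmcvt", "-i", bundled_djvu, out_dir, index_name]

def convert(command):
    """Run a conversion; requires DjVuLibre's djvmcvt on the PATH (assumed)."""
    subprocess.run(command, check=True)

if __name__ == "__main__":
    # Placeholder names, for illustration only.
    print(" ".join(bundle_command("volume1/index.djvu", "volume1.djvu")))
    print(" ".join(indirect_command("volume1.djvu", "volume1_pages", "index.djvu")))
```

The commands are only printed here; pass one of the lists to convert() to run it against real files.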

The DjVu files were created and OCRed with the DjVu Workgroup 3.0 product from LizardTech, which was available from LizardTech (and from PlanetDjVu) until June of this year. The PDF versions were created with Adobe Acrobat Capture, which is still available commercially as version 3.0.4.

As you can see, the PDF with OCR Text Replacement is unsuitable as a digital format because it distorts the page and leaves artifacts on it.  This format only works effectively with bitonal page images, not color page images.

The PDF Image-Only version has color images that look fine, but the lack of searchable text is a disadvantage (along with the enormous size).

This leaves DjVu as the only viable digital format, at roughly 3.5% of the size of either PDF!  DjVu is the only practical way to turn color page scans into a searchable digital book.  What is even more remarkable about the DjVu version is that it was created with 400 dpi color scans instead of the normal 300 dpi scans, resulting in higher-quality pages.  You are in for a treat when you open and look at this DjVu rendition.
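The size ratio can be checked directly from the three figures reported above. The short sketch below is only arithmetic on those published sizes; the variable and function names are our own.

```python
# Sizes in MB, as reported in the comparison above.
sizes_mb = {
    "Searchable DjVu": 35,
    "PDF with OCR Text Replacement": 1005,
    "PDF Image-Only": 998,
}

def percent_of(small, large):
    """Return `small` expressed as a percentage of `large`."""
    return 100.0 * small / large

djvu_size = sizes_mb["Searchable DjVu"]
for name, size in sizes_mb.items():
    if name != "Searchable DjVu":
        print(f"DjVu is {percent_of(djvu_size, size):.1f}% of the size of {name}")
```

Both ratios come out at about 3.5%, so the DjVu is roughly one twenty-ninth the size of either PDF.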

If you are interested in making your own searchable DjVu books from color scans, please contact us regarding the JRAPublish product, which excels at this conversion task and may be released under license in the future. Alternatively, you can contact LizardTech concerning their Document Express for DjVu product line. You can also encode trial DjVu pages for free at the Any2DjVu online conversion server.

OCRed Text Comparison

We have re-OCRed the three volumes of this treatise with JRAPublish, which uses the ABBYY FineReader OCR engine.  The original OCR layer, created with DjVu Workgroup, used the Expervision OCR engine.  The new OCR layer created with JRAPublish is much more accurate, particularly in the recognition of characters containing umlauts. Recognition using JRAPublish was also more accurate because the OCR was performed for both English and German, while recognition with DjVu Workgroup was performed for English only. Take a look at the following excerpts and you will see the improvement:


JRAPublish Excerpt from Extracted Text:

4, 5.] §2. Sclerotica and Cornea

The following give good descriptions of the structure of the human eye:

TH. SÖMMERING, Abbildungen des menschlichen Auges. Frankfurt a. M. 1801.In Latin also.
C. F. TH. KRAUSE, Handbuch der menschlichen Anatomie. Hannover 1842. Bd. I, T. II.
S. 511551.Contains also earlier literature on the anatomy of the eye. S. 733-745.
E. BRÜCKE, Anatomische Beschreibung des menschlichen Augapfels. Berlin 1847.
W. BOWMAN, Lectures on the parts concerned in the operations on the eye and on the structure of
the retina and the vitreous humour. London 1849.
A. KÖLLIKEH, Mikroskopische Anatomie oder Gewebelehre des Menschen. Leipzig 1854.
Bd II, S. 605.More recent literature also, S. 734-736.
DUJARDIN, Remarques sur certaines dispositions de Pappareil de la vision chez les insectes.
C. R. XLII, 941. Inst. 1856, 194.

DjVu Workgroup Excerpt from Extracted Text:

4, 5.] §2. Sclerolica and Cornea 5
The following give good descriptions of the structure of the human eye:
Tn. S6MERI6, Abbildungen des menschlichen Auges. Frankfurt a. M. 1801.--In Latin also.
C. F. Tn. KRAUSE, Handbuch der rnenscblicben Analomie. Hannover 1842. Bd. I, T. II.
S. 511--551.--Contains also earlier literature on the anatomy of the eye. S. 733-745.
E. Btt0CKE, Analomiscbe Beschreibung des menschlichen Augapfds. Berlin 1847.
W. BOWMAn, Leclures on lhe paris concerned in the operalions on lbe eye and on lhe slruclure of
the relina and the vitreous umour. London 1849.
A. K6LIE, ,likroskopisclie Analomie oder Gewebelehre des Menschen. Leipzig 1854.
Bd II, S. 605.--:More recent literature also, S. 734-736.
DVJARD, Remarques sur certaines dispositions de l'appareil de la vision chez les insectes.
C. R. XLII, 941. Inst. 1856, 194.
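One rough way to put a number on the difference between the two text layers is to compare corresponding lines with Python's standard difflib module. The sketch below uses three line pairs copied from the excerpts above. Note that it measures how much the two OCR outputs disagree with each other, not the accuracy of either against a proofread reference, which we do not have here.

```python
import difflib

# Corresponding lines copied from the two excerpts above.
jrapublish_lines = [
    "E. BRÜCKE, Anatomische Beschreibung des menschlichen Augapfels. Berlin 1847.",
    "the retina and the vitreous humour. London 1849.",
    "C. R. XLII, 941. Inst. 1856, 194.",
]
workgroup_lines = [
    "E. Btt0CKE, Analomiscbe Beschreibung des menschlichen Augapfds. Berlin 1847.",
    "the relina and the vitreous umour. London 1849.",
    "C. R. XLII, 941. Inst. 1856, 194.",
]

def similarity(a, b):
    """Similarity in [0, 1]; 1.0 means the two strings are identical."""
    return difflib.SequenceMatcher(None, a, b).ratio()

for jra, wg in zip(jrapublish_lines, workgroup_lines):
    print(f"{similarity(jra, wg):.3f}  {wg}")
```

Lines that both engines read identically score 1.0, while lines containing umlauts or German words score lower, which matches the pattern visible in the excerpts.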










