Autorotating Pages with JRAPublish 2.0
a report by PlanetDjVu, February 17, 2004
Introduction
JRAPublish 2.0 features a new autorotate option that will rotate pages so that the text lines are horizontal when displayed. This is very useful when your document contains landscape pages that were scanned in portrait orientation. It is often desirable to rotate these pages, after scanning, back to the landscape orientation so that they can be easily read on-line.
Why are landscape pages typically scanned in portrait orientation? The answer is simple. When scanning with a cut-sheet document scanner, the pages are stacked in the scanner's input tray, and every page in the stack must have the same orientation.
How Autorotation Works
Autorotation is available only when OCR is performed, because the decision to rotate a page is based on the page layout analysis that is part of the OCR process.
For pages that contain both horizontal and vertical text, the page will be rotated to the orientation that makes the most text on the page display horizontally.
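The "most text wins" rule described above can be illustrated with a short Python sketch. This is not JRAPublish's actual algorithm, which is not documented here. The function name choose_rotation, the (angle, char_count) input format, and the use of character counts as the measure of "most text" are all assumptions made for illustration.

```python
from collections import Counter


def choose_rotation(text_lines):
    """Pick the rotation (0, 90, 180 or 270 degrees) that makes most text horizontal.

    text_lines is a hypothetical layout-analysis result: an iterable of
    (angle, char_count) pairs, where angle is the rotation in degrees that
    would make that text line read horizontally.
    """
    weight = Counter()
    for angle, char_count in text_lines:
        weight[angle % 360] += char_count

    if not weight:
        return 0  # no text detected: leave the page as scanned

    # Highest total character count wins; on a tie, prefer no rotation (0).
    return max(weight, key=lambda a: (weight[a], a == 0))


# A drawing with a short horizontal title and longer vertical annotations.
page = [(0, 120), (90, 400), (90, 50)]
print(choose_rotation(page))
```

In this example the vertical annotations carry more characters than the horizontal title, so the sketch selects a 90-degree rotation. A page where the two directions are evenly balanced is exactly the kind of case where the chosen rotation may not be the one you want.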
Rotated landscape pages are great for screen display, but what about for printing?
When printing a PDF file that contains rotated pages, the rotated pages will print out according to the paper orientation specified. In other words, a landscape page that displays as landscape on the screen will print in portrait orientation, assuming that portrait is the orientation specified in the print dialog box.
When printing a DjVu file that contains rotated pages, the rotated pages will print out according to the saved orientation. If the default "shrink to fit" option is selected, then the landscape page will be shrunk to fit a portrait piece of paper. "Shrink to fit" prevents the page image from being cut off, but this approach is not as intelligent as the Acrobat print driver.
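To give a feel for how much a landscape page shrinks on portrait paper, here is a small Python sketch of the arithmetic. It assumes that "shrink to fit" scales the page uniformly and never enlarges it. The function name shrink_to_fit_scale and the letter-size dimensions are illustrative assumptions, not taken from the DjVu viewer.

```python
def shrink_to_fit_scale(page_w, page_h, paper_w, paper_h):
    """Return the uniform scale factor that fits a page onto the paper.

    Assumes the page is never enlarged, so the result is at most 1.0.
    All dimensions must use the same unit (inches here).
    """
    if min(page_w, page_h, paper_w, paper_h) <= 0:
        raise ValueError("dimensions must be positive")
    return min(paper_w / page_w, paper_h / page_h, 1.0)


# An 11 x 8.5 inch landscape page printed on 8.5 x 11 inch portrait paper.
scale = shrink_to_fit_scale(11.0, 8.5, 8.5, 11.0)
print(f"{scale:.3f}")  # prints 0.773
```

Under these assumptions the landscape page prints at roughly 77% of its original size, which is why small text on a drawing can become hard to read on paper.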
Does Autorotate always work correctly?
Most of the time, autorotation will work correctly. However, on pages that contain both horizontal and vertical text lines, the rotation that is chosen is sometimes not the one you want. After performing autorotation, it is recommended that you visually inspect the rotated pages for accuracy.
You can use the JRAPublish log file for checking, as it provides a list of the pages that were autorotated.
How can incorrect rotations be fixed?
Simply open up the DjVu file in DjVu Editor, or the PDF file in Adobe Acrobat, and rotate the pages manually to the desired orientation. Both editors have page rotation features.
I see rotation buttons in the DjVu Plugin and in Acrobat Reader. What are these for?
The rotation buttons in these viewers will rotate pages for display only. The rotation is not saved into the document. This is useful for pages with both horizontal and vertical text. Read the horizontal text, then rotate the page and read the vertical text (saves neck strain!).
Autorotation Case Study: Brantford Engineering Specification
We used a 438-page engineering specification document for this autorotation test, because it contains many pages that are mechanical drawings with both horizontal and vertical text. It was desirable to autorotate many of the mechanical drawings, but not all of them.
JRAPublish autorotated 22 pages, and we then opened the DjVu file in DjVu Editor to make 2 corrections. It was easy to find the rotated pages in the document by reviewing the thumbnails for the document pages. This was actually easier and faster than referring to the log file list of autorotated pages.
This engineering document was also deskewed by JRAPublish 2.0.
Deskew and Autorotation are both new features of JRAPublish 2.0. See the comparison of results by opening the two DjVu versions below: