The Virtual DjVu Printer (VPD) is designed to print any document to the DjVu format.
The internal DjVu encoder on which VPD is built has some unique features that are not found in any other DjVu encoder. In particular, VPD uses a distinctive segmentation scheme when encoding to DjVu.
It has been observed that VPD encodes low-color, poster-like images (Color Line Art) to DjVu much better than documenttodjvu does. Minor's method, in particular, relies on this feature of VPD.
It turned out that, when encoding by Minor's method, VPD works exactly like the console utility cpaldjvu: it finds the dominant solid color and sends it to the background. This feature of VPD is undocumented; Minor was the first to discover it, by experiment.
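The "dominant color goes to the background" idea can be illustrated with a minimal Python sketch. It is not the code of VPD or cpaldjvu; the function name split_dominant_color and the flat list-of-RGB-tuples image representation are assumptions made only for this illustration.

```python
from collections import Counter

def split_dominant_color(pixels):
    """Return (dominant_color, foreground_mask) for a flat list of RGB tuples.

    The most frequent color is treated as the background; every other
    pixel is marked as foreground (True in the mask).
    """
    if not pixels:
        raise ValueError("image has no pixels")
    dominant, _count = Counter(pixels).most_common(1)[0]
    mask = [p != dominant for p in pixels]
    return dominant, mask

# A tiny 2x3 "poster": white paper with two red pixels and one blue pixel.
WHITE, RED, BLUE = (255, 255, 255), (255, 0, 0), (0, 0, 255)
image = [WHITE, RED, WHITE,
         WHITE, BLUE, RED]
background, mask = split_dominant_color(image)
```

Here white occurs most often, so it becomes the background color, and the red and blue pixels end up in the foreground mask.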
However, VPD is rather inconvenient to use as an ordinary DjVu encoder: it has to be awkwardly configured for every print job, it must be installed, it is a commercial product, and so on.
For this reason, it would be desirable to build an independent DjVu encoder that works the same way as the DjVu encoder inside VPD.
In this short article I have collected the most interesting passages from the VPD user manual (in English). Let us read them carefully and try to draw some conclusions about how the segmentation algorithm of the VPD DjVu encoder works (see the end of the article).
The Resolution (DPI) option in the Virtual Print Driver is found on the General tab of the Advanced Properties dialog. It allows you to adjust the resolution (in dots per inch) of an output DjVu document. Acceptable values are 50-4800. The default value is 300 dpi, which is suitable for most color documents. Higher resolutions result in increased DjVu file sizes; for many applications lower resolution produces satisfactory results, and increases encoding speed while decreasing resulting file sizes.
1. From the File menu of the printing application, select Print.
2. After selecting LizardTech Virtual Printer in the Printer Name field, click OK. The Virtual Printer dialog appears.
3. Click the Advanced… button on the right, then select the General tab.
4. Enter a resolution value from 50 to 4800, then click OK.
5. Click Encode to create the DjVu document. The value you chose for this option will persist for the virtual printing of further documents until you change it again.
NOTE: For more advanced users, the Resolution option can work in conjunction with the Background Subsampling Factor option in the Advanced Parameters dialog. For more information, see Subsampling the Background. However, any user can simply increase or decrease the resolution DPI if the text of output files is not sufficiently clear.
The Maximum Number of Colors parameter is available on the Foreground tab of the
Advanced Properties dialog. It enables you to set a limit on the number of colors that the
Virtual Print Driver uses to encode foreground objects.
The foreground layer of a DjVu document can contain up to 4,000 colors. Some electronic
source documents, however, contain fewer foreground colors (for example, word-processed
documents that are primarily bitonal). Because the number of foreground colors directly
increases the size of the output DjVu file, you should ensure that only the required
number of colors is used to encode a source document.
You can specify values from 1 to 4,000. The default value is 256, which is suitable for
most color documents. A smaller value results in a decreased DjVu file size.
1. From the File menu of the printing application, select Print.
2. After selecting LizardTech Virtual Printer in the Printer Name field, click OK. The Virtual Printer dialog appears.
3. Click the Advanced… button on the right, then select the Foreground tab.
4. Enter a maximum number of colors, then click OK.
5. Click Encode to create the DjVu document. The value you chose for this option will persist for the virtual printing of further documents until you change it again.
For more information about the foreground layer of a DjVu document, see Layer Types.
When you save an electronic document in DjVu format, the Virtual Print Driver places the objects in the file in either the background or foreground layer of the output DjVu document. The table below describes each of these layers.
Layer | Description |
Background | Contains color or grayscale photographs, pictures, color backgrounds, and other continuous-tone images. Because readability and contrast are less emphasized, these components are typically encoded at one-third the resolution (approximately 100 dpi). |
Foreground | Contains text and line drawings, which have sharp edges and uniform colors. To maintain their high-contrast appearance and readability, these elements are typically encoded at full resolution (approximately 300 dpi). |
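The way the two layers combine into one page can be sketched as follows. DjVu uses a bitonal mask to choose, pixel by pixel, between the foreground and the background; the sketch below assumes a full-resolution foreground and a background stored at one-third resolution, as in the table. The function name compose_page and the nested-list image representation are hypothetical and serve only as an illustration.

```python
def compose_page(mask, foreground, background, factor=3):
    """Combine layers: where the mask is set, take the foreground pixel,
    otherwise take the pixel of the low-resolution background that
    covers this position (nearest-neighbour upsampling by `factor`)."""
    height, width = len(mask), len(mask[0])
    page = []
    for y in range(height):
        row = []
        for x in range(width):
            if mask[y][x]:
                row.append(foreground[y][x])
            else:
                row.append(background[y // factor][x // factor])
        page.append(row)
    return page

# 3x3 page: a vertical black stroke ("K") on a white ("w") background.
mask = [[0, 1, 0],
        [0, 1, 0],
        [0, 1, 0]]
foreground = [["K"] * 3 for _ in range(3)]
background = [["w"]]          # a single pixel covers the whole 3x3 page
page = compose_page(mask, foreground, background)
```

Real decoders interpolate the background more smoothly; nearest-neighbour lookup is used here only to keep the sketch short.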
The Background Quality option is available on the Background tab of the Advanced Properties dialog. It controls the amount of blurring that appears in the background of a DjVu document. A higher background quality decreases blurring (especially in documents that contain photographs) and produces larger DjVu files. A lower background quality allows some blurring and produces smaller DjVu files. Use the slider bar to increase or decrease the quality accordingly.
Use the following guidelines when choosing a background quality:
If your primary purpose is to archive a copy of a document, adjust the slider bar to a
very high value (for instance, 95) to produce nearly lossless results.
If your primary purpose is to publish a document on the Web, the default value of 75 is
generally effective.
Values lower than 75 should be used only when storage requirements are more important than
visual appearance, or when the background of the source document does not contain varying
colors.
1. From the File menu of the printing application, select Print.
2. After selecting LizardTech Virtual Printer in the Printer Name field, click OK. The Virtual Printer dialog appears.
3. Click the Advanced… button on the right, then select the Background tab.
4. Use the slider to select a level of quality, then click OK.
5. Click Encode to create the DjVu document. The value you chose for this option will persist for the virtual printing of further documents until you change it again.
Subsampling involves reducing the number of pixels in the background layer of a DjVu
document in order to decrease its file size. Because background objects are generally
low-contrast and smooth, you can subsample this layer without significantly impairing
their appearance.
The Subsampling Factor option is available on the Background tab of the Advanced
Properties dialog. It allows you to subsample the background of a DjVu document by a
factor of 1 to 10. For example, you can subsample the background of a 300 dpi DjVu
document to 100 dpi by specifying a factor of 3 (300 dpi / 100 dpi = 3). In doing so, you
decrease the file size of the DjVu document and ensure that any text that is not placed in
the foreground is still readable.
NOTE: Subsampling is based on the value you enter for the Resolution option. For more information about this option, see Specifying Resolution.
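The arithmetic behind the subsampling factor can be written out as a short Python sketch. The value ranges (50 to 4800 dpi, factor 1 to 10) come from the manual text; the function names and the rounding-up of fractional pixel counts are assumptions of this illustration, not documented VPD behaviour.

```python
def background_dpi(resolution_dpi, subsampling_factor):
    """Effective background resolution for a given document resolution."""
    if not 50 <= resolution_dpi <= 4800:
        raise ValueError("resolution must be between 50 and 4800 dpi")
    if not 1 <= subsampling_factor <= 10:
        raise ValueError("subsampling factor must be between 1 and 10")
    return resolution_dpi / subsampling_factor

def subsampled_size(width, height, factor):
    """Background size in pixels, rounding up (assumed) so that the
    subsampled image still covers the whole page."""
    return (-(-width // factor), -(-height // factor))

dpi = background_dpi(300, 3)              # 300 dpi / 3 = 100 dpi
size = subsampled_size(2550, 3300, 3)     # a Letter page scanned at 300 dpi
```

With a factor of 3 the background of a 2550 x 3300 page shrinks to 850 x 1100 pixels, that is, nine times fewer pixels to encode.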
Because background objects tend to appear blurry below 100 dpi, values 1 and 2 should be
used only in the following instances:
1. From the File menu of the printing application, select Print.
2. After selecting LizardTech Virtual Printer in the Printer Name field, click OK. The Virtual Printer dialog appears.
3. Click the Advanced… button on the right, then select the Background tab.
4. Select a subsampling factor value from 1 to 10, then click OK.
5. Click Encode to create the DjVu document. The value you chose for this option will persist for the virtual printing of further documents until you change it again.
For more information about the background layer of a DjVu document, see Layer Types.
The foreground layer of a DjVu document can contain up to 4,000 colors. Because
foreground objects are encoded at a high resolution to maintain their high-contrast
appearance, objects with several colors increase the file size of the output DjVu
document. The Virtual Print Driver enables you to produce smaller DjVu files by sending
objects with several colors to the background layer, which is encoded at a lower
resolution.
The Send Objects with More Than N Colors option is available on the
Separation tab of the Advanced Properties dialog. It indicates a maximum number of colors
that the Virtual Print Driver can use to encode a foreground object. If an object requires
more colors than this value, it is automatically placed in the background. You can specify
values 1 to 4,000. The default value is 256, which is suitable for most color documents.
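The rule described above (an object that needs more than N colors goes to the background) can be sketched in Python as follows. How VPD actually delimits "objects" in the print stream is not documented here, so the dictionary of named objects with their pixel colors, and the function name separate_by_color_count, are purely hypothetical.

```python
def separate_by_color_count(objects, max_colors=256):
    """Split objects into (foreground, background) name lists.

    `objects` maps an object name to an iterable of its pixel colors.
    An object with more than `max_colors` distinct colors is sent to
    the background; everything else stays in the foreground.
    """
    if not 1 <= max_colors <= 4000:
        raise ValueError("max_colors must be between 1 and 4000")
    foreground, background = [], []
    for name, pixels in objects.items():
        n_colors = len(set(pixels))
        if n_colors > max_colors:
            background.append(name)
        else:
            foreground.append(name)
    return foreground, background

BLACK, RED, BLUE = (0, 0, 0), (255, 0, 0), (0, 0, 255)
objects = {
    "caption": [BLACK, BLACK, BLACK],                 # 1 color
    "logo": [RED, BLUE, RED],                         # 2 colors
    "photo": [(g, g, g) for g in range(0, 250, 50)],  # 5 grey levels
}
fg, bg = separate_by_color_count(objects, max_colors=2)
```

With the limit set to 2, the caption and the logo stay in the foreground and only the five-level "photo" is moved to the background.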
1. From the File menu of the printing application, select Print.
2. After selecting LizardTech Virtual Printer in the Printer Name field, click OK. The Virtual Printer dialog appears.
3. Click the Advanced… button on the right, then select the Separation tab.
4. Select a number between 1 and 4,000, then click OK.
5. Click Encode to create the DjVu document. The value you chose for this option will persist for the virtual printing of further documents until you change it again.
For more information, see Defining a Foreground/Background Separation Threshold.
The Virtual Print Driver separates a color document by sending small, uniform objects
to the foreground, and large, multicolored objects to the background. Some objects,
however, cannot be classified this easily. The Virtual Print Driver must then use a
separation threshold to determine where to place these "ambiguous" objects.
The threshold is based on each object's shape, color, and overlap with other objects.
The Separation Threshold option is available on the Separation tab of the Advanced
Properties dialog. It enables you to control which objects are placed in the foreground
and the background of an output DjVu document. Using the slider bar, you can specify a
whole number value between 0 and 100. A value of 0 indicates that every ambiguous shape
should be placed in the foreground. A value of 100 indicates that every ambiguous shape
should be placed in the background. The default value is 25, which is suitable for most
color documents.
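The manual fixes only the two endpoints of the slider: 0 sends every ambiguous shape to the foreground, 100 sends every one to the background. One simple rule consistent with those endpoints is sketched below in Python. The per-object score (a number from 0 up to, but not including, 100 that would somehow summarize shape, color, and overlap) is entirely hypothetical, as is the function name; the real VPD criterion is unknown.

```python
def place_ambiguous(scores, threshold=25):
    """Decide the layer of each ambiguous object.

    `scores` maps an object name to a hypothetical score in [0, 100).
    An object goes to the background when its score is below the
    threshold, so threshold 0 keeps everything in the foreground and
    threshold 100 sends everything to the background.
    """
    if not 0 <= threshold <= 100:
        raise ValueError("threshold must be between 0 and 100")
    placement = {}
    for name, score in scores.items():
        if not 0 <= score < 100:
            raise ValueError("score must be in [0, 100)")
        placement[name] = "background" if score < threshold else "foreground"
    return placement

scores = {"shape_a": 10, "shape_b": 60, "shape_c": 90}
default_placement = place_ambiguous(scores)   # threshold 25
```

At the default threshold of 25 only shape_a (score 10) moves to the background; raising the slider moves progressively more shapes there.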
Determining which ambiguous shapes appear better in the foreground and background layers
requires experimentation. If you are unsatisfied with the appearance of a DjVu document,
print it again using a separation threshold with different values to find the appropriate
balance between foreground and background objects.
1. From the File menu of the printing application, select Print.
2. After selecting LizardTech Virtual Printer in the Printer Name field, click OK. The Virtual Printer dialog appears.
3. Click the Advanced… button on the right, then select the Separation tab.
4. Use the slider bar to adjust the separation threshold, then click OK.
5. Click Encode to create the DjVu document. The value you chose for this option will persist for the virtual printing of further documents until you change it again.
Resolution (DPI): this property is analogous to the -d n option of the console utility csepdjvu.
Maximum Number of Colors: this property is not independent; it only sets the limit at which property 5, Send Objects to the Background, applies.
Background Quality: this property is analogous to the -q n,...,n (or -q n+...+n) option of the console utility csepdjvu.
Subsampling Factor: this property is analogous to reducing the size of the background subscan passed to the console utility csepdjvu.
Send Objects with More Than N Colors: this property is unique and has not been implemented anywhere else so far. Presumably it is meant for segmenting exclusively low-color, poster-like images (Color Line Art), and it somewhat resembles the console utility cpaldjvu, with the difference that not one solid color (the dominant one) but several are sent to the background (that is, objects whose number of solid colors exceeds the threshold). One could try to implement this property independently.
Separation Threshold: this property is presumably analogous to the --threshold-level parameter of documenttodjvu.
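The correspondence between the VPD resolution and background-quality settings and the csepdjvu options -d and -q can be sketched as a small Python helper that only builds the command line and does not run it. The helper name csepdjvu_args, the file names, and the slice values are illustrative assumptions; only the -d and -q options themselves are taken from the text above.

```python
import shlex

def csepdjvu_args(sep_file, djvu_file, dpi=None, slices=None):
    """Build an argument list for csepdjvu.

    dpi    -> "-d n"        (counterpart of the VPD Resolution option)
    slices -> "-q n,...,n"  (counterpart of the VPD Background Quality option)
    """
    args = ["csepdjvu"]
    if dpi is not None:
        args += ["-d", str(dpi)]
    if slices:
        args += ["-q", ",".join(str(s) for s in slices)]
    args += [sep_file, djvu_file]
    return args

# Illustrative values; "page.sep" and "page.djvu" are hypothetical file names.
args = csepdjvu_args("page.sep", "page.djvu", dpi=300, slices=[72, 83, 93, 103])
command_line = shlex.join(args)
```

The resulting list can be passed to subprocess.run once the separated input file has been prepared; background subsampling has no switch in this sketch because, as noted above, it corresponds to the size of the background subscan supplied in the input.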
1. The console utility cpaldjvu performs segmentation (when encoding to DjVu) similar to that of VPD. At least for implementing Minor's method, cpaldjvu can fully replace VPD. The cpaldjvu algorithm is also implemented in the console package vpd_enc v1.0 (753 KB), which is based on csepdjvu.
2. It is unclear exactly how options 5. Send Objects to the Background and 6. Define a Foreground/Background Separation Threshold work. More experiments are needed to find out how they affect the result.
3. One can try to implement the algorithm of option 5. Send Objects to the Background independently. There is a chance that it can be reproduced after all in some homemade demo program.
Author: monday2000.
November 5, 2008.
E-Mail (monday2000 [at] yandex.ru)