White Rohrer Thresholding binarization, taken from the gamera project, converts a grayscale (8-bit) raster image into a black-and-white (1-bit) one.
The White Rohrer Thresholding algorithm analyzes the image being processed and automatically computes a binarization threshold individually for each pixel (i.e. it is a local, or adaptive, binarization). The threshold found is then used for ordinary threshold binarization of the current pixel.
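To make the idea of a per-pixel threshold concrete, here is a minimal C++ sketch. It is not the White-Rohrer algorithm itself: a plain exponential running average along one row stands in for its table-driven update, and the names binarize_row_adaptive, alpha and bias are hypothetical, chosen only for this illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Simplified illustration of local (adaptive) thresholding on one image row.
// NOT the White-Rohrer algorithm: a plain exponential running average stands
// in for its table-driven update. All names here are hypothetical.
std::vector<std::uint8_t> binarize_row_adaptive(
    const std::vector<std::uint8_t>& row,
    double alpha,  // weight of the newest pixel in the running average, 0..1
    int bias)      // how much darker than the average a pixel must be to turn black
{
    std::vector<std::uint8_t> out(row.size(), 255);
    if (row.empty()) return out;

    double avg = row[0];  // running average of the gray values seen so far
    for (std::size_t x = 0; x < row.size(); ++x) {
        avg += alpha * (row[x] - avg);        // update the local average
        const double threshold = avg - bias;  // threshold for this pixel only
        out[x] = (row[x] < threshold) ? 0 : 255;
    }
    return out;
}
```

With a row such as 200, 200, 200, 50, 200, 200 and alpha = 0.25, bias = 20, only the dark pixel falls below its own threshold. The threshold follows the local brightness instead of being one fixed number for the whole image.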
I wrote a very simple console program to demonstrate White Rohrer Thresholding. It takes the following parameters as input:
white_rohrer_thres <input_file> <x_lookahead (int)> <y_lookahead (int)> <bias_mode (int)> <bias_factor (int)> <f_factor (int)> <g_factor (int)>
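x_lookahead - the number of lookahead pixels in the horizontal direction used for the running average. Default: 8.
y_lookahead - the number of lookahead lines used to further average the horizontal averages. Default: 1.
bias_mode - the bias mode. With 0, the bias offset is computed from the image itself (the standard deviation of the gray values minus 40); any other value is used directly as the offset. Default: 0.
bias_factor - a percentage scale applied to the biased threshold. Default: 100 (I used 80).
f_factor - a percentage scale applied to the f update of the horizontal running average. Default: 100 (I used 80).
g_factor - a percentage scale applied to the g update of the line-to-line average. Default: 100 (I used 80).
All parameters are read with atoi, so they are integers; any fractional part is discarded.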
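In the source code further down, the three factor parameters are applied as integer percentages: the biased threshold is multiplied by bias_factor and divided by 100, and the f and g updates are scaled the same way. The sketch below isolates that arithmetic; scale_percent and is_black are hypothetical helper names that do not appear in the program.

```cpp
// Sketch of how the percentage parameters are applied.
// scale_percent and is_black are hypothetical helper names.

// Scales a value by an integer percentage. Integer arithmetic is used,
// so the result is truncated toward zero.
inline int scale_percent(int value, int percent)
{
    return percent * value / 100;
}

// A pixel becomes black when it is darker than the scaled, biased threshold.
inline bool is_black(int pixel, int biased_threshold, int bias_factor)
{
    return pixel < scale_percent(biased_threshold, bias_factor);
}
```

For example, with a biased threshold of 150 a pixel of 130 is black at bias_factor = 100 (130 < 150) but white at bias_factor = 80 (130 is not below 120). Lowering f_factor or g_factor shrinks each update step, so the running averages react more slowly to changes in brightness.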
The program outputs the same image processed by this algorithm; the result is saved as filtered.tif.
The program works only with 8-bit grayscale images.
Everything needed to test this program (the project for compiling it, a prebuilt executable, a sample file, and bat files for testing the program) is bundled into a small package:
Download the white_rohrer_thres package (45 KB)
(The program requires the FreeImage DLL from the FreeImage DLL v3.9.2 package; see article 1, "Introduction to FreeImage".)
Let's look at the source code of this program:
/*
 *
 * Copyright (C) 2005 John Ashley Burgoyne and Ichiro Fujinaga
 *               2007 Uma Kompella and Christoph Dalitz
 *
 * This program is free software; you can redistribute it and/or modify it
 * under the terms of the GNU General Public License as published by the Free
 * Software Foundation; either version 2 of the License, or (at your option)
 * any later version.
 *
 * This program is distributed in the hope that it will be useful, but WITHOUT
 * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
 * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for
 * more details.
 *
 * You should have received a copy of the GNU General Public License along
 * with this program; if not, write to the Free Software Foundation, Inc., 59
 * Temple Place - Suite 330, Boston, MA 02111-1307, USA.
 */

////////////////////////////////////////////////////////////////////////////////

/*
 White Rohrer thresholding. This implementation uses code from
 the XITE library. According to its license, it may be freely included
 into Gamera (a GPL licensed software), provided the following
 notice is included into the code:

 Permission to use, copy, modify and distribute this software and its
 documentation for any purpose and without fee is hereby granted,
 provided that this copyright notice appear in all copies and that
 both that copyright notice and this permission notice appear in supporting
 documentation and that the name of B-lab, Department of Informatics or
 University of Oslo not be used in advertising or publicity pertaining
 to distribution of the software without specific, written prior permission.

 B-LAB DISCLAIMS ALL WARRANTIES WITH REGARD TO THIS SOFTWARE, INCLUDING ALL
 IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS, IN NO EVENT SHALL B-LAB
 BE LIABLE FOR ANY SPECIAL, INDIRECT OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
 WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN ACTION
 OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF OR IN
 CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.

 Important notice: this implementation only works with 8-bit greyscale
 images because the maximal value 255 for white is hard coded!!
*/

/*
 Creates a binary image using White and Rohrer's dynamic thresholding
 algorithm. It is the first of the two algorithms described in:

   J. M. White and G. D. Rohrer. 1983. Image thresholding for optical
   character recognition and other applications requiring character
   image extraction. *IBM J. Res. Dev.* 27(4), pp. 400-411

 The algorithm uses a 'running' average instead of true average of
 the gray values in the neighborhood. The lookahead parameter
 gives the number of lookahead pixels used in the biased running
 average that is used in deciding the threshold at each pixel
 location.

 *x_lookahead*
   the number of lookahead pixels in the horizontal direction for
   computing the running average. White and Rohrer suggest a value
   of 8 for a 240 dpi scanning resolution.

 *y_lookahead*
   number of lines used for further averaging from the horizontal
   averages.

 The other parameters are for calculating biased running average.
 Without bias the thresholding decision would be determined by
 noise fluctuations in uniform areas.

 This implementation uses code from XITE:

   http://www.ifi.uio.no/forskning/grupper/dsb/Software/Xite/

 Parameters:
   int "x lookahead", default=8
   int "y lookahead", default=1
   int "bias mode", default=0
   int "bias factor", default=100
   int "f factor", default=100
   int "g factor", default=100
*/

// This algorithm was taken from the gamera.sf.net sourcecodes
// and adapted for the FreeImage library
//
// Copyright (C) 2007-2008:
// monday2000 monday2000@yandex.ru

#include "FreeImage.h"
#include "Utilities.h"

////////////////////////////////////////////////////////////////////////////////

typedef struct
{
    int WR1_F_OFFSET;
    int WR1_G_OFFSET;
    int BIN_ERROR;
    int BIN_FOREGROUND;
    int BIN_BACKGROUND;
    int BIN_OK;
    int WR1_BIAS_CROSSOVER;
    int WR1_BLACK_BIAS;
    int WR1_WHITE_BIAS;
    int WR1_BIAS;
    double WR1_BLACK_BIAS_FACTOR;
    double WR1_WHITE_BIAS_FACTOR;
    int wr1_f_tab[512];
    int wr1_g_tab[512];
} PARAMS;

PARAMS wr1_params;

////////////////////////////////////////////////////////////////////////////////

inline void SetPixel(BYTE *bits, unsigned x, BYTE* value)
{
    // this function is simplified from FreeImage_SetPixelIndex
    *value ?
bits[x >> 3] |= (0x80 >> (x & 0x7)) : bits[x >> 3] &= (0xFF7F >> (x & 0x7)); } //////////////////////////////////////////////////////////////////////////////// void init_params(void) { PARAMS params = { /* WR1_F_OFFSET */ 255, /* WR1_G_OFFSET */ 255, /* BIN_ERROR */ -1, /* BIN_FOREGROUND */ 0, /* BIN_BACKGROUND */ 255, /* BIN_OK */ 0, /* WR1_BIAS_CROSSOVER */ 93, /* WR1_BLACK_BIAS */ -40, /* WR1_WHITE_BIAS */ 40, /* WR1_BIAS */ 20, /* WR1_BLACK_BIAS_FACTOR */ 0.0, /* WR1_WHITE_BIAS_FACTOR */ -0.25, /* wr1_f_tab */ { -62, -62, -61, -61, -60, -60, -59, -59, -58, -58, -57, -57, -56, -56, -54, -54, -53, -53, -52, -52, -51, -51, -50, -50, -49, -49, -48, -48, -47, -47, -46, -46, -45, -45, -44, -44, -43, -43, -42, -42, -41, -41, -41, -41, -40, -40, -39, -39, -38, -38, -37, -37, -36, -36, -36, -36, -35, -35, -34, -34, -33, -33, -33, -33, -32, -32, -31, -31, -31, -31, -30, -30, -29, -29, -29, -29, -28, -28, -27, -27, -27, -27, -26, -26, -25, -25, -25, -25, -24, -24, -24, -24, -23, -23, -23, -23, -22, -22, -22, -22, -21, -21, -21, -21, -20, -20, -20, -20, -19, -19, -19, -19, -18, -18, -18, -18, -17, -17, -17, -17, -16, -16, -16, -16, -16, -16, -15, -15, -15, -15, -14, -14, -14, -14, -14, -14, -13, -13, -13, -13, -13, -13, -12, -12, -12, -12, -12, -12, -11, -11, -11, -11, -11, -11, -10, -10, -10, -10, -10, -10, -9, -9, -9, -9, -9, -9, -8, -8, -8, -8, -8, -8, -8, -8, -7, -7, -7, -7, -7, -7, -7, -7, -6, -6, -6, -6, -6, -6, -6, -6, -5, -5, -5, -5, -5, -5, -5, -5, -4, -4, -3, -3, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, 0, 0, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 
4, 4, 4, 4, 4, 4, 4, 4, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 0 }, /* wr1_g_tab */ { -126, -126, -125, -125, -124, -124, -123, -123, -122, -122, -121, -121, -120, -120, -119, -119, -118, -118, -117, -117, -116, -116, -115, -115, -114, -114, -113, -113, -112, -112, -111, -111, -110, -110, -109, -109, -108, -108, -107, -107, -106, -106, -105, -105, -104, -104, -103, -103, -102, -102, -101, -101, -100, -100, -99, -99, -98, -98, -97, -97, -96, -96, -95, -95, -94, -94, -93, -93, -92, -92, -91, -91, -90, -90, -89, -89, -88, -88, -87, -87, -86, -86, -85, -85, -84, -84, -83, -83, -82, -82, -81, -81, -80, -80, -79, -79, -78, -78, -77, -77, -76, -76, -75, -75, -74, -74, -73, -73, -72, -72, -71, -71, -70, -70, -69, -69, -68, -68, -67, -67, -66, -66, -65, -65, -64, -64, -63, -63, -61, -61, -59, -59, -57, -57, -54, -54, -52, -52, -50, -50, -48, -48, -46, -46, -44, -44, -42, -42, -41, -41, -39, -39, -37, -37, -36, -36, -34, -34, -33, -33, -31, -31, -30, -30, -29, -29, -27, -27, -26, -26, -25, -25, -24, -24, -23, -23, -22, -22, -21, -21, -20, -20, -19, -19, -18, -18, -17, -17, -16, -16, -15, -15, -14, -14, -14, -14, -13, -13, -12, -12, -12, -12, -11, -11, -10, -10, -10, -10, -9, -9, -8, -8, -8, -8, -7, -7, -7, -7, -6, -6, -6, -6, -5, -5, -5, -5, -4, -4, -2, -2, -2, -2, -2, -2, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 4, 4, 5, 5, 5, 5, 6, 6, 6, 6, 7, 7, 7, 7, 8, 8, 8, 8, 9, 9, 10, 10, 10, 10, 11, 11, 12, 12, 12, 12, 13, 13, 14, 14, 14, 14, 15, 
15, 16, 16, 17, 17, 18, 18, 19, 19, 20, 20, 21, 21, 22, 22, 23, 23, 24, 24, 25, 25, 26, 26, 27, 27, 29, 29, 30, 30, 31, 31, 33, 33, 34, 34, 36, 36, 37, 37, 39, 39, 41, 41, 42, 42, 44, 44, 46, 46, 48, 48, 50, 50, 52, 52, 54, 54, 57, 57, 59, 59, 61, 61, 63, 63, 64, 64, 65, 65, 66, 66, 67, 67, 68, 68, 69, 69, 70, 70, 71, 71, 72, 72, 73, 73, 74, 74, 75, 75, 76, 76, 77, 77, 78, 78, 79, 79, 80, 80, 81, 81, 82, 82, 83, 83, 84, 84, 85, 85, 86, 86, 87, 87, 88, 88, 89, 89, 90, 90, 91, 91, 92, 92, 93, 93, 94, 94, 95, 95, 96, 96, 97, 97, 98, 98, 99, 99, 100, 100, 101, 101, 102, 102, 103, 103, 104, 104, 105, 105, 106, 106, 107, 107, 108, 108, 109, 109, 110, 110, 111, 111, 112, 112, 113, 113, 114, 114, 115, 115, 116, 116, 117, 117, 118, 118, 119, 119, 120, 120, 121, 121, 122, 122, 123, 123, 124, 124, 125, 125, 126, 126, 127, 127, 127, 0 } }; wr1_params = params; }; //////////////////////////////////////////////////////////////////////////////// inline int wr1_bias (int x, int offset) { int result; int bias; x = 256 - x; bias = -offset; if (x < wr1_params.WR1_BIAS_CROSSOVER) { result = x - bias - (int)(wr1_params.WR1_BLACK_BIAS_FACTOR*(wr1_params.WR1_BIAS_CROSSOVER-x)); } else if (x >= wr1_params.WR1_BIAS_CROSSOVER) { result = x + bias + (int)(wr1_params.WR1_WHITE_BIAS_FACTOR*(x-wr1_params.WR1_BIAS_CROSSOVER)); } else result = x; if (result < wr1_params.BIN_FOREGROUND) result = wr1_params.BIN_FOREGROUND; if (result > wr1_params.BIN_BACKGROUND) result = wr1_params.BIN_BACKGROUND; return 256 - result; } //////////////////////////////////////////////////////////////////////////////// inline int wr1_f (int diff, int *f) { f[0] = -wr1_params.wr1_f_tab[wr1_params.WR1_F_OFFSET - diff]; return wr1_params.BIN_OK; } //////////////////////////////////////////////////////////////////////////////// inline int wr1_g (int diff, int *g) { g[0] = -wr1_params.wr1_g_tab[wr1_params.WR1_G_OFFSET - diff]; return wr1_params.BIN_OK; } 
////////////////////////////////////////////////////////////////////////////////

/*
 * OneBit white_rohrer_threshold(GreyScale src,
 *                               int x_lookahead,
 *                               int y_lookahead,
 *                               int bias_mode,
 *                               int bias_factor,
 *                               int f_factor,
 *                               int g_factor);
 */
FIBITMAP* ProcessFilter(FIBITMAP* src_dib, int x_lookahead, int y_lookahead,
                        int bias_mode, int bias_factor, int f_factor, int g_factor)
{
    unsigned width = FreeImage_GetWidth(src_dib);
    unsigned height = FreeImage_GetHeight(src_dib);
    unsigned src_pitch = FreeImage_GetPitch(src_dib);

    if (width == 0 || height == 0)
        return NULL;

    FIBITMAP* dst_dib = FreeImage_Allocate(width, height, 1);
    if (!dst_dib)
        return NULL;

    // Build a monochrome palette
    RGBQUAD *pal = FreeImage_GetPalette(dst_dib);
    pal[0].rgbRed = pal[0].rgbGreen = pal[0].rgbBlue = 0;
    pal[1].rgbRed = pal[1].rgbGreen = pal[1].rgbBlue = 255;

    unsigned dst_pitch = FreeImage_GetPitch(dst_dib);

    BYTE* src_bits = (BYTE*)FreeImage_GetBits(src_dib); // The image raster
    BYTE* dst_bits = (BYTE*)FreeImage_GetBits(dst_dib); // The image raster
    BYTE* lines, *lined;

    // Signed copies of the dimensions avoid signed/unsigned comparisons below
    const int w = (int)width;
    const int h = (int)height;

    int x, y;
    BYTE val;

    init_params();

    int u;
    int prevY;
    int Y = 0;
    int f, g;
    int x_ahead, y_ahead;
    int t;
    int offset = wr1_params.WR1_BIAS;
    double mu = 0.0;
    double s_dev = 0.0;
    int *Z;
    int n;

    // Keep the lookahead values inside the image so that the priming loop
    // below never reads past the last line
    if (x_lookahead < 0) x_lookahead = 0;
    if (y_lookahead < 0) y_lookahead = 0;
    if (y_lookahead > h - 1) y_lookahead = h - 1;
    x_lookahead = x_lookahead % w;

    if (bias_mode == 0)
    {
        // Accumulate in double: a 32-bit unsigned sum of squares overflows
        // after roughly 66,000 white pixels (255 * 255 = 65025)
        double sum = 0.0;
        double sqr_sum = 0.0;

        for (y = 0; y < h; y++)
        {
            lines = src_bits + y * src_pitch;
            for (x = 0; x < w; x++)
            {
                sum += lines[x];
                sqr_sum += (double)lines[x] * lines[x];
            }
        }

        mu = sum / ((double)w * h); // image_mean

        double variance = sqr_sum / ((double)w * h) - mu * mu; // image_variance
        if (variance < 0.0) variance = 0.0; // guard against rounding error
        s_dev = sqrt(variance);

        offset = (int)(s_dev - 40);
    }
    else
        offset = bias_mode;

    Z = new int[2*w+1];

    for (n = 0; n < 2*w+1; ++n)
        Z[n] = 0;

    Z[0] = prevY = (int)mu;

    // Prime the averages with the first y_lookahead lines
    for (y = 0; y < 1+y_lookahead; y++)
    {
        lines = src_bits + y * src_pitch;

        if (y < y_lookahead)
            t = w;
        else
            t = x_lookahead;

        for (x = 0; x < t; x++)
        {
            u = lines[x];
            wr1_f (u-prevY, &f);
            Y = prevY + f;
            if (y == 1)
                Z[x] = (int)mu;
            else
            {
                wr1_g(Y-Z[x], &g);
                Z[x] = Z[x] + g;
            }
        }
    }

    x_ahead = 1 + x_lookahead;
    y_ahead = 1 + y_lookahead;

    for (y = 0; y < h; y++)
    {
        lines = src_bits + y * src_pitch;
        lined = dst_bits + y * dst_pitch;

        for (x = 0; x < w; x++)
        {
            if (lines[x] < (bias_factor * wr1_bias(Z[x_ahead],offset) / 100))
                val = 0;
            else
                val = 255;

            SetPixel(lined, x, &val);

            x_ahead++;
            if (x_ahead > w)
            {
                x_ahead = 1;
                y_ahead++;
            }

            if (y_ahead <= h)
            {
                // The original port read (src_bits+x_ahead)[y_ahead], i.e. byte
                // x_ahead + y_ahead of the first line. The lookahead pixel
                // (x_ahead, y_ahead) is presumably what was intended. Both
                // indices are clamped because x_ahead can reach w and y_ahead
                // can reach h.
                int xa = (x_ahead < w) ? x_ahead : w - 1;
                int ya = (y_ahead < h) ? y_ahead : h - 1;

                prevY = Y;
                wr1_f(src_bits[ya * src_pitch + xa] - prevY, &f);
                Y = prevY + f_factor * f / 100;
                wr1_g(Y-Z[x_ahead], &g);
                Z[x_ahead] = Z[x_ahead] + g_factor * g / 100;
            }
            else
                Z[x_ahead] = Z[x_ahead-1];
        }
    }

    delete [] Z;

    // Copying the DPI...
    FreeImage_SetDotsPerMeterX(dst_dib, FreeImage_GetDotsPerMeterX(src_dib));
    FreeImage_SetDotsPerMeterY(dst_dib, FreeImage_GetDotsPerMeterY(src_dib));

    return dst_dib;
}

////////////////////////////////////////////////////////////////////////////////

/**
FreeImage error handler
@param fif Format / Plugin responsible for the error
@param message Error message
*/
void FreeImageErrorHandler(FREE_IMAGE_FORMAT fif, const char *message)
{
    printf("\n*** ");
    if (fif != FIF_UNKNOWN)
        printf("%s Format\n", FreeImage_GetFormatFromFIF(fif));
    // Never pass the message itself as the format string
    printf("%s", message);
    printf(" ***\n");
}

////////////////////////////////////////////////////////////////////////////////

/** Generic image loader
@param lpszPathName Pointer to the full file name
@param flag Optional load flag constant
@return Returns the loaded dib if successful, returns NULL otherwise
*/
FIBITMAP* GenericLoader(const char* lpszPathName, int flag)
{
    FREE_IMAGE_FORMAT fif = FIF_UNKNOWN;

    // check the file signature and deduce its format
    // (the second argument is currently not used by FreeImage)
    fif = FreeImage_GetFileType(lpszPathName, 0);

    // NULL is returned when the format is unknown or cannot be read
    FIBITMAP* dib = NULL;

    if (fif == FIF_UNKNOWN)
    {
        // no signature ?
        // try to guess the file format from the file extension
        fif = FreeImage_GetFIFFromFilename(lpszPathName);
    }

    // check that the plugin has reading capabilities ...
    if ((fif != FIF_UNKNOWN) && FreeImage_FIFSupportsReading(fif))
    {
        // ok, let's load the file
        dib = FreeImage_Load(fif, lpszPathName, flag);

        // unless a bad file format, we are done !
        if (!dib)
        {
            printf("%s%s%s\n", "File \"", lpszPathName, "\" not found.");
            return NULL;
        }
    }

    return dib;
}

////////////////////////////////////////////////////////////////////////////////

int main(int argc, char *argv[])
{
    // call this ONLY when linking with FreeImage as a static library
#ifdef FREEIMAGE_LIB
    FreeImage_Initialise();
#endif // FREEIMAGE_LIB

    // initialize your own FreeImage error handler
    FreeImage_SetOutputMessage(FreeImageErrorHandler);

    if (argc != 8)
    {
        printf("Usage : white_rohrer_thres <input_file> <x_lookahead (int)> <y_lookahead (int)> <bias_mode (int)> <bias_factor (int)> <f_factor (int)> <g_factor (int)>\n");
        return 0;
    }

    FIBITMAP *dib = GenericLoader(argv[1], 0);

    int x_lookahead = atoi(argv[2]);
    int y_lookahead = atoi(argv[3]);
    int bias_mode = atoi(argv[4]);
    int bias_factor = atoi(argv[5]);
    int f_factor = atoi(argv[6]);
    int g_factor = atoi(argv[7]);

    if (dib)
    {
        // bitmap is successfully loaded!
        if (FreeImage_GetImageType(dib) == FIT_BITMAP)
        {
            if (FreeImage_GetBPP(dib) == 8)
            {
                FIBITMAP* dst_dib = ProcessFilter(dib, x_lookahead, y_lookahead,
                                                  bias_mode, bias_factor, f_factor, g_factor);

                if (dst_dib)
                {
                    // save the filtered bitmap
                    const char *output_filename = "filtered.tif";

                    // first, check the output format from the file name or file extension
                    FREE_IMAGE_FORMAT out_fif = FreeImage_GetFIFFromFilename(output_filename);

                    if (out_fif != FIF_UNKNOWN)
                    {
                        // then save the file
                        FreeImage_Save(out_fif, dst_dib, output_filename, 0);
                    }

                    // free the filtered FIBITMAP
                    FreeImage_Unload(dst_dib);
                }
            }
            else
                printf("%s\n", "Unsupported color mode.");
        }
        else // non-FIT_BITMAP images are not supported.
            printf("%s\n", "Unsupported color mode.");

        FreeImage_Unload(dib);
    }

    // call this ONLY when linking with FreeImage as a static library
#ifdef FREEIMAGE_LIB
    FreeImage_DeInitialise();
#endif // FREEIMAGE_LIB

    return 0;
}
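The bias_mode == 0 branch of ProcessFilter needs the mean and the standard deviation of all gray values. A 32-bit accumulator is too small for the sum of squares: 255 * 255 = 65025, so it can overflow after roughly 66,000 white pixels, which is far less than one scanned page. The standalone sketch below does the same computation with 64-bit accumulators; GrayStats and gray_stats are hypothetical names, and the pitch argument is assumed to be the row stride in bytes (as returned by FreeImage_GetPitch).

```cpp
#include <cmath>
#include <cstddef>
#include <cstdint>

// Mean and standard deviation of an 8-bit grayscale raster.
// GrayStats and gray_stats are hypothetical names used for illustration.
struct GrayStats
{
    double mean;
    double std_dev;
};

GrayStats gray_stats(const std::uint8_t* bits, unsigned width,
                     unsigned height, unsigned pitch)
{
    std::uint64_t sum = 0;      // sum of gray values
    std::uint64_t sqr_sum = 0;  // sum of squared gray values

    for (unsigned y = 0; y < height; ++y) {
        // pitch is the row stride in bytes and may exceed width (padding)
        const std::uint8_t* line = bits + static_cast<std::size_t>(y) * pitch;
        for (unsigned x = 0; x < width; ++x) {
            sum += line[x];
            sqr_sum += static_cast<std::uint64_t>(line[x]) * line[x];
        }
    }

    const double n = static_cast<double>(width) * height;
    if (n == 0.0) return {0.0, 0.0};

    const double mean = sum / n;
    double variance = sqr_sum / n - mean * mean;
    if (variance < 0.0) variance = 0.0;  // guard against rounding error

    return {mean, std::sqrt(variance)};
}
```

The bias offset then follows the formula from the listing: offset = (int)(std_dev - 40).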
I did not dig into how this algorithm works; for now I see no point (it is rather bulky, and its parts are unlikely to turn up in other algorithms).
The description says that it dynamically analyzes a small neighborhood of the current pixel and picks the binarization threshold for that pixel accordingly.
This behavior is exactly what the term "adaptive" algorithm means.
At first glance this binarization algorithm struck me as quite interesting and promising. First, it is very fast. Second, by varying the last three parameters in the range from 0 to 100 you can tune the degree and quality of binarization very finely. This reminded me a lot of the binarization controls in BookRestorer v4.1.
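One way to explore this tuning range is to run the program several times with different values. The sketch below only builds the command lines for such a sweep over bias_factor, using the argument order from the usage line and the default x_lookahead = 8, y_lookahead = 1, bias_mode = 0. The function name sweep_commands is hypothetical and is not part of the package.

```cpp
#include <string>
#include <vector>

// Builds one white_rohrer_thres command line per bias_factor value, with
// f_factor and g_factor held fixed. sweep_commands is a hypothetical helper.
std::vector<std::string> sweep_commands(const std::string& input_file,
                                        const std::vector<int>& bias_factors,
                                        int f_factor, int g_factor)
{
    std::vector<std::string> cmds;
    for (int bias : bias_factors) {
        if (bias < 0 || bias > 100) continue;  // stay within the 0..100 range
        cmds.push_back("white_rohrer_thres " + input_file + " 8 1 0 " +
                       std::to_string(bias) + " " +
                       std::to_string(f_factor) + " " +
                       std::to_string(g_factor));
    }
    return cmds;
}
```

Since the program always writes its result to filtered.tif, each run overwrites the previous one; the output has to be renamed between runs to compare the results.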
XITE: http://www.ifi.uio.no/forskning/grupper/dsb/Software/Xite/