Binarization for Document Image Based on Multi-scale Conditional Random Fields*

doi:10.11925/infotech.1003-3513.2009.04.15

New Technology of Library and Information Service

2009, Vol. 25, Issue (4): 79-81 https://doi.org/10.11925/infotech.1003-3513.2009.04.15

Section: Application Practice


Liu Kun, Lv Xueqiang, Wang Tao, Shi Shuicai

(Chinese Information Processing Research Center, Beijing Information Science & Technology University, Beijing 100101, China)
(Beijing TRS Information Technology Co., Ltd., Beijing 100101, China)


Abstract

This paper proposes an image binarization algorithm based on multi-scale conditional random fields (mCRF). The algorithm treats binarization as a labeling process: an mCRF model labels every pixel in the image, which yields a binarization of the whole image. Because mCRF is a discriminative model, it can accommodate arbitrary non-independent features and thus make full use of the information in the image itself. Experimental results show that the algorithm performs considerably better than commonly used thresholding methods.

Keywords: Document image, Binarization, Multi-scale conditional random fields (mCRF), Feature function
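For context, the thresholding baseline that the abstract compares against can be sketched as follows. The abstract does not say which thresholding method was used; Otsu's global threshold is assumed here purely for illustration, and the function names `otsu_threshold` and `binarize` are hypothetical. This is not the mCRF algorithm itself, only a minimal example of producing one binary label per pixel.

```python
import numpy as np


def otsu_threshold(gray):
    """Return Otsu's global threshold for a uint8 grayscale image.

    Assumption: Otsu's method stands in for the unspecified
    "common threshold method" mentioned in the abstract.
    """
    hist = np.bincount(gray.ravel(), minlength=256).astype(np.float64)
    prob = hist / hist.sum()
    levels = np.arange(256)

    omega = np.cumsum(prob)           # probability of the class "<= t"
    mu = np.cumsum(prob * levels)     # cumulative mean up to level t
    mu_total = mu[-1]

    # Between-class variance for every candidate threshold t.
    denom = omega * (1.0 - omega)
    with np.errstate(divide="ignore", invalid="ignore"):
        sigma_b = (mu_total * omega - mu) ** 2 / denom
    sigma_b[denom == 0] = 0.0         # thresholds that leave one class empty

    return int(np.argmax(sigma_b))


def binarize(gray):
    """Label each pixel: 0 for dark (text), 1 for bright (background)."""
    t = otsu_threshold(gray)
    return (gray > t).astype(np.uint8)


if __name__ == "__main__":
    # Synthetic page: bright background with a dark rectangular "stroke".
    page = np.full((8, 8), 220, dtype=np.uint8)
    page[2:6, 3:5] = 30
    print(binarize(page))
```

A baseline of this kind decides each label from the pixel's gray level alone, whereas the mCRF model described in the abstract labels pixels using features that need not be independent of one another.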

Received: 2008-11-21 Published: 2009-04-25

CLC number: TP 391

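Funding:

*This work is one of the research results of the 863 Program key project "Research on Key Technologies of Cross-Media Search and Development of Service Products" (No. 2006AA010105), the Beijing municipal universities talent development program project "Innovation Team: Intelligent Search Engine and Text Mining" (No. PXM2007_014224_044677), and the National Natural Science Foundation of China project "Research on Automatic Subject Indexing Based on Semantic Analysis and Statistics" (No. 60872133).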

Corresponding author: Liu Kun E-mail: liukun2007@yahoo.com.cn

About the authors: Liu Kun, Lv Xueqiang, Wang Tao, Shi Shuicai

Cite this article:

Liu Kun, Lv Xueqiang, Wang Tao, Shi Shuicai. Binarization for Document Image Based on Multi-scale Conditional Random Fields[J]. New Technology of Library and Information Service, 2009, 25(4): 79-81.

Link to this article:

https://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/10.11925/infotech.1003-3513.2009.04.15 or https://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/Y2009/V25/I4/79

