Study and Implementation of Automatic Mapping Rules Between Knowledge Organization Systems: The Case of the Dewey Decimal Classification and the Chinese Library Classification
Qu Jianfeng, Li Fang, Zhang Yihua, Li Bao
Shanghai Jiaotong University Library, Shanghai 200240, China
Abstract

This paper applies mathematical statistics and corpus linguistics to generate mapping rules between the DDC (Dewey Decimal Classification) and the CLC (Chinese Library Classification). A test system is then built, and a DDC-CLC mapping table is produced with it. The mapping table is checked against bibliographic records that carry both DDC and CLC data, so that the mapping rules and tables can be improved continuously.

Keywords: Dewey Decimal Classification; Chinese Library Classification; Automatic mapping; Knowledge organization systems; Mapping rules
1 Introduction

2 Research on Mapping Rules Based on Data Statistics

2.1 Mapping Rules Based on Data Statistics

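$$P(X = x) = p^{x}(1-p)^{1-x}, \quad x = 0, 1$$

$$L(X_1, X_2, \ldots, X_n; p) = \prod_{i=1}^{n} p^{x_i}(1-p)^{1-x_i} = p^{\sum_{i=1}^{n} x_i}(1-p)^{n-\sum_{i=1}^{n} x_i}$$

$$\ln[L(X_1, X_2, \ldots, X_n; p)] = \ln p \sum_{i=1}^{n} x_i + \left(n - \sum_{i=1}^{n} x_i\right)\ln(1-p)$$

$$\frac{d \ln L}{dp} = \frac{\sum_{i=1}^{n} x_i}{p} - \frac{n - \sum_{i=1}^{n} x_i}{1-p} = 0$$

$$\hat{p} = \frac{1}{n}\sum_{i=1}^{n} x_i$$

$$P(X = x_i) \ge 80\%$$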
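The estimate and threshold above can be illustrated with a minimal sketch. It assumes that each bibliographic record supplies one (DDC, CLC) pair and that, for a fixed DDC class, X = 1 when a record carries a given CLC class; the maximum-likelihood estimate of p is then simply the share of records with that CLC class. The function name `build_mapping` and the sample class numbers are hypothetical and are not taken from the test system described in this paper.

```python
from collections import Counter, defaultdict


def build_mapping(records, threshold=0.8):
    """Keep a DDC -> CLC mapping when the estimated probability reaches the threshold.

    records: iterable of (ddc, clc) pairs, one per bibliographic record (assumed format).
    """
    counts = defaultdict(Counter)
    for ddc, clc in records:
        counts[ddc][clc] += 1

    mapping = {}
    for ddc, clc_counts in counts.items():
        n = sum(clc_counts.values())            # records carrying this DDC class
        clc, k = clc_counts.most_common(1)[0]   # most frequent CLC class and its count
        p_hat = k / n                           # MLE of the Bernoulli parameter
        if p_hat >= threshold:
            mapping[ddc] = (clc, p_hat)
    return mapping


# Illustrative sample only: 9 of 10 records for DDC 004 carry CLC TP3,
# while DDC 330 is split 3 to 2 and therefore stays below the 80% threshold.
records = (
    [("004", "TP3")] * 9 + [("004", "TN91")]
    + [("330", "F0")] * 3 + [("330", "F2")] * 2
)
mapping = build_mapping(records)
print(mapping)
```

Classes that fall below the threshold, such as 330 in this sample, are left unmapped rather than assigned to the most frequent candidate.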

2.2 Extension of the Mapping Table

3 Application of the DDC-CLC Mapping Rule Test System

Figure 1 Application framework of the DDC-CLC mapping rule test system

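(1) Presentation layer: the user enters the test parameters and submits the corresponding data to the DDC-CLC mapping rule test system. The system computes the DDC-CLC mapping table from the statistical model, and the user can extract some of the intermediate result sets produced while the mapping table was computed.

(2) Application layer: the DDC-CLC mapping rule test system applies the statistical mapping rule model to compute the DDC-CLC mapping table and provides an interface for changing the parameters the user needs, so that the mapping rules can be refined continuously.

(3) Data storage layer: sample data of all kinds is collected, including data from identified sources and data gathered from cloud resources on the web, and is stored in a defined format. When the DDC-CLC mapping table is generated, some classes will have no mapping; manual classification must then be added to keep the mapping table complete.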
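The completion step described for the data storage layer can be sketched as follows. This is only an illustration under stated assumptions: the statistical result and the manual classifications are both modeled as plain dictionaries keyed by DDC class, and the function name `complete_mapping_table`, the source labels, and the sample class numbers are hypothetical.

```python
def complete_mapping_table(ddc_classes, statistical, manual):
    """Fill gaps in the statistical mapping with manually assigned classes.

    Returns the merged table, with the source of each entry recorded,
    and the list of DDC classes that still have no mapping.
    """
    table = {}
    unmapped = []
    for ddc in ddc_classes:
        if ddc in statistical:
            table[ddc] = (statistical[ddc], "statistical")
        elif ddc in manual:
            table[ddc] = (manual[ddc], "manual")
        else:
            unmapped.append(ddc)  # still needs manual classification
    return table, unmapped


# Illustrative sample only.
ddc_classes = ["004", "330", "780"]
statistical = {"004": "TP3"}
manual = {"330": "F0"}
table, unmapped = complete_mapping_table(ddc_classes, statistical, manual)
print(table)
print(unmapped)
```

Recording the source of each entry keeps manually supplied mappings distinguishable from statistically derived ones when the rules are recomputed later.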

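4 Implementation Plan
4.1 Workflow of the DDC-CLC Mapping Rule Test System

Figure 2 DDC-CLC mapping rule algorithm test

4.2 Implementation of the DDC-CLC Mapping Rule Algorithm Test System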

(1) Summary statistics module

(2) Data processing module

Figure 3 One-to-many DDC-CLC correspondence data

(3) Broader-class merging module

(4) Result display

5 Conclusion

Author e-mail: jfqu@lib.sjtu.edu.cn