New Technology of Library and Information Service  2014, Vol. 30 Issue (10): 25-32    DOI: 10.11925/infotech.1003-3513.2014.10.05
Study on Text Visualization of Clustering Result for Domain Knowledge Base —— Take Knowledge Base of Chinese Cuisine Culture as the Object
Xu Xin, Hong Yunjia
Department of Information Science, Business School, East China Normal University, Shanghai 200241, China
Abstract  

[Objective] This study visualizes text clustering results in a domain knowledge base to give users intuitive navigation. [Methods] Building on the automatic multi-level organization of texts by clustering, visual navigation of the knowledge base is implemented in three steps: topic discovery, dimensionality reduction, and visual display. [Results] A topic extraction algorithm named TF-ICF is proposed, and the knowledge base is displayed with an optimized treemap and a scatter plot, helping users get an overview of the knowledge base, find the topics they need, and understand the relations between texts. [Limitations] The visual display still depends partly on manual intervention, and the interactivity of the visualization needs further optimization. [Conclusions] The visualization method was applied successfully to a domain knowledge base and helps further improve the user experience.
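
The abstract names TF-ICF but does not define it. As a rough illustration only, the sketch below assumes TF-ICF follows the familiar TF-IDF pattern with clusters in place of documents: a term's relative frequency inside a cluster, multiplied by the logarithm of the number of clusters divided by the number of clusters containing the term. This formula, the function names `tf_icf` and `top_terms`, and the toy data are assumptions for illustration, not the authors' published algorithm.

```python
import math
from collections import Counter


def tf_icf(clusters):
    """Score terms per cluster with an assumed TF-ICF weighting.

    clusters: dict mapping a cluster label to a list of tokenized
    documents (each document is a list of terms).
    Returns: dict mapping label -> {term: score}.
    """
    n_clusters = len(clusters)

    # Term counts aggregated over all documents in each cluster.
    term_counts = {
        label: Counter(term for doc in docs for term in doc)
        for label, docs in clusters.items()
    }

    # Cluster frequency: in how many clusters does each term occur?
    cluster_freq = Counter()
    for counts in term_counts.values():
        cluster_freq.update(counts.keys())

    scores = {}
    for label, counts in term_counts.items():
        total = sum(counts.values())
        scores[label] = {
            # Assumed weighting: relative frequency times log(N / cf).
            term: (count / total) * math.log(n_clusters / cluster_freq[term])
            for term, count in counts.items()
        }
    return scores


def top_terms(scores, label, k=3):
    """Return the k highest-scoring terms of a cluster (ties broken alphabetically)."""
    ranked = sorted(scores[label].items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:k]]


if __name__ == "__main__":
    # Hypothetical toy clusters, not data from the paper.
    toy = {
        "tea": [["tea", "green", "water"], ["tea", "brew", "water"]],
        "noodles": [["noodle", "wheat", "water"], ["noodle", "soup"]],
    }
    result = tf_icf(toy)
    for name in toy:
        print(name, top_terms(result, name, k=2))
```

Under this assumed weighting, a term that appears in every cluster (such as "water" in the toy data) scores zero, so the top-ranked terms are those that distinguish one cluster from the others, which is the property a topic label for a treemap cell would need.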

Key words: Text visualization; Text clustering; Domain knowledge base; Chinese cuisine culture
Received: 07 May 2014      Published: 28 November 2014
CLC Number: G250.7

Cite this article:

Xu Xin, Hong Yunjia. Study on Text Visualization of Clustering Result for Domain Knowledge Base —— Take Knowledge Base of Chinese Cuisine Culture as the Object. New Technology of Library and Information Service, 2014, 30(10): 25-32.

URL:

http://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/10.11925/infotech.1003-3513.2014.10.05     OR     http://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/Y2014/V30/I10/25

