Data Analysis and Knowledge Discovery

The Comparative Analysis of Major Domestic and Foreign Ontology Library

Bai Rujiang, Yu Xiaofan, Wang Xiaoyue

New Technology of Library and Information Service. 2011, 27(1): 3-13. https://doi.org/10.11925/infotech.1003-3513.2011.01.02

The paper introduces the major general-purpose Ontology libraries at home and abroad (WordNet, DBpedia, Cyc and HowNet) and the successful domain-specific Ontology libraries (Biomedical Ontology and Enterprise Ontology). It then compares and analyzes them in five respects: description language, storage mode, query language, platform construction and application, in order to support research on Ontology libraries and their applications.

Review on the Methods and Tools for Ontology Integration

Yu Xiaofan, Wang Xiaoyue, Bai Rujiang

New Technology of Library and Information Service. 2011, 27(1): 14-21. https://doi.org/10.11925/infotech.1003-3513.2011.01.03

Ontology integration is a process that eliminates Ontology heterogeneity so as to achieve the highest level of semantic communication and semantic integration, and ultimately knowledge reuse and interoperability. The paper reviews the four main methods and the five main tools for Ontology integration and gives a comparative analysis.

Study on the Mapping Mechanism Between WordNet and SUMO Ontology

Wang Xiaoyue, Hu Zewen, Bai Rujiang

New Technology of Library and Information Service. 2011, 27(1): 22-30. https://doi.org/10.11925/infotech.1003-3513.2011.01.04

To resolve the contradiction between the generality of Ontology concepts and the specificity of natural language words, this paper takes the WordNet thesaurus and the SUMO Ontology as research objects. It briefly introduces them, analyzes in detail the motivations for mapping between them, proposes a mapping model among natural language words, WordNet synsets and SUMO Ontology concepts, and analyzes in depth the mapping instances, effects and applications between WordNet synsets and SUMO Ontology concepts. The authors hope to make better use of the mapping relations between WordNet and SUMO to resolve this contradiction and to give Ontology wider application in intelligent retrieval, semantic classification, data mining and related areas.

Study on Text Classification Model Based on SUMO and WordNet Ontology Integration

Hu Zewen, Wang Xiaoyue, Bai Rujiang

New Technology of Library and Information Service. 2011, 27(1): 31-38. https://doi.org/10.11925/infotech.1003-3513.2011.01.05

To address the problems of traditional text classification methods and current semantic classification methods, a new text classification model based on SUMO and WordNet Ontology integration is proposed. The model uses the mapping relations between WordNet synsets and SUMO Ontology concepts to map terms in the document-word vector space to the corresponding Ontology concepts, forming a document-concept vector space in which texts are classified automatically. The experimental results show that the proposed method can greatly reduce the dimensionality of the vector space and improve text classification performance.
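
The term-to-concept folding described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `TERM_TO_CONCEPT` table, the concept names and the decision to drop unmapped terms are all assumptions made for illustration, standing in for the real WordNet-synset-to-SUMO mapping.

```python
from collections import Counter

# Hypothetical term-to-concept table (assumption). A real table would be
# derived from the WordNet-synset-to-SUMO mapping, not written by hand.
TERM_TO_CONCEPT = {
    "car": "Vehicle",
    "truck": "Vehicle",
    "bus": "Vehicle",
    "dog": "Animal",
    "cat": "Animal",
}


def to_concept_vector(term_counts, term_to_concept):
    """Fold a bag of words into a bag of concepts.

    Terms without a mapping are dropped (an assumption of this sketch).
    """
    concept_counts = Counter()
    for term, count in term_counts.items():
        concept = term_to_concept.get(term)
        if concept is not None:
            concept_counts[concept] += count
    return concept_counts


doc_terms = Counter(["car", "truck", "car", "dog", "bus", "cat", "road"])
doc_concepts = to_concept_vector(doc_terms, TERM_TO_CONCEPT)
```

Here six distinct terms collapse into two concept dimensions, which is the kind of dimensionality reduction the abstract refers to; any classifier could then be trained on the concept vectors.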

Research Review of Related Articles Retrieval

Wang Junhui, Hu Tiejun, Li Danya

New Technology of Library and Information Service. 2011, 27(1): 39-45. https://doi.org/10.11925/infotech.1003-3513.2011.01.06

This paper classifies related-article retrieval from the perspective of bibliometrics, analyzes the key technologies involved in its implementation, and focuses on the text similarity computation algorithms, the main course of research and recent progress in the PubMed and CBM systems. After outlining the evaluation methods and indicators, the paper analyzes the effectiveness of related-article retrieval from both positive and negative aspects. Finally, it discusses the development direction of related-article retrieval.

Model of Non-user: A Methodology for Information System Performance

Ku Liping

New Technology of Library and Information Service. 2011, 27(1): 46-51. https://doi.org/10.11925/infotech.1003-3513.2011.01.07

Designing system functions to accommodate non-user behavior can increase system utilization. This article systematically introduces the relevant studies and non-user types, then proposes the non-user theory and discusses using scenario analysis, personas and living laboratories to capture non-user behavior. It can serve as a guideline for enhancing and improving digital library service systems.

Query Expansion of Pseudo Relevance Feedback Based on Feature Terms Extraction and Correlation Fusion

Feng Ping, Huang Mingxuan

New Technology of Library and Information Service. 2011, 27(1): 52-56. https://doi.org/10.11925/infotech.1003-3513.2011.01.08

To address the term mismatch problem in existing information retrieval systems, a novel pseudo relevance feedback query expansion algorithm is proposed based on feature term extraction and correlation fusion, together with a new method for computing the weights of expansion terms. The algorithm extracts feature terms related to the original query from the top n ranked local documents retrieved, and then selects the final expansion terms according to the frequency with which each feature term appears in the local documents and the correlation between each feature term and the entire original query. The experimental results show that the method is effective and can improve information retrieval performance.
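
The general shape of pseudo relevance feedback query expansion can be sketched as follows. The abstract does not give the paper's weighting formula, so the fusion rule used here (term frequency multiplied by the share of top documents in which the term co-occurs with a query term) is an assumption chosen only to show how frequency and query correlation might be combined; `expand_query` and its parameters are hypothetical names.

```python
from collections import Counter


def expand_query(query_terms, ranked_docs, n=2, k=2):
    """Append k expansion terms drawn from the top-n documents.

    ranked_docs is a list of token lists, best-ranked first.
    """
    top_docs = ranked_docs[:n]
    if not top_docs:
        return list(query_terms)
    query = set(query_terms)
    freq = Counter()     # occurrences of each candidate term in the top-n docs
    cooccur = Counter()  # top-n docs in which the term appears with a query term
    for doc in top_docs:
        tokens = set(doc)
        for term in doc:
            if term not in query:
                freq[term] += 1
        if tokens & query:
            for term in tokens - query:
                cooccur[term] += 1
    # Assumed fusion rule, not the paper's formula.
    scores = {t: freq[t] * cooccur[t] / len(top_docs) for t in freq}
    ranked = sorted(scores, key=lambda t: (-scores[t], t))
    return list(query_terms) + ranked[:k]


docs = [
    ["ontology", "mapping", "wordnet", "mapping"],
    ["ontology", "wordnet", "synset"],
    ["football", "score"],
]
expanded = expand_query(["ontology"], docs, n=2, k=2)
```

With these toy documents, "wordnet" scores highest because it is both frequent and present in every top document alongside the query term, followed by "mapping".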

Study on Web Retrieval Query Fusion Based on Relevance Feedback

Jing Jing, Hong Ying, Jiang Yuanyuan, Gao Xiaofeng

New Technology of Library and Information Service. 2011, 27(1): 57-62. https://doi.org/10.11925/infotech.1003-3513.2011.01.09

This paper introduces the combination of query fusion and relevance feedback methods. By analyzing previous TopN document selection strategies, it puts forward a query fusion algorithm that uses a correlation coefficient to select a variable number of TopN documents for query expansion, called the variable TopN feedback-based query fusion algorithm. Fixed and variable TopN query fusion experiments are analyzed separately, and the test results show that the variable TopN feedback method improves retrieval performance to some extent.

Hot Research Topics Detection Based on SOM

Lu Wei, Peng Yu, Chen Wu

New Technology of Library and Information Service. 2011, 27(1): 63-68. https://doi.org/10.11925/infotech.1003-3513.2011.01.10

To detect hot topics in a research field, the paper proposes a method that combines co-word analysis with SOM (self-organizing map) clustering. Using the co-occurrence of high-frequency keywords in the literature as input data and the SOM Toolbox for SOM clustering, a collection of hot research topics is obtained. Finally, a case study on traditional medicine is conducted, and the experimental results show that the method is efficient at detecting hot research topics.

Research on Content Characteristics About Complex Network of Text

Liu Honghong, An Haizhong, Gao Xiangyun

New Technology of Library and Information Service. 2011, 27(1): 69-73. https://doi.org/10.11925/infotech.1003-3513.2011.01.11

To address the irregular structure of some texts, this paper presents a method based on complex network theory for evaluating text structure. The method represents each sentence as a node and connects two nodes with an edge when the two sentences share a common word, thereby constructing the complex network of a text. The authors then analyze the characteristics of the text structure through the topological characteristics of this network. For a text complex network built from a selected article, the degree, the degree of intensity, the shortest paths and the weighted clustering coefficients are calculated. The results show that the structure of the text content can be effectively evaluated by the proposed method. The results also provide useful references for understanding the main ideas of a given text, generating summaries and filtering text retrieval results.
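
The network construction in this abstract (sentences as nodes, an edge wherever two sentences share a word) can be sketched in a few lines. The sentence splitting, the lowercase tokenization, the lack of stop-word filtering and the use of the shared-word count as edge weight are assumptions of this sketch; the abstract does not specify them.

```python
import re
from itertools import combinations


def build_sentence_network(text):
    """Return (sentences, edges); edges maps a node pair (i, j) to its shared-word count."""
    sentences = [s.strip() for s in re.split(r"[.!?]+", text) if s.strip()]
    words = [set(re.findall(r"[a-z]+", s.lower())) for s in sentences]
    edges = {}
    for i, j in combinations(range(len(sentences)), 2):
        shared = words[i] & words[j]
        if shared:
            edges[(i, j)] = len(shared)  # assumed weight: number of shared words
    return sentences, edges


def degrees(node_count, edges):
    """Count the edges incident to each node."""
    deg = [0] * node_count
    for i, j in edges:
        deg[i] += 1
        deg[j] += 1
    return deg


text = "Ontology maps concepts. Concepts link words. Networks have nodes."
sentences, edges = build_sentence_network(text)
deg = degrees(len(sentences), edges)
```

In this toy text only the first two sentences share a word ("concepts"), so they are connected and the third sentence is an isolated node. Shortest paths and clustering coefficients would be computed on the same edge structure.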

A Comparison and Evaluation Experiment on Chinese and English Online Question Answering Communities

Wu Dan, Liu Yuan, Wang Shaocheng

New Technology of Library and Information Service. 2011, 27(1): 74-82. https://doi.org/10.11925/infotech.1003-3513.2011.01.12

This paper compares twelve Chinese and English Q&A communities in terms of basic information, interaction and personalized service. A Q&A experiment on four types of questions in three fields is also conducted to evaluate the communities on criteria such as the quality and efficiency of their answers. The results offer some advice on development strategies for Q&A communities.

Evolution of Topics About Medical Informatics by Improved Co-word Cluster Analysis

Yang Ying, Cui Lei

New Technology of Library and Information Service. 2011, 27(1): 83-87. https://doi.org/10.11925/infotech.1003-3513.2011.01.13

The co-word cluster method is improved in the following ways: high-frequency words are selected according to a formula derived from Zipf’s law; adhesive force is used to identify the core major MeSH words for tagging the content of each cluster; and contrastive analysis of two periods helps to find changes in topics. Bibliographic data on medical informatics are collected from PubMed for two periods (1999-2003 and 2004-2008). Major MeSH words are extracted from the articles of each period separately to build co-word clusters, so as to explore the evolution of the subject structure by comparing the two periods.

CWM-based ETL Metadata System Model Design

Zhou Jing, Zhao Ying, Yang Xin

New Technology of Library and Information Service. 2011, 27(1): 88-93. https://doi.org/10.11925/infotech.1003-3513.2011.01.14

To prevent and control existing mismanagement problems in ETL and to ensure the efficient implementation of the data warehouse, the paper designs a CWM-based ETL metadata system model. The model can describe the specific steps of data transformation, and the specific modules of the system are designed according to it, so that the ETL management process can be carried out effectively.

Application of the Fuzzy Rule Algorithm in the Classification of Educational Information

Liang Wenchao, Xu Chaojun, Shen Shusheng

New Technology of Library and Information Service. 2011, 27(1): 94-98. https://doi.org/10.11925/infotech.1003-3513.2011.01.15

Because introductions of primary and secondary schools have few feature items with unequal weights, the authors use denoising and fuzzy-set-based processing of synonym features to build category vocabularies, and then classify short texts with a classification model based on category vocabularies and fuzzy rules. The results show that fuzzy rules classify short texts with few feature items and unevenly distributed weights better than VSM, Rocchio and other classification algorithms.

25 January 2011, Volume 27 Issue 1
