Current Issue, Volume 29 Issue 9
The Interoperability Needs and Standards Framework for Institutional Repositories
Liang Na, Zhang Xiaolin
2013, 29(9): 1-7. DOI: 10.11925/infotech.1003-3513.2013.09.01
Abstract
This paper describes three use scenarios for Institutional Repositories (IR): knowledge management, knowledge services, and e-Research & e-Learning. It emphasizes the need to consider technical, semantic, and management interoperability from multiple stakeholders' viewpoints, constructs a needs framework for interoperability, and systematically introduces the basic, extended, and management standards already in place or in development.
Knowledge Organization Tool Catering to Service: Today and Future
Xie Jing, Qian Aibing, Han Pu, Su Xinning
2013, 29(9): 8-14. DOI: 10.11925/infotech.1003-3513.2013.09.02
Abstract
From the perspective of knowledge services, this paper divides knowledge organization tools into three groups: tools for basic knowledge acquisition and systematization, tools for establishing knowledge relationships, and tools for knowledge processing and visualization. Tools in the first group provide push services for knowledge elements. Tools in the second group mainly identify knowledge relationships and, together with the first group, support inference services. Tools in the third group are used for knowledge extraction, identification, and visualization, after which they provide user-oriented services through knowledge reorganization. Finally, the paper discusses future trends in knowledge organization tools and points out the characteristics of future tools.
Linking and Mapping of Library Catalogue Data Based on MapReduce
Yu Wei, Chen Junpeng
2013, 29(9): 15-22. DOI: 10.11925/infotech.1003-3513.2013.09.03
Abstract
In this paper, MARC data is transformed into linked data using the MapReduce model and the MODS Ontology. Through mappings among different linked open data sets, library catalogue data can become part of the linked open data community and provide efficient semantic data for knowledge discovery and semantic services.
Decoding Optimization in Tree Transducer based Translation Model
Shi Chongde, Qiao Xiaodong, Wang Huilin
2013, 29(9): 23-29. DOI: 10.11925/infotech.1003-3513.2013.09.04
Abstract
This paper proposes two methods to improve the efficiency of rule binarization and decoding in the tree-transducer-based translation model. The authors convert synchronous transducer rules into four kinds of binary rules to reduce the number of temporary items, and propose the RR-CKY decoding algorithm, which avoids some of the redundant items during decoding. The experiments show that these two methods reduce the number of temporary items and make decoding faster. They also improve the quality of machine translation.
Study on Keyword Extraction Using Word Position Weighted TextRank
Xia Tian
2013, 29(9): 30-34. DOI: 10.11925/infotech.1003-3513.2013.09.05
Abstract
Keyword extraction is treated as a word importance ranking problem. In this paper, a candidate keyword graph is constructed based on TextRank, and the influences of word coverage, location, and frequency are used to calculate the probability transition matrix. The word scores are then calculated iteratively, and the top N candidate keywords are picked as the final results. Experimental results show that the proposed word position weighted TextRank method outperforms the traditional TextRank method and the LDA topic model method.
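As a rough illustration of the idea in the abstract above, the following Python sketch runs a TextRank-style iteration in which the probability of moving toward a word is scaled by a position weight. The function name, the `position_weight` dictionary, the co-occurrence window, and the weighting scheme are assumptions made for illustration; the paper's own transition matrix also accounts for word coverage and frequency, and its exact formulas are not given here.

```python
from collections import defaultdict


def position_weighted_textrank(words, position_weight, window=2,
                               d=0.85, iters=50, top_n=3):
    """Rank words with a TextRank-style iteration biased by position weights.

    words: list of candidate words in document order.
    position_weight: hypothetical dict word -> weight (default 1.0),
        e.g. larger for words that appear in the title.
    """
    # Build an undirected co-occurrence graph over a sliding window.
    neighbors = defaultdict(set)
    for i, w in enumerate(words):
        for j in range(i + 1, min(i + window + 1, len(words))):
            if words[j] != w:
                neighbors[w].add(words[j])
                neighbors[words[j]].add(w)

    vocab = sorted(neighbors)
    score = {w: 1.0 / len(vocab) for w in vocab}

    for _ in range(iters):
        new_score = {}
        for w in vocab:
            rank = 0.0
            for u in neighbors[w]:
                # Probability of moving from u to w is proportional
                # to the position weight of w among u's neighbors.
                total = sum(position_weight.get(v, 1.0) for v in neighbors[u])
                rank += score[u] * position_weight.get(w, 1.0) / total
            new_score[w] = (1 - d) / len(vocab) + d * rank
        score = new_score

    return sorted(vocab, key=lambda w: -score[w])[:top_n]
```

With an empty `position_weight` dictionary every weight defaults to 1.0 and the function reduces to unweighted TextRank on the co-occurrence graph.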
Identifying Synonyms Based on Sentence Structure Analysis
Yu Juan, Yin Jidong, Fei Shu
2013, 29(9): 35-40. DOI: 10.11925/infotech.1003-3513.2013.09.06
Abstract
A new method of identifying synonyms is proposed to reduce the deviation in calculating the semantic similarity between two different terms or phrases. The method first analyzes the sentence structures of the terms (or phrases) concerned, and then calculates the semantic similarity between them based on Tongyici Cilin (a Chinese thesaurus). It weights each word in the terms (or phrases) equally to reduce the identification errors made by gravity-centre-backward methods. Experiments show that the proposed method is accurate and has good potential for text mining and semantic retrieval applications.
Fast Duplicate Detection for Chinese Texts Based on Semantic Fingerprint
Li Gang, Mao Jin, Chen Jinghao
2013, 29(9): 41-47. DOI: 10.11925/infotech.1003-3513.2013.09.07
Abstract
For Chinese texts, text features are first extracted and turned into semantic fingerprints using the Simhash algorithm. The Hamming distances between semantic fingerprints are used to determine the similarity between texts. As the last step of duplicate detection, the Single-Pass clustering algorithm is applied to cluster the generated fingerprints, and the resulting clusters are the final results. A comparison with the Shingle algorithm shows that the Simhash approach is superior in both precision and robustness, and that its speed makes it capable of processing large amounts of text.
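The pipeline described in this abstract (fingerprinting, Hamming distance, Single-Pass clustering) can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the use of MD5 as the per-feature hash, the 64-bit fingerprint width, the distance threshold of 3, and the choice of the first member as each cluster's representative are all assumptions, and the feature extraction step for Chinese text is left out.

```python
import hashlib


def simhash(weighted_features, bits=64):
    """Build a Simhash fingerprint from a dict of feature -> weight."""
    counts = [0.0] * bits
    for feature, weight in weighted_features.items():
        digest = hashlib.md5(feature.encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")  # assumed 64-bit feature hash
        for i in range(bits):
            # Add the weight where the hash bit is 1, subtract it otherwise.
            counts[i] += weight if (h >> i) & 1 else -weight
    fingerprint = 0
    for i in range(bits):
        if counts[i] > 0:
            fingerprint |= 1 << i
    return fingerprint


def hamming_distance(a, b):
    """Number of bit positions in which two fingerprints differ."""
    return bin(a ^ b).count("1")


def single_pass_cluster(fingerprints, threshold=3):
    """Assign each fingerprint to the first cluster whose representative
    lies within `threshold` bits, or open a new cluster otherwise."""
    clusters = []  # list of (representative fingerprint, member indices)
    for idx, fp in enumerate(fingerprints):
        for representative, members in clusters:
            if hamming_distance(representative, fp) <= threshold:
                members.append(idx)
                break
        else:
            clusters.append((fp, [idx]))
    return [members for _, members in clusters]
```

Each returned cluster is a list of indices into the input, so texts whose fingerprints fall in the same cluster are treated as near-duplicates.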
Authorship Identification of Chinese UGC Based on Stylistics
Lv Yingjie, Fan Jing, Liu Jingfang
2013, 29(9): 48-53. DOI: 10.11925/infotech.1003-3513.2013.09.08
Abstract
Characteristics of information networks such as openness and virtuality make authorship identification difficult. This paper therefore proposes an approach to authorship identification of Chinese UGC based on stylistics. The authors integrate four types of features (lexical, syntactic, structural, and content-specific) into writing-style features, and then use text classification techniques for authorship identification. The experimental results demonstrate that the proposed approach can identify the authorship of Chinese UGC efficiently.
An Automatic Term Extraction System of Improved C-value Based on Effective Word Frequency
Xiong Liyan, Tan Long, Zhong Maosheng
2013, 29(9): 54-59. DOI: 10.11925/infotech.1003-3513.2013.09.09
Abstract
Existing methods for automatic Chinese term extraction focus on the high-frequency characteristics and unithood indicators of terms, while low-frequency terms and termhood indicators lack effective treatment. In response to these problems, this paper introduces a background corpus into the C-value method and proposes the concepts of word field distribution degree and effective word frequency. The paper then extracts terms automatically by calculating the EC-value (Effective C-value) of candidate terms, and improves the extraction of low-frequency terms by combining it with term cluster recognition and mining. A term extraction experiment in the computer field shows that the proposed EC-value method measures the termhood of terms more effectively and improves the extraction performance for low-frequency terms.
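For orientation, the baseline C-value measure that this abstract builds on can be sketched in Python as below. This shows only the standard C-value, which discounts a candidate's frequency by the average frequency of the longer candidates that contain it; the paper's EC-value, with its background corpus and effective word frequency, is not reproduced here. The function name and the representation of candidates as word tuples are assumptions.

```python
import math
from collections import defaultdict


def c_values(freq):
    """Baseline C-value for candidate terms.

    freq: dict mapping a candidate term (tuple of words) to its frequency.
    Returns a dict mapping each candidate to its C-value.
    """
    # For each candidate, collect the longer candidates that contain it.
    containers = defaultdict(list)
    for a in freq:
        for b in freq:
            if len(b) > len(a) and any(
                b[i:i + len(a)] == a for i in range(len(b) - len(a) + 1)
            ):
                containers[a].append(b)

    result = {}
    for a, f in freq.items():
        longer = containers[a]
        if longer:
            # Nested candidate: subtract the mean frequency of its containers.
            adjusted = f - sum(freq[b] for b in longer) / len(longer)
        else:
            adjusted = f
        # log2 of the term length favors longer terms
        # (and gives single-word candidates a score of 0).
        result[a] = math.log2(len(a)) * adjusted
    return result
```

Because the score depends directly on raw frequency, low-frequency terms receive low C-values, which is the weakness the abstract says the EC-value is meant to address.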
Research on the Credibility of Online Chinese Product Reviews
Meng Meiren, Ding Shengchun
2013, 29(9): 60-66. DOI: 10.11925/infotech.1003-3513.2013.09.10
Abstract
This paper aims to filter out low-credibility online Chinese product reviews so that consumers are offered valuable reviews for their purchase decisions. Based on an in-depth analysis of the characteristics of online Chinese product reviews and on related work, the authors conduct an empirical analysis of credibility factors through questionnaires. According to the results, the authors select four features (content integrity, emotional balance, review timeliness, and clarity of the publisher's identity), use CRFs as the classification model for review credibility, and conduct feature combination experiments to find the best combination. The experiments achieve significant results, and the accuracy of the classification model is above 75% in all cases. The results can improve on the existing manual evaluation of review usefulness, offering new methods and ideas for optimized filtering of online reviews.
Study on Network Information Ecological Chain of Chinese Shopping Websites
Li Beiwei, Xu Yue, Shan Jimin, Wei Changlong, Zhang Xinqi, Fu Jinxin
2013, 29(9): 67-73. DOI: 10.11925/infotech.1003-3513.2013.09.11
Abstract
Taking the information ecological chain of Chinese online shopping websites as the research object, this paper establishes an evaluation index system. Using 20 shopping websites as examples, it characterizes the distribution and features of the information ecological chain of Chinese online shopping websites through factor analysis and cluster analysis, and classifies the 20 websites according to the similarity of their development. Finally, corresponding countermeasures and suggestions are put forward to address the problems found in their development.
Research on Microblog Ranking Strategy with the Social Relations
Tang Xiaobo, Fang Xiaoke
2013, 29(9): 74-81. DOI: 10.11925/infotech.1003-3513.2013.09.12
Abstract
The emergence of social media has changed the retrieval environment. To address the shortcomings of retrieval ranking in microblogs, this paper analyzes microblog social network relationships and proposes a microblog ranking strategy that incorporates social relations. Specifically, social strength is added to the traditional PageRank ranking algorithm, and related indicators including people popularity, information popularity, information quality, and the time factor are considered. The experimental results show that AVG has a higher accuracy and captures more social relationships than the conventional ranking algorithm.
Person Name Attribute Knowledge Mining and Its Application for Query Classification
Zhang Mei, Duan Jianyong, Xu Jichao
2013, 29(9): 82-87. DOI: 10.11925/infotech.1003-3513.2013.09.13
Abstract
Web logs contain many named entity queries, and person name queries account for more than half of them. This paper uses Web logs and Wikipedia information to construct a person name knowledge base for query recommendation. First, person name entities are mined from Web logs and combined with their attributes extracted from Wikipedia. With the help of this person name knowledge, the person names in user queries are classified by attribute patterns and statistical methods. Related attribute knowledge is then used to recommend user intents. The results show that person name knowledge can be used effectively in query classification.
Research on Short Text Clustering Algorithm for User Generated Content
Zhao Hui, Liu Huailiang
2013, 29(9): 88-92. DOI: 10.11925/infotech.1003-3513.2013.09.14
Abstract
This paper addresses two problems: short text features in user generated content describe semantics weakly, and the traditional K-means algorithm for document clustering is sensitive to the initial cluster centers. It proposes supplementing the semantic feature information of short texts through feature extension based on the concepts, link structure, and category system of Wikipedia. A weighted complex network of the short text set is then built from the semantic relations between texts, and text clustering is achieved by partitioning the nodes into communities with a K-means algorithm whose initial cluster centers are chosen according to the synthetic characteristics of the network nodes. Experimental results show that the proposed algorithm improves the effect of short text clustering.
A Research of Knowledge Sharing Community Discovery Based on Interaction History Between Peers in P2P Networks
Gao Haiyan, Dou Yongxiang, Qi Yilan
2013, 29(9): 93-98. DOI: 10.11925/infotech.1003-3513.2013.09.15
Abstract
In this paper, a P2P community discovery method based on the interaction history of knowledge sharing is proposed. First, the generation of interaction history during the knowledge sharing process is studied, the user interaction network is formed on the basis of the interaction history, and the similarities between users are calculated. Then, clustering analysis is used to discover the self-organized P2P knowledge sharing community. Finally, an experiment is designed to verify the feasibility and efficiency of the method.
Copyright © 2016 Data Analysis and Knowledge Discovery Tel/Fax:(010)82626611-6626,82624938 E-mail:jishu@mail.las.ac.cn