Data Analysis and Knowledge Discovery

Open Access, Open Knowledge, and Open Innovation Pushes for Open Knowledge Services: 3O Convergence and a New Paradigmatic Shift for Research Libraries

Zhang Xiaolin

New Technology of Library and Information Service. 2013, 29(2): 1-10. https://doi.org/10.11925/infotech.1003-3513.2013.02.01

Scientific information is rapidly becoming open access and, in turn, openly computable knowledge, while the Internet supports dynamic and robust open innovation. The convergence of open access, open computable knowledge, and open innovation gives knowledge-service institutions such as libraries a great opportunity to support user-driven knowledge service innovation. Research libraries should develop open resources, open knowledge tools, and open innovation support mechanisms to enable this.

Visualization Implementation of Relation Discovery Based on Linked Data

Hong Na, Qian Qing, Fan Wei, Fang An, Wang Junhui

New Technology of Library and Information Service. 2013, 29(2): 11-17. https://doi.org/10.11925/infotech.1003-3513.2013.02.02

This article explores the visualization of relation discovery based on linked data. The authors survey current RDF visualization tools and compare them from multiple perspectives, select several biomedical datasets to construct biomedical linked data, and apply RelFinder to implement a biomedical semantic relation discovery system. Finally, the limitations of the system and directions for future research are discussed.
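
Relation discovery between two resources in linked data can be pictured as a path search over RDF triples. The following is a minimal sketch under that assumption, not the authors' system or RelFinder's actual implementation: the triples, the `ex:` identifiers, and the `find_paths` helper are hypothetical, and edges are treated as undirected for simplicity.

```python
from collections import deque

# Toy triples (subject, predicate, object); all identifiers are hypothetical.
TRIPLES = [
    ("ex:DrugA", "ex:targets", "ex:GeneX"),
    ("ex:GeneX", "ex:associatedWith", "ex:DiseaseY"),
    ("ex:DrugB", "ex:treats", "ex:DiseaseY"),
]

def find_paths(triples, start, goal, max_hops=3):
    """Breadth-first search for simple paths of at most max_hops edges."""
    # Build an undirected adjacency list that remembers each predicate.
    neighbours = {}
    for s, p, o in triples:
        neighbours.setdefault(s, []).append((p, o))
        neighbours.setdefault(o, []).append((p, s))
    paths = []
    queue = deque([[start]])  # a path alternates node, predicate, node, ...
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            paths.append(path)
            continue
        if (len(path) - 1) // 2 >= max_hops:
            continue
        for predicate, nxt in neighbours.get(node, []):
            if nxt not in path[::2]:  # nodes sit at even positions
                queue.append(path + [predicate, nxt])
    return paths
```

Calling `find_paths(TRIPLES, "ex:DrugA", "ex:DiseaseY")` returns the single two-hop path through `ex:GeneX`; a visualization layer would then draw each returned path as a chain of labeled edges.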

The Comparative Analysis of Major Provenance Vocabularies in Linked Data Environment

Ni Jing, Meng Xianxue

New Technology of Library and Information Service. 2013, 29(2): 18-23. https://doi.org/10.11925/infotech.1003-3513.2013.02.03

This paper discusses five provenance vocabularies (DCMI Terms, OPM-O, PV, VoIDP, and PROV-O), then compares their similarities and differences along five dimensions: aim, description, service provision, annotation method, and vocabulary structure. The authors aim to help the provenance research community move toward the adoption and consumption of provenance vocabularies in the linked data environment.

Chinese Term Extraction Based on Improved C-value Method

Hu Apei, Zhang Jing, Liu Junli

New Technology of Library and Information Service. 2013, 29(2): 24-29. https://doi.org/10.11925/infotech.1003-3513.2013.02.04

This paper introduces an improved C-value term extraction method. First, the domain-specific text corpus is preprocessed with a stop word list. Second, a term extraction algorithm based on the co-occurrence frequency of multi-character strings is applied to obtain candidate terms. Finally, terms are selected by termhood computed with IC-value, which improves on C-value with respect to inverse document frequency, meaningless substrings, and term length. An empirical study is conducted on 1,000 abstracts of articles about Hepatitis B. The results indicate that the proposed IC-value performs much better than C-value, TF-IDF, and V-value in both precision and recall. IC-value also performs well in extracting long terms and is very effective at filtering meaningless substrings.
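
For orientation, the baseline that IC-value builds on can be sketched as follows. This is the classic C-value score, not the paper's improved IC-value, whose exact formula the abstract does not give. A candidate that never appears inside a longer candidate scores log2(length) times its frequency; a nested candidate first has the mean frequency of its containing candidates subtracted. The function name and the toy frequencies are assumptions for illustration.

```python
import math

def c_value(candidates):
    """Classic C-value; candidates maps a term (tuple of tokens) to its frequency."""
    def contains(longer, shorter):
        # True if `shorter` occurs as a contiguous sub-sequence of a longer term.
        n = len(shorter)
        return len(longer) > n and any(
            longer[i:i + n] == shorter for i in range(len(longer) - n + 1)
        )

    scores = {}
    for term, freq in candidates.items():
        # Frequencies of longer candidates that contain this term.
        nesting = [f for other, f in candidates.items() if contains(other, term)]
        if nesting:
            freq_adj = freq - sum(nesting) / len(nesting)
        else:
            freq_adj = freq
        # Note: one-token terms score 0 because log2(1) == 0.
        scores[term] = math.log2(len(term)) * freq_adj
    return scores

# Hypothetical frequencies, for illustration only.
toy = {("hepatitis", "b", "virus"): 4, ("hepatitis", "b"): 10}
scores = c_value(toy)
```

For Chinese text the term length could equally be counted in characters rather than tokens; that choice is left open here.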

An Algorithm of Short Text Classification Based on Semi-supervised Learning

Zhang Qian, Liu Huailiang

New Technology of Library and Information Service. 2013, 29(2): 30-35. https://doi.org/10.11925/infotech.1003-3513.2013.02.05

Because short texts have distinctive characteristics and annotating large numbers of unlabeled samples is a bottleneck, traditional text classification algorithms cannot be applied directly. This paper introduces a short text classification method based on semi-supervised learning and builds a semi-supervised classification model. Using an initial classifier, the model performs self-training on the training samples and takes full advantage of the unlabeled portion of the training texts. This addresses the annotation bottleneck, and the classifier performs well. A comparative experiment shows that the semi-supervised short text classification algorithm achieves better classification results.
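
The self-training idea can be sketched in a few lines. This is a generic illustration, not the paper's model: the word-overlap classifier, the confidence threshold, and all function names are assumptions chosen to keep the example self-contained.

```python
from collections import Counter

def train(labeled):
    """Build one word-count profile per class from (text, label) pairs."""
    profiles = {}
    for text, label in labeled:
        profiles.setdefault(label, Counter()).update(text.split())
    return profiles

def predict(profiles, text):
    """Return (label, confidence); confidence is the winner's share of word overlap."""
    words = text.split()
    scores = {label: sum(p[w] for w in words) for label, p in profiles.items()}
    total = sum(scores.values())
    if total == 0:
        return None, 0.0
    best = max(scores, key=scores.get)
    return best, scores[best] / total

def self_train(labeled, unlabeled, threshold=0.8, max_rounds=10):
    """Repeatedly move confidently predicted texts into the labeled set."""
    labeled, pool = list(labeled), list(unlabeled)
    for _ in range(max_rounds):
        profiles = train(labeled)
        confident, rest = [], []
        for text in pool:
            label, conf = predict(profiles, text)
            (confident if conf >= threshold else rest).append((text, label))
        if not confident:
            break  # nothing new to learn from
        labeled.extend(confident)
        pool = [text for text, _ in rest]
    return train(labeled), pool
```

Texts that never reach the threshold stay in the returned pool, so ambiguous samples are left unlabeled rather than reinforcing a wrong guess.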

An Ontology Mapping Method Based on Lexical Similarity Calculation

Xu Jian, Fang An, Hong Na

New Technology of Library and Information Service. 2013, 29(2): 36-42. https://doi.org/10.11925/infotech.1003-3513.2013.02.06

Ontology mapping is one solution to the ontology heterogeneity problem. To address problems that remain in current concept similarity algorithms, the paper proposes an improved method that introduces synonym/homonym search and an edit distance algorithm into the similarity calculation for term head words. A new automatic weight assignment method is also used to combine the similarity values of head words and modifier words. Compared with another classic ontology mapping method of the same type, the improved method is shown to perform better.
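
The edit distance component can be illustrated with the standard Levenshtein algorithm. The normalization into a 0..1 similarity shown here is one common convention and an assumption; the abstract does not state how the paper converts distance to similarity or how it weights head and modifier words.

```python
def edit_distance(a, b):
    """Levenshtein distance computed with a rolling one-row table."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]

def lexical_similarity(a, b):
    """Map edit distance to [0, 1]; 1.0 means identical strings."""
    if not a and not b:
        return 1.0
    return 1.0 - edit_distance(a, b) / max(len(a), len(b))
```

For example, `lexical_similarity("color", "colour")` is 1 - 1/6, since a single insertion separates the two spellings.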

Automatic Abstracting Generating Based on Mobile Short Message Text Information Flow

Liu Jinling, Ni Xiaohong, Wang Xingong

New Technology of Library and Information Service. 2013, 29(2): 43-49. https://doi.org/10.11925/infotech.1003-3513.2013.02.07

Based on the characteristics of mobile short message text streams in practical applications, an automatic digest generation model is designed. The model uses word co-occurrence to define semantic similarity and uses TF-IDF to define the weights of feature words and of candidate abstract sentences. After removing isolated points, the algorithm screens and sorts candidate sentences by weight to generate a digest of the short message stream that is less redundant and more readable. Experiments on relevant data show that the model generates abstracts of better quality and with higher efficiency.
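
The TF-IDF weighting step can be sketched as below. This covers only the sentence-weighting part under simple assumptions (whitespace tokenization, each message treated as one document, natural-log IDF); the co-occurrence similarity and isolated-point removal described in the abstract are not reproduced, and `sentence_weights` is a hypothetical name.

```python
import math
from collections import Counter

def sentence_weights(messages):
    """Score each message by the sum of TF-IDF weights of its distinct words."""
    docs = [m.lower().split() for m in messages]
    n = len(docs)
    # Document frequency: in how many messages each word appears.
    df = Counter(w for d in docs for w in set(d))
    weights = []
    for d in docs:
        if not d:
            weights.append(0.0)
            continue
        tf = Counter(d)
        # TF is normalized by message length; IDF is log(n / df).
        weights.append(sum((tf[w] / len(d)) * math.log(n / df[w]) for w in tf))
    return weights
```

Messages made of words that recur across the stream score low, while messages with rarer words score high; a digest would keep the top-scoring candidates.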

Web Usage Mining Using Reduction of Knowledge Granule

Zhao Jie, Mo Zan, Liu Hongwei, Zhang Shaqing, Dong Zhenning

New Technology of Library and Information Service. 2013, 29(2): 50-56. https://doi.org/10.11925/infotech.1003-3513.2013.02.08

This paper proposes a multi-granularity Web user behavior description model based on granular theory, then applies a reduction algorithm based on knowledge granules to the data. The experimental results show that the model not only describes multi-granularity user behavior characteristics but also achieves horizontal dimension reduction. The reduction algorithm achieves efficient vertical dimension reduction, which effectively reduces the work in subsequent pattern analysis.

Research on Chinese Micro-blog Bursty Topics Detection

Wang Yong, Xiao Shibin, Guo Yixiu, Lv Xueqiang

New Technology of Library and Information Service. 2013, 29(2): 57-62. https://doi.org/10.11925/infotech.1003-3513.2013.02.09

Mining bursty topics from micro-blogs accurately and efficiently is attracting much attention. In this paper, a set of burst terms is extracted by counting term frequency, calculating the growth rate of terms, and using the Term Frequency-Proportional Document Frequency (TF-PDF) algorithm to measure their weight. Micro-blog texts are then described with the burst terms. By analyzing how bursty topics propagate on the micro-blog platform, the authors filter out texts that do not contribute to detecting bursty topics. The paper proposes a novel clustering strategy, "Absolute Clustering", to cluster the micro-blog texts. The hotness of each text is computed from the weighted numbers of replies and retweets, and the top 5 texts are extracted as the result of bursty topic detection. The experiments show a precision of 92.60%, a recall of 85.51%, and an F-measure of 0.89. Comparison with the traditional method demonstrates the validity of the proposed method.
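
The reported figures are mutually consistent if the F-measure is taken to be the balanced F1 score, the harmonic mean of precision and recall. That reading is an assumption, since the abstract does not say which F-measure variant was used. A quick check:

```python
def f_measure(precision, recall):
    """Balanced F-measure (F1): harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

f1 = f_measure(0.9260, 0.8551)
print(round(f1, 2))  # 0.89
```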

Brand Scandal Spillover Monitor Index System Research Based on Micro-blog

Yu Weiping, Yang Yufeng

New Technology of Library and Information Service. 2013, 29(2): 63-69. https://doi.org/10.11925/infotech.1003-3513.2013.02.10

This paper analyzes the information transmission process and the various functions of micro-blogs in light of the brand scandal spillover phenomenon on micro-blogs. Using the I-space model, it builds a monitoring index system for the spread of brand scandal spillover on micro-blogs, composed of a publisher index, an information index, an audience index, and a diffusion index. It then uses the AHP method and the fuzzy evaluation method to compare the spillover of a brand scandal onto different competing brands, which helps enterprise managers identify the brand scandal spillover phenomenon and predict the risk in a scientific way.
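
As a rough illustration of how AHP turns pairwise judgments into index weights, the sketch below uses the row geometric mean approximation. The pairwise comparison matrix is entirely hypothetical and does not come from the paper; it only shows the mechanics for four indices in the order publisher, information, audience, diffusion.

```python
import math

def ahp_weights(matrix):
    """Approximate AHP priority weights by normalized row geometric means."""
    n = len(matrix)
    geo_means = [math.prod(row) ** (1.0 / n) for row in matrix]
    total = sum(geo_means)
    return [g / total for g in geo_means]

# Hypothetical judgments: entry [i][j] says how much more important
# index i is than index j (reciprocal matrix, perfectly consistent here).
PAIRWISE = [
    [1,     2,     4, 4],
    [1 / 2, 1,     2, 2],
    [1 / 4, 1 / 2, 1, 1],
    [1 / 4, 1 / 2, 1, 1],
]

weights = ahp_weights(PAIRWISE)
```

Because this toy matrix is perfectly consistent, the weights come out as 0.5, 0.25, 0.125, and 0.125. Real expert judgments are rarely consistent, so a full AHP workflow would also compute a consistency ratio before accepting the weights.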

Design and Implementation of Multi-language Interface in SULCMIS OPAC

Hu Zhenning, Yang Wei, Ding Pei, Lin Weiming, Wu Yuanye

New Technology of Library and Information Service. 2013, 29(2): 70-76. https://doi.org/10.11925/infotech.1003-3513.2013.02.11

Internationalization of the user interface is a basic requirement for modern software. In developing its new OPAC, the Shenzhen University Library Computer Management Integrated System (SULCMIS) builds separate language resource files and loads them dynamically to internationalize the OPAC. The paper describes the implementation of the SULCMIS OPAC multi-language interface, including the design concept, system architecture, core module design, system workflow, and language resource files.
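
The general pattern of per-language resource files loaded on demand can be sketched as follows. This is not SULCMIS code: the JSON file layout (`en.json`, `zh.json`), the `Translator` class, and the fallback order are assumptions made for illustration.

```python
import json
from pathlib import Path

class Translator:
    """Loads per-language JSON resource files on demand and caches them."""

    def __init__(self, resource_dir, default_lang="en"):
        self.resource_dir = Path(resource_dir)
        self.default_lang = default_lang
        self._cache = {}

    def _load(self, lang):
        # Read <resource_dir>/<lang>.json the first time a language is requested.
        if lang not in self._cache:
            path = self.resource_dir / f"{lang}.json"
            if path.is_file():
                self._cache[lang] = json.loads(path.read_text(encoding="utf-8"))
            else:
                self._cache[lang] = {}
        return self._cache[lang]

    def text(self, key, lang):
        """Look up key in lang, then in the default language, else return the key."""
        for candidate in (lang, self.default_lang):
            value = self._load(candidate).get(key)
            if value is not None:
                return value
        return key
```

Falling back to the default language, and finally to the key itself, keeps the interface usable when a translation file is incomplete.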

Design and Implementation of Role Management in Collaborative System

Li Yazi, Sun Haixia, Jiang Jun, Qian Qing

New Technology of Library and Information Service. 2013, 29(2): 77-81. https://doi.org/10.11925/infotech.1003-3513.2013.02.12

This paper studies a resource management model for access control based on a resource-user metric, along with the user management functions of collaborative systems. It then defines concepts such as role, right, and task, designs the modules of a role management subsystem, and details the logical process by which role management controls access to resources. Finally, it develops a prototype role management system, assigns 4 types of roles according to project needs, and applies the roles to a system for constructing a science knowledge organization system.
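
The relationship between roles and rights can be illustrated with a minimal role-based check. The abstract does not name the four roles or their rights, so every role name, right name, and the `can_access` helper below is hypothetical.

```python
# Hypothetical roles and rights; the abstract does not name the four roles.
ROLE_RIGHTS = {
    "administrator": {"read", "edit", "review", "manage_users"},
    "reviewer": {"read", "review"},
    "editor": {"read", "edit"},
    "visitor": {"read"},
}

def can_access(user_roles, required_right, role_rights=ROLE_RIGHTS):
    """Return True if any of the user's roles grants the required right."""
    return any(
        required_right in role_rights.get(role, set()) for role in user_roles
    )
```

Rights attach to roles rather than to individual users, so changing what a role may do updates every user holding that role.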

Design and Implementation of the Software to Auto-generate Sci-tech Novelty Search Report

Li Guangli, Li Shuning

New Technology of Library and Information Service. 2013, 29(2): 82-87. https://doi.org/10.11925/infotech.1003-3513.2013.02.13

Following the model and format of the sci-tech novelty search report, the authors design software that generates such reports automatically. Based on an analysis of functional requirements, the paper presents the general design of the software, which is developed in C# in the Visual Studio environment, and describes the implementation of the interface, documentation, and code in detail. The software improves the efficiency of sci-tech novelty search through functions such as automatic text generation in the standard format, automatic extraction of search terms, and direct database selection.

Design and Implementation of Online Patent Analysis System Based on Solr

Liu Chunjiang, Liu Danjun, Wen Yi

New Technology of Library and Information Service. 2013, 29(2): 88-92. https://doi.org/10.11925/infotech.1003-3513.2013.02.14

To meet the demand of patent analysts and technical professionals for real-time, online searching and analysis of patent information, this paper implements an online patent analysis system based on Solr. The paper describes the system's architecture, designs index fields suited to the system functions and various patent analysis indicators, introduces function modules such as patent search, subject management, and patent analysis, and displays various visualizations of the analyzed data. Tests and application results show that the system helps patent analysts and technical professionals analyze patent information productively and quickly.
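
Counting patents per value of an indexed field is the kind of analysis that Solr's faceting supports. The sketch below only builds a request URL for Solr's standard `select` handler with the `facet` and `facet.field` parameters; the host, the core name `patents`, and the field names `title` and `applicant` are hypothetical, and the paper's actual schema and queries are not given in the abstract.

```python
from urllib.parse import urlencode

def build_facet_query(base_url, core, query, facet_field, rows=0):
    """Build a Solr select URL that returns facet counts for one field."""
    params = [
        ("q", query),                 # main query
        ("rows", rows),               # 0 = counts only, no documents
        ("facet", "true"),            # enable faceting
        ("facet.field", facet_field), # field to count values of
        ("wt", "json"),               # response format
    ]
    return f"{base_url}/{core}/select?{urlencode(params)}"

# Hypothetical host, core, and field names.
url = build_facet_query(
    "http://localhost:8983/solr", "patents", "title:battery", "applicant"
)
```

Sending this request to a running Solr instance would return, for each applicant, the number of matching patents, which a front end can plot directly as a bar chart.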

25 February 2013, Volume 29 Issue 2
