Data Analysis and Knowledge Discovery

Study on General Metadata Application Rules for Digital Library

Shen Yunyun, Xiao Long, Feng Ying

New Technology of Library and Information Service. 2010, 26(12): 1-8. https://doi.org/10.11925/infotech.1003-3513.2010.12.01

This paper discusses how to build general metadata application rules for Chinese digital libraries. It aims to resolve how metadata is applied in Chinese digital libraries by developing a series of related metadata standards, criteria, and platforms that meet the requirements of describing, organizing, managing, serving, and preserving Chinese digital objects. Drawing on the work of DCMI and other leading international metadata projects, it also presents the metadata application principles and framework, the mechanism for metadata openness and interoperability, and the metadata application workflows. The authors seek best practices for applying metadata in the development of digital libraries in China.

Digitalization Standards and the Applications of Objects Resources

Zhang Chunhong, Tang Yong, Shao Ke

New Technology of Library and Information Service. 2010, 26(12): 9-14. https://doi.org/10.11925/infotech.1003-3513.2010.12.02

This article introduces selected contents of the digitalization standard for object resources and its guideline, along with their application. Particular attention is paid to the metadata standard and to the relationship between naming rules and DOI, among other topics. The authors aim to provide theoretical and practical references for the digitalization of library resources.

User Behavior Clustering for Creation of Personas

Sun Minjie, Wu Zhenxin

New Technology of Library and Information Service. 2010, 26(12): 15-20. https://doi.org/10.11925/infotech.1003-3513.2010.12.03

Drawing on the persona approach to user modeling in human-computer interaction design, the authors analyze user behavior logs from an institutional repository and use the K-means clustering method to identify user behavior patterns, classify user groups, and create quantitative persona-feature matrix models for the institutional repository.
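
The clustering step named in this abstract can be illustrated with a minimal K-means sketch. It is not the authors' implementation: the two per-user features (searches and downloads per session), the sample values, and the choice to start the centroids at the first k points are all assumptions made for illustration.

```python
from math import dist

def kmeans(points, k, iterations=100):
    """Plain K-means (Lloyd's algorithm).

    Assumption for illustration: centroids start at the first k points.
    """
    centroids = [tuple(p) for p in points[:k]]
    labels = []
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [min(range(k), key=lambda c: dist(p, centroids[c])) for p in points]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(tuple(sum(x) / len(members) for x in zip(*members)))
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # converged: no centroid moved
        centroids = new_centroids
    return labels, centroids

# Hypothetical log-derived features per user: (searches, downloads) per session.
users = [(1.0, 0.0), (9.0, 8.0), (2.0, 1.0), (1.5, 0.5), (8.0, 9.0), (10.0, 10.0)]
labels, centroids = kmeans(users, k=2)
```

Each resulting centroid can be read as one row of a persona-feature matrix: the average feature values of the users assigned to that group.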

Developing Subject Knowledge Environments Based on Vitro

Liu Yi, Song Wen, Tang Yijie, Yang Rui, Huang Jinxia, Zhou Zijian

New Technology of Library and Information Service. 2010, 26(12): 21-27. https://doi.org/10.11925/infotech.1003-3513.2010.12.04

This paper introduces the background of the Subject Knowledge Environment (SKE) platform. Then, by analyzing the composition and functional characteristics of Vitro, the authors propose a design solution for the SKE platform. The main methods for localizing Vitro are also described.

English Term Extraction Based on Context Analysis & Statistical Characteristic

Xu Deshan, Zhang Zhixiong, Wang Feng, Xing Meifeng

New Technology of Library and Information Service. 2010, 26(12): 28-33. https://doi.org/10.11925/infotech.1003-3513.2010.12.05

The article first introduces the basic features of terms and discusses methods for automatically identifying scientific terms. It then proposes V-value, which improves on the two main statistical indicators, TF-IDF and C-value, according to text characteristics. Candidate terms are also assigned different weights according to their position, to reflect the effect of position. Finally, a term extraction system is implemented based on statistics and rules. Because the system combines the position weight, C-value, and TF-IDF, it achieves higher extraction precision.
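
For readers unfamiliar with C-value, one of the two indicators this abstract builds on, the sketch below computes the standard C-value score for a handful of candidate terms. It does not reproduce the paper's V-value or its position weights, which the abstract does not specify. The candidate terms and their frequencies are hypothetical.

```python
from math import log2

def contains(longer, shorter):
    """True if `shorter` occurs as a contiguous word sequence inside `longer`."""
    n, m = len(longer), len(shorter)
    return m < n and any(longer[i:i + m] == shorter for i in range(n - m + 1))

def c_value(freq):
    """Standard C-value for each candidate term (tuple of words -> frequency)."""
    scores = {}
    for term, f in freq.items():
        # Longer candidate terms that contain this term as a nested string.
        parents = [t for t in freq if contains(t, term)]
        if parents:
            # Discount occurrences that belong to longer terms.
            f -= sum(freq[t] for t in parents) / len(parents)
        scores[term] = log2(len(term)) * f
    return scores

# Hypothetical candidate terms with made-up corpus frequencies.
candidates = {
    ("digital", "library"): 7,
    ("chinese", "digital", "library"): 5,
    ("chinese", "digital", "library", "metadata"): 2,
}
scores = c_value(candidates)
```

Note that the logarithmic length factor gives single-word candidates a score of zero, which is why C-value is normally applied to multi-word terms.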

The Subject Extraction Based on Topic Segmentation and PageRank Algorithm

Duan Xiaoli, Wang Yu

New Technology of Library and Information Service. 2010, 26(12): 34-39. https://doi.org/10.11925/infotech.1003-3513.2010.12.06

To make subject extraction more complete, this paper first divides the text by theme, reconstructs a sentence relation graph for each theme block, and then ranks the sentences with the PageRank algorithm. The sentence with the highest weight is taken as the topic sentence. Experiments show that the topic sentence extraction algorithm covers the full text well.
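
The ranking step can be sketched for a single theme block as follows. The abstract does not say how sentence relations are weighted, so the Jaccard word-overlap measure, the damping factor of 0.85, the fixed iteration count, and the sample sentences are all assumptions; this is an illustration of PageRank over a sentence graph, not the authors' algorithm.

```python
def sentence_graph(sentences):
    """Weight each sentence pair by the Jaccard overlap of their word sets (assumed measure)."""
    bags = [set(s.lower().split()) for s in sentences]
    n = len(bags)
    weights = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j and bags[i] | bags[j]:
                weights[i][j] = len(bags[i] & bags[j]) / len(bags[i] | bags[j])
    return weights

def pagerank(weights, damping=0.85, iterations=50):
    """Weighted PageRank by power iteration.

    Simplification: sentences with no links pass on no rank.
    """
    n = len(weights)
    out_sum = [sum(row) for row in weights]
    rank = [1.0 / n] * n
    for _ in range(iterations):
        new_rank = []
        for j in range(n):
            incoming = sum(
                rank[i] * weights[i][j] / out_sum[i]
                for i in range(n)
                if out_sum[i] > 0
            )
            new_rank.append((1 - damping) / n + damping * incoming)
        rank = new_rank
    return rank

# Hypothetical sentences standing in for one theme block.
block = [
    "metadata standards support digital libraries",
    "metadata standards need rules",
    "digital libraries serve users",
    "images are retrieved online",
]
ranks = pagerank(sentence_graph(block))
best = max(range(len(block)), key=lambda i: ranks[i])
topic_sentence = block[best]
```

Here the first sentence shares words with two others and so receives the highest rank, while the unrelated last sentence receives the lowest.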

Research on Key Behaviors of Image Retrieval on Internet

Cao Mei

New Technology of Library and Information Service. 2010, 26(12): 40-45. https://doi.org/10.11925/infotech.1003-3513.2010.12.07

Given the absence of domestic research on image retrieval behavior, this paper designs a user experiment in which the image retrieval process is recorded with behavior tracking technology so that the key behaviors can be analyzed. Findings on image retrieval strategies, characteristics, and user psychology are discussed from several perspectives, including behavior distribution, browsing or searching, page turning, and relevance judgment. Finally, some suggestions for networked image retrieval systems are provided.

The Method of Patent Data Approximately Duplicate Attributes and Records Detecting Based on IRPU Algorithm

Lei Xiaoping, Zhang Xu, Zhao Yunhua, Zheng Jia

New Technology of Library and Information Service. 2010, 26(12): 46-51. https://doi.org/10.11925/infotech.1003-3513.2010.12.08

Focusing on the patent data field and taking into account the characteristics of patent documents and the requirements of patent analysis, this paper proposes the IRPU algorithm, an improved method for detecting approximately duplicate attributes and records in patent data that builds on the RFMA and PCM algorithms. The IRPU algorithm is then applied to patent data to detect duplicates in the inventor attribute and in whole records. Experimental comparison with previous work indicates that the proposed method suits the patent data field and achieves higher identification accuracy.

A Tentative Study of Disjoint Literature Discovery Based on Transitive Closure ——Take Cancer Drug Target for Example

Yang Yuan, Gao Liubin

New Technology of Library and Information Service. 2010, 26(12): 52-57. https://doi.org/10.11925/infotech.1003-3513.2010.12.09

Based on the principle of disjoint literature knowledge discovery, the transitive closure from discrete mathematics is applied to find potential associations among drug targets, which confirms that disjoint literature knowledge discovery based on transitive closure is achievable and effective. In addition, the paper extends the original three-step model into a multi-step knowledge discovery model, which finds more potential associations while maintaining relatively high precision and recall.
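
The core idea can be sketched with Warshall's algorithm: if the literature links A to B and B to C, the closure also links A to C, and any link that appears only in the closure is a candidate hidden association. The entity names and links below are placeholders, and treating associations as directed is an assumption; the paper's actual data and model are not reproduced here.

```python
def transitive_closure(nodes, edges):
    """Warshall's algorithm on a directed relation given as (source, target) pairs."""
    reach = {a: {b: False for b in nodes} for a in nodes}
    for a, b in edges:
        reach[a][b] = True
    for k in nodes:  # allow k as an intermediate step
        for i in nodes:
            if reach[i][k]:
                for j in nodes:
                    if reach[k][j]:
                        reach[i][j] = True
    return reach

# Placeholder entities; each pair stands for an association reported in some paper.
nodes = ["A", "B", "C", "D"]
direct = [("A", "B"), ("B", "C"), ("C", "D")]

closure = transitive_closure(nodes, direct)
# Associations implied by chaining, but not stated directly in any single source.
inferred = {
    (a, b)
    for a in nodes
    for b in nodes
    if closure[a][b] and (a, b) not in direct
}
```

The pair ("A", "C") corresponds to the classic three-step case, while ("A", "D") requires a longer chain, which is the kind of association a multi-step model adds.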

Study on Noun Phrase of “N₁+N₂” Structure in Search Engine Query Logs

Liu Zhijie, Lv Xueqiang, Cheng Tao

New Technology of Library and Information Service. 2010, 26(12): 58-63. https://doi.org/10.11925/infotech.1003-3513.2010.12.10

Based on query logs, this paper gives a comprehensive description of the form of “N₁+N₂” structure noun phrases according to the characteristics of the corpus itself, including the characteristics of each element and its syntactic function. It also gives basic methods for mining and proofreading this type of noun phrase. Through analysis of the experimental results, the authors further illustrate that the study of phrases is important for search engines.

Application of Knowledge Discovery Based on Wanfang Data (2003-2007)

Xie Jing, Jiang Lan, Wang Dongbo, Su Xinning

New Technology of Library and Information Service. 2010, 26(12): 64-69. https://doi.org/10.11925/infotech.1003-3513.2010.12.11

The paper performs an association analysis of authors, affiliations, and documents based on data for papers published in Chinese periodicals in Wanfang Data (2003-2007). The analysis helps reveal latent relationships among authors, affiliations, and documents. An effective entity recognition method is also proposed to improve the accuracy of the association analysis. The application is intended to serve as a basis for further semantic retrieval.

Design and Implementation of Mobile Search Service for Heterogeneous Electronic Resources Based on MetaLib X-Server

Zhang Bei, Dou Tianfang, Zhang Chengyu, Qi Pizhi

New Technology of Library and Information Service. 2010, 26(12): 70-75. https://doi.org/10.11925/infotech.1003-3513.2010.12.12

To make full use of its collections, Tsinghua University Library has opened a new avenue for mobile services, building on its earlier work on integrated searching of electronic resources. The library implements a mobile search service for heterogeneous electronic resources based on the MetaLib system and its X-Server interface. The service consists of three key components (UI customization, search service, and status monitoring) and provides mobile users with a continuously available retrieval service for heterogeneous resources. This paper describes the technical implementation of the key components in detail.

Research on Automatic Archiving System for Institutional Repositories

Cui Yuhong

New Technology of Library and Information Service. 2010, 26(12): 76-80. https://doi.org/10.11925/infotech.1003-3513.2010.12.13

This paper introduces an experimental system (DAAS) that automatically harvests articles by the institution's researchers and ingests their metadata into the local DSpace platform. The system implements a semi-automatic approach to populating IRs that consists of information filtering, metadata extraction, copyright verification, metadata mapping, and data archiving. Building on a key Nutch component, the paper describes in detail how URLs are parsed and metadata is extracted from unstructured Web pages according to a rule-based filter. Future research will focus on machine learning algorithms.

Design of the Reading System for Card Identity Authentication Based on Image Recognition

Zhu Guang, Yang Yongyue

New Technology of Library and Information Service. 2010, 26(12): 81-85. https://doi.org/10.11925/infotech.1003-3513.2010.12.14

This article briefly reviews the state of identity authentication research. With reference to the traditional smart card, it presents a method that uses a high-speed CCD to capture card images without human intervention and then applies image processing and pattern recognition methods to read the card information. Finally, the cardholder's identity information is compared with the database to complete identity authentication.

Programming Assisted Tools of ILAS by E-language

Zhu Yuqiang

New Technology of Library and Information Service. 2010, 26(12): 86-88. https://doi.org/10.11925/infotech.1003-3513.2010.12.15

Drawing on the actual work of the Library of Shandong Normal University, this article presents ways to program auxiliary tools for ILAS. The tools automatically close ILAS message boxes and enable the ILAS query module to work automatically and continuously. These tools can improve librarians' efficiency.

25 December 2010, Volume 26 Issue 12