Data Analysis and Knowledge Discovery

User-Driven Construction in National Science Library Website

Wu Zhenxin,Zhang Zhixiong,Zhang Xiaolin,Liu Xiwen

New Technology of Library and Information Service. 2008, 24(3): 1-6. https://doi.org/10.11925/infotech.1003-3513.2008.03.01

After elaborating on the difficulties users face with the website of the National Science Library (NSL), the authors present principles of library website development, such as user-driven design, simplicity, convenience, integration and modularity. They design a service layer based on user information processes and a supporting layer with a service knowledge base and an SRU encapsulating engine, and develop functions such as integrated resource search, user-driven process composition, process-driven service navigation and context-sensitive help information.

Approaches to implement Services-embedded Desktop Information Tools

Le Xiaoqiu,Li Yu,Zhang Xiaolin,Zhang Zhixiong,Li Chunwang

New Technology of Library and Information Service. 2008, 24(3): 7-11. https://doi.org/10.11925/infotech.1003-3513.2008.03.02

This paper introduces a services-embedded desktop information system, which brings science and technology literature services to the desktop by tracking the user’s operations and ongoing workflow context. The paper also presents the design ideas and implementation approaches of the system.

Survey of Cognition and Requirement of Service on Institutional Repository in Academy

Han Ke,Zhu Zhongming

New Technology of Library and Information Service. 2008, 24(3): 12-17. https://doi.org/10.11925/infotech.1003-3513.2008.03.03

Based on a questionnaire on “cognition and requirement of IR” completed by researchers working in the Chinese Academy of Sciences, the authors analyze the current state of the institutional repository in the academy. They then point out its deficiencies and outline plans for the next stage of work.

An ACE View of the Development Tendency of Information Extraction Technology

Zhao Qi,Liu Jianhua,Feng Haoran

New Technology of Library and Information Service. 2008, 24(3): 18-23. https://doi.org/10.11925/infotech.1003-3513.2008.03.04

This paper introduces the general situation and the development history of ACE. Based on changes in its evaluation tasks, teams, corpora and results, it analyzes the state of the art in information extraction and then offers some thoughts on future directions.

A Survey of the Research on Information Extraction over Web Tables

Zhao Hong,Xiao Hong,Xue Dejun,Shi Qinghui

New Technology of Library and Information Service. 2008, 24(3): 24-31. https://doi.org/10.11925/infotech.1003-3513.2008.03.05

This paper first introduces the characteristics and structure of Web tables and describes the process of information extraction over Web tables. It then analyzes four key technologies: Web table detection, Web table structure recognition, Web table interpretation and presentation of table extraction. It also analyzes applications of the research, points out problems in current research, and finally presents prospects for its future.

Research on Calculation of Trust Degree and Design of Trust Transfer Protocol

Xu Xin,Zhai Xiaojuan

New Technology of Library and Information Service. 2008, 24(3): 32-39. https://doi.org/10.11925/infotech.1003-3513.2008.03.06

This paper presents a method for calculating the degree of trust between entities in a Web environment. It also designs a trust transfer protocol that uses XML to represent messages and an XML-based encryption protocol.

Rule-based Automatic Annotating for the Discourse of English Complicated Sentences

Shen Chunyan,Wang Huilin

New Technology of Library and Information Service. 2008, 24(3): 40-44. https://doi.org/10.11925/infotech.1003-3513.2008.03.07

This paper introduces Finite State Transducer technology and, drawing on the ideas behind the development of the Penn Treebank, simplifies complicated sentences through rule analysis and the combined use of POS tagging, recognition of discourse connectives, punctuation, vocabulary mapping and chunking. The final results are expressed in the form of propositions.

Research on Large-scale URL Filter Based on Bloom Filter

Ding Zhenguo,Wu Baogui,Xin Youqiang

New Technology of Library and Information Service. 2008, 24(3): 45-50. https://doi.org/10.11925/infotech.1003-3513.2008.03.08

Where some error is tolerable, the Bloom Filter and its improved algorithms can be used to filter out pages with the same URL through URL hashing. Experiments show that satisfactory results can be achieved through reasonable adjustment of its parameters.
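
As a rough illustration of the general technique named in this abstract, the following is a minimal sketch of URL filtering with a plain Bloom filter. It is not the authors' implementation and does not reproduce their improved algorithm. The class and function names (`BloomFilter`, `filter_new_urls`), the SHA-256 double-hashing scheme, and the example parameters are all assumptions made for illustration. The sizing uses the standard formulas for the number of bits and hash functions given an expected item count and a target false-positive rate.

```python
import hashlib
import math


class BloomFilter:
    """Plain Bloom filter; bit positions come from double hashing a SHA-256 digest."""

    def __init__(self, expected_items: int, false_positive_rate: float) -> None:
        # Standard sizing: m = -n * ln(p) / (ln 2)^2 bits, k = (m / n) * ln 2 hashes.
        ln2 = math.log(2)
        self.num_bits = max(
            1, math.ceil(-expected_items * math.log(false_positive_rate) / ln2**2)
        )
        self.num_hashes = max(1, round(self.num_bits / expected_items * ln2))
        self.bits = bytearray((self.num_bits + 7) // 8)

    def _positions(self, url: str):
        # Derive k positions from two 64-bit halves of one digest (hypothetical scheme).
        digest = hashlib.sha256(url.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # force a non-zero step
        for i in range(self.num_hashes):
            yield (h1 + i * h2) % self.num_bits

    def add(self, url: str) -> None:
        for pos in self._positions(url):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, url: str) -> bool:
        # True means "probably seen" (false positives possible); False is always exact.
        return all(
            self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(url)
        )


def filter_new_urls(urls, seen: BloomFilter):
    """Return the URLs that the filter has not (probably) seen, marking each as seen."""
    fresh = []
    for url in urls:
        if url not in seen:
            seen.add(url)
            fresh.append(url)
    return fresh


if __name__ == "__main__":
    seen = BloomFilter(expected_items=1000, false_positive_rate=0.01)
    crawl_queue = ["http://a.example/", "http://b.example/", "http://a.example/"]
    print(filter_new_urls(crawl_queue, seen))
```

The trade-off is the one the abstract points to: a lower target false-positive rate means more bits and more hash computations per URL, so the parameters have to be tuned to the size of the URL set.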

An Algorithm for Noise Reduction in Web Pages Based on a Group of Content-related Rules

Wang Jiandong,Wang Jimin,Tian Feijia

New Technology of Library and Information Service. 2008, 24(3): 51-54. https://doi.org/10.11925/infotech.1003-3513.2008.03.09

This paper presents a new algorithm for eliminating noise in Web pages based on a group of content-related rules. First, it presents an algorithm that peels off noise by iteratively comparing the tables on the same level of the page’s table tree. Next, it presents an algorithm for evaluating the topic similarity of anchor text to the content of the page. Because the new algorithm takes the semantics of the pages into consideration to some extent, it achieves higher accuracy than purely rule-based algorithms while keeping time complexity fairly low. Experiments indicate that the algorithm is very effective at cleaning large numbers of Web pages.

An Algorithm for Detecting Duplicated Chinese Web News Based on Suffix Tree

Qian Aibing,Jiang Lan

New Technology of Library and Information Service. 2008, 24(3): 55-61. https://doi.org/10.11925/infotech.1003-3513.2008.03.10

In view of the shortcomings of traditional methods for analyzing public opinion, this paper proposes a new approach to public opinion analysis on the Web and designs a model for it. Experiments show that the proposed model is an effective solution for analyzing public opinion on the Web.

Design and Realization of Knowledge Element Automatic Extraction of Network Special Subject Knowledge Organization

Tan Chunmei,Yan Shiwei,Liu Zimu

New Technology of Library and Information Service. 2008, 24(3): 62-67. https://doi.org/10.11925/infotech.1003-3513.2008.03.11

Using the Visual Studio .NET development platform, the C# programming language, and XML for knowledge description and data storage, a knowledge element automatic extraction system for network special subject knowledge organization has been designed and developed. The paper studies the design and development of the system's main functions, such as text information preprocessing, fast self-increasing word segmentation based on connection patterns of Chinese characters, and accurate full-text word frequency statistics.

Research on Non-Plain Text Visualization Based on MathML

Yang Zhiqin

New Technology of Library and Information Service. 2008, 24(3): 68-72. https://doi.org/10.11925/infotech.1003-3513.2008.03.12

It is very difficult for librarians to process the mass of non-plain-text information on the Web. This study uses MathML to solve problems concerning the input and output of math formulas and some special data, so that such information can be retrieved and used. The results provide a new method for visualizing non-plain text, such as math formulas, on the Web.

Study About Government Ontology Construction Based on the E-government Thesauri

Zhao Dongxia,Zhao Xinli

New Technology of Library and Information Service. 2008, 24(3): 73-77. https://doi.org/10.11925/infotech.1003-3513.2008.03.13

Drawing on a study of the development and application of ontology at home and abroad and on the achievements of the E-Government Thesauri, this paper puts forward a method for constructing an e-government ontology and gives a demonstration.

The Improvement in a Chinese Word Segmentation Based on Hash Algorithm

Yao Xingshan

New Technology of Library and Information Service. 2008, 24(3): 78-81. https://doi.org/10.11925/infotech.1003-3513.2008.03.14

This paper introduces a new algorithm for Chinese word segmentation based on a new data structure for the Chinese dictionary. Theory and experiments show that this data structure is much more efficient.

Thinking and Realization of Web MARC Editor Based on Ajax

Su Dongchu

New Technology of Library and Information Service. 2008, 24(3): 82-85. https://doi.org/10.11925/infotech.1003-3513.2008.03.15

This paper discusses why it is difficult to migrate a MARC editor from the conventional C/S mode to the B/S mode, and provides a method for resolving the problem using Ajax.

Personal Name Identification in the Practice of Digital Repositories

Jingfeng Xia

New Technology of Library and Information Service. 2008, 24(3): 87-94. https://doi.org/10.11925/infotech.1003-3513.2008.03.16

Purpose: To propose improvements to the identification of authors’ names in digital repositories.
Design/methodology/approach: Analysis of current name authorities in digital resources, particularly in digital repositories, and analysis of some features of existing repository applications.
Findings: This paper finds that the variations of authors’ names have negatively affected the retrieval capability of digital repositories. Two possible solutions include using composite identifiers and asking authors to supply the variants of their name, if any, at the time of depositing articles.
Originality/value: This is the first time that the approach of authors self-depositing their name variations has been proposed. This approach will be able to reduce confusion in name identification.

DINI Institutional Repository Certification and Beyond

Susanne Dobratz,Frank Scholze,Wu Xia(Compile)

New Technology of Library and Information Service. 2008, 24(3): 95-102. https://doi.org/10.11925/infotech.1003-3513.2008.03.17

Purpose: Overview of certification of institutional repositories as a means to support Open Access in Germany, and description of the DINI Certificate 2006 developed by DINI, the German Initiative for Networked Information.
Design/methodology/approach: The DINI Certificate for Document and Publication Repositories shows potential users and authors of digital documents that a certain level of quality in operating the repository is guaranteed and that this distinguishes it from common institutional Web servers. The Certificate can also be used as an instrument to support Open Access.
Findings: Repository certification will not be the main factor in achieving open access to academic information globally, but it can support the spread of institutional repositories and enhance the visibility of the “Institutional Repository” service.
Research limitations/implications: The DINI Certificate, as a “soft” certificate, aims at interoperability of digital repositories; the coaching idea prevails. It does not provide an exhaustive auditing tool for trusted digital long-term preservation archives.
Practical implications: The DINI Certificate for Document and Publication Repositories pushed the development of institutional repositories in Germany according to certain organisational and technical standards and contributes to interoperability among digital repositories worldwide.
Originality/value: This paper describes a unique approach that has been implemented in Germany and could be transferred to other countries and communities.

25 March 2008, Volume 24 Issue 3
