Library catalogues contain an enormous amount of structured, high-quality data; however, this data is generally not made available to Semantic Web applications. In this paper we describe the tools and techniques used to make the Swedish Union Catalogue (LIBRIS) part of the Semantic Web and Linked Data. The focus is on links to and between resources and on the mechanisms used to make data available, rather than on perfect description of the individual resources. We also present a method of creating links between records of the same work.
A technique for converting Library of Congress Subject Headings MARCXML to Simple Knowledge Organization System (SKOS) RDF is described. Strengths of the SKOS vocabulary are highlighted, as well as possible points for extension, and the integration of other semantic web vocabularies such as Dublin Core. An application for making the vocabulary available as linked data on the Web is also described.
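As a minimal illustration of such a conversion, the sketch below maps a few MARCXML authority fields to SKOS properties (150 $a to skos:prefLabel, 450 $a to skos:altLabel, 550 with $w = g to skos:broader). The sample record, base URI, and Turtle-like output are assumptions for illustration, not the paper's actual pipeline:

```python
import xml.etree.ElementTree as ET

MARC_NS = "http://www.loc.gov/MARC21/slim"

# A minimal, hypothetical MARCXML authority record used as sample input.
marcxml = """<record xmlns="http://www.loc.gov/MARC21/slim">
  <controlfield tag="001">sh85002415</controlfield>
  <datafield tag="150"><subfield code="a">Aerodynamics</subfield></datafield>
  <datafield tag="450"><subfield code="a">Aerodynamic theory</subfield></datafield>
  <datafield tag="550">
    <subfield code="w">g</subfield><subfield code="a">Dynamics</subfield>
  </datafield>
</record>"""

def marc_to_skos(xml_text, base="http://id.loc.gov/authorities/subjects/"):
    ns = {"m": MARC_NS}
    rec = ET.fromstring(xml_text)
    # 001 -> concept URI (base URI is an assumption for illustration)
    ident = rec.findtext("m:controlfield[@tag='001']", namespaces=ns).strip()
    lines = [f"<{base}{ident}> a skos:Concept ;"]
    # 150 $a -> skos:prefLabel
    pref = rec.findtext("m:datafield[@tag='150']/m:subfield[@code='a']", namespaces=ns)
    lines.append(f'    skos:prefLabel "{pref}"@en ;')
    # 450 $a -> skos:altLabel
    for alt in rec.findall("m:datafield[@tag='450']/m:subfield[@code='a']", namespaces=ns):
        lines.append(f'    skos:altLabel "{alt.text}"@en ;')
    # 550 with $w = 'g' -> skos:broader (emitted as a labeled blank node here)
    for df in rec.findall("m:datafield[@tag='550']", namespaces=ns):
        if df.findtext("m:subfield[@code='w']", namespaces=ns) == "g":
            broader = df.findtext("m:subfield[@code='a']", namespaces=ns)
            lines.append(f'    skos:broader [ skos:prefLabel "{broader}"@en ] ;')
    lines[-1] = lines[-1].rstrip(" ;") + " ."
    return "\n".join(lines)
```

A real conversion would also resolve 550 headings to their own concept URIs rather than emitting blank nodes.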
OCLC’s Crosswalk Web Service (Godby, Smith and Childress, 2008) formalizes the notion of crosswalk, as defined in Gill,et al. (n.d.), by hiding technical details and permitting the semantic equivalences to emerge as the centerpiece. One outcome is that metadata experts, who are typically not programmers, can enter the translation logic into a spreadsheet that can be automatically converted into executable code. In this paper, we describe the implementation of the Dublin Core Terms application profile in the management of crosswalks involving MARC. A crosswalk that encodes an application profile extends the typical format with two columns: one that annotates the namespace to which an element belongs, and one that annotates a ‘broadernarrower’ relation between a pair of elements, such as Dublin Core coverage and Dublin Core Terms spatial. This information is sufficient to produce scripts written in OCLC’s Semantic Equivalence Expression Language (or Seel), which are called from the Crosswalk Web Service to generate production-grade translations. With its focus on elements that can be mixed, matched, added, and redefined, the application profile (Heery and Patel, 2000) is a natural fit with the translation model of the Crosswalk Web Service, which attempts to achieve interoperability by mapping one pair of elements at a time.
Contemporary retrieval systems, which search across collections, usually ignore collection-level metadata. Alternative approaches, exploiting collection-level information, will require an understanding of the various kinds of relationships that can obtain between collection-level and item-level metadata. This paper outlines the problem and describes a project that is developing a logic-based framework for classifying collection/item metadata relationships. This framework will support (i) metadata specification developers defining metadata elements, (ii) metadata creators describing objects, and (iii) system designers implementing systems that take advantage of collection-level metadata. We present three examples of collection/item metadata relationship categories: attribute/value-propagation, value-propagation, and value-constraint, and show that even in these simple cases a precise formulation requires modal notions in addition to first-order logic. These formulations are related to recent work in information retrieval and ontology evaluation.
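As an illustrative sketch only (not the paper's actual formalization, and with hypothetical predicate names), a value-propagation relationship, where a value asserted of a collection carries over to each of its items, might be written in first-order terms with a modal necessity operator expressing that the propagation holds by definition rather than contingently:

```latex
% Value-propagation (illustrative sketch; predicate names are hypothetical):
% if attribute A of collection c has value v, then every item i gathered
% into c bears value v for the corresponding item-level attribute A'.
\Box\, \forall c\, \forall i\, \forall v\;
  \bigl( \mathit{InCollection}(i, c) \wedge A(c, v) \rightarrow A'(i, v) \bigr)
```

The modal operator is what plain first-order logic lacks here: without it, the formula states only that propagation happens to hold in the actual state of the records, not that it must hold in every admissible state.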
There is a growing interest in how we represent and share tagging data in collaborative tagging systems. Conventional tags, meaning freely created tags that are not associated with a structured ontology, are not naturally suited for collaborative processes, due to linguistic and grammatical variations, as well as human typing errors. Additionally, tags reflect the personal views of the world of individual users, and are not normalised for synonymy, morphology or any other mapping. Our view is that the conventional approach provides very limited semantic value for collaboration. Moreover, in cases where there is some semantic value, automatically sharing semantics via computer manipulations is extremely problematic. This paper explores these problems by discussing approaches for collaborative tagging activities at a semantic level, and presenting conceptual models for collaborative tagging activities and folksonomies. We present criteria for the comparison of existing tag ontologies and discuss their strengths and weaknesses in relation to these criteria.
The growing predominance of social semantics in the form of tagging presents the metadata community with both opportunities and challenges for leveraging this new form of information content representation for retrieval. One key challenge is the absence of contextual information associated with these tags. This paper presents an experiment working with Flickr tags as an example of utilizing social semantics sources for enriching subject metadata. The procedure included four steps: 1) collecting a sample of Flickr tags, 2) calculating co-occurrences between tags through mutual information, 3) tracing contextual information of tag pairs via Google search results, 4) applying natural language processing and machine learning techniques to extract semantic relations between tags. The experiment helped us to build a context sentence collection from the Google search results, which was then processed by natural language processing and machine learning algorithms. This new approach achieved a reasonably good rate of accuracy in assigning semantic relations to tag pairs. This paper also explores the implications of this approach for using social semantics to enrich subject metadata.
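Step 2 of the procedure, measuring tag association via mutual information, can be sketched as follows; the sample photo tag sets and the pointwise form of the measure are assumptions for illustration, not the paper's actual data or formula:

```python
import math
from collections import Counter
from itertools import combinations

# Hypothetical sample of per-photo tag sets (stand-ins for Flickr data).
photos = [
    {"beach", "sunset", "ocean"},
    {"beach", "ocean", "sand"},
    {"sunset", "sky"},
    {"beach", "sunset"},
    {"ocean", "sand"},
]

def pointwise_mutual_information(photos):
    """Return PMI for every tag pair that co-occurs on at least one photo."""
    n = len(photos)
    tag_count = Counter(t for p in photos for t in p)
    pair_count = Counter(
        frozenset(c) for p in photos for c in combinations(sorted(p), 2)
    )
    pmi = {}
    for pair, c in pair_count.items():
        a, b = tuple(pair)
        # PMI = log( P(a,b) / (P(a) * P(b)) ), with counts over n photos.
        pmi[tuple(sorted(pair))] = math.log((c * n) / (tag_count[a] * tag_count[b]))
    return pmi
```

Pairs with high positive PMI (such as tags that nearly always appear together) would then be the candidates traced through search results in step 3.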
POS tagging is an important part of corpus building and a basic research task in the field of NLP. After comparing the advantages and weaknesses of rule-based and statistical methods, an automatic POS tagging method based on both CRF and TBL is presented. Tests show that the method can improve word-tagging accuracy.
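The TBL half of such a hybrid can be sketched as follows: a baseline tagger assigns each word its most frequent tag, and transformation rules then correct tags in context. The tiny lexicon and the single rule below are hypothetical illustrations, not the paper's model, and the CRF component is omitted:

```python
# Hypothetical lexicon mapping each word to its most frequent tag
# (the baseline tagger); unknown words default to NN.
LEXICON = {"to": "TO", "run": "NN", "the": "DT", "race": "NN", "fast": "RB"}

# TBL-style transformation rules: (from_tag, to_tag, required_previous_tag).
# Example rule: retag NN as VB when the previous tag is TO ("to run").
RULES = [("NN", "VB", "TO")]

def baseline_tag(words):
    """Initial-state tagging: most frequent tag per word."""
    return [LEXICON.get(w, "NN") for w in words]

def apply_rules(tags):
    """Apply each transformation rule left to right over the sequence."""
    out = list(tags)
    for frm, to, prev in RULES:
        for i in range(1, len(out)):
            if out[i] == frm and out[i - 1] == prev:
                out[i] = to
    return out
```

In full TBL the rules are not hand-written but learned greedily, each chosen to maximally reduce error against a gold-tagged corpus.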
This paper implements structured XML retrieval in the digital library retrieval system WHU-XML, using a method based on inverted files, with NEXI as the retrieval language. The retrieval algorithm and the parsing rules of the query language are also discussed in detail.
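A minimal sketch of an inverted file over XML elements, answering a NEXI-style about() predicate, might look like this; the sample document, element names, and conjunctive term semantics are assumptions for illustration, not the WHU-XML implementation:

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical two-section XML document used as sample input.
doc = """<article><sec id="s1">xml retrieval with inverted files</sec>
<sec id="s2">digital library systems</sec></article>"""

def build_index(xml_text):
    """Inverted file: term -> set of element ids containing that term."""
    root = ET.fromstring(xml_text)
    index = defaultdict(set)
    for el in root.iter("sec"):
        for term in el.text.lower().split():
            index[term].add(el.get("id"))
    return index

def about(index, terms):
    """Minimal about() over sec elements: intersect each term's postings,
    roughly what a NEXI query like //sec[about(., xml retrieval)] asks for."""
    sets = [index.get(t, set()) for t in terms]
    return set.intersection(*sets) if sets else set()
```

A production system would additionally store structural paths and term weights in the postings so that nested element queries can be scored, not just filtered.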
To address Web text with little structural information and much noise, sentences are viewed as nodes and the similarities between them as edges, so that a relationship map describes the relations among sentences. The topic sentences of a text can be obtained by finding the nodes with the most edges. Using a semantic dictionary, sentence similarity is defined as semantic similarity, which addresses the low word-frequency similarity of short texts. In a test on an Internet public campus site, 80.6% acceptability was achieved.
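The relationship-map idea can be sketched as follows; here plain word-overlap (Jaccard) similarity stands in for the paper's dictionary-based semantic similarity, and the edge threshold is an assumed parameter:

```python
from itertools import combinations

def jaccard(a, b):
    """Stand-in similarity: word overlap between two sentences.
    The paper instead uses semantic similarity via a semantic dictionary."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / len(wa | wb)

def topic_sentences(sentences, threshold=0.2, top_k=1):
    """Build the relationship map (nodes = sentences, edges = similarity
    above threshold) and return the sentences whose nodes have most edges."""
    degree = {i: 0 for i in range(len(sentences))}
    for i, j in combinations(range(len(sentences)), 2):
        if jaccard(sentences[i], sentences[j]) >= threshold:
            degree[i] += 1
            degree[j] += 1
    ranked = sorted(degree, key=degree.get, reverse=True)
    return [sentences[i] for i in ranked[:top_k]]
```

Swapping jaccard() for a dictionary-based semantic measure is what lets the approach connect short sentences that share meaning but few literal words.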
This article clarifies the five-step analytic process of the Web information chain and elaborates three principles of information chain reconstruction: multi-stage, multi-way information use; productivity; and incentive compatibility. Drawing on the design method of the ecological food chain, it introduces four reconstruction methods: the benefit-increasing ring, the waste-decreasing ring, the information-recycling ring, and the composite-function ring.
This thesis first analyzes the interaction characteristics of network advertisement systems, Human-Computer Interaction (HCI) evaluation theory, and the feasibility of applying HCI theory to the evaluation of network advertisement interfaces. It then remodels traditional usability engineering using advertising-effectiveness evaluation theory, and proposes a model to explain the relationship between network advertisements’ interfaces and their effectiveness. Finally, a series of experiments is designed to verify the assumptions of the model.
According to the requirements of online public opinion analysis, this paper builds an online public opinion hotspot detection and analysis system based on document clustering. It builds a vector space model by extracting document features from sample Web pages, and obtains hot-spot clusters with the OPTICS algorithm. According to the vectors of the hot-spot clusters, the Web pages are clustered a second time. Finally, the system derives the temporal evolution pattern of public opinion to provide decision support for specific fields, improving the quality of page correlation and analyzing public opinion more accurately.
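The clustering step can be sketched as follows; for a self-contained example, a simplified DBSCAN-style density clustering stands in for OPTICS (both are density-based, but OPTICS produces an ordering rather than fixed clusters), over term-frequency vectors with cosine similarity. The sample pages and all parameters are assumptions:

```python
import math
from collections import Counter, deque

def tf_vector(doc):
    """Term-frequency vector for one page (the vector space model)."""
    return Counter(doc.lower().split())

def cosine(u, v):
    dot = sum(u[t] * v[t] for t in set(u) & set(v))
    nu = math.sqrt(sum(c * c for c in u.values()))
    nv = math.sqrt(sum(c * c for c in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def density_cluster(docs, eps=0.3, min_pts=2):
    """Simplified DBSCAN-style stand-in for OPTICS: grow a cluster from any
    page with enough similar neighbours; isolated pages are noise (-1)."""
    vecs = [tf_vector(d) for d in docs]
    n = len(docs)
    labels = [None] * n
    cid = 0
    for i in range(n):
        if labels[i] is not None:
            continue
        neigh = [j for j in range(n) if j != i and cosine(vecs[i], vecs[j]) >= eps]
        if len(neigh) + 1 < min_pts:
            labels[i] = -1  # noise
            continue
        labels[i] = cid
        queue = deque(neigh)
        while queue:
            j = queue.popleft()
            if labels[j] in (None, -1):
                labels[j] = cid
                queue.extend(
                    k for k in range(n)
                    if labels[k] is None and cosine(vecs[j], vecs[k]) >= eps
                )
        cid += 1
    return labels
```

The dense clusters found here correspond to hot topics; the second-pass clustering in the paper then reassigns pages against each hot-spot cluster's vector.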
In order to integrate resources between subject information gateways, the authors put forward a method that uses OAI-PMH for data exchange, with OAI_DC as the intermediate metadata. The paper first introduces OAI and OAI-PMH, and then contrasts OAI_DC with the metadata of MediaWiki. Finally, it analyzes the transformation between the two kinds of metadata and demonstrates the results of the transformation.
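A transformation in the direction the paper analyzes might be sketched as follows; the sample OAI_DC record, the template name, and the field mapping to MediaWiki template parameters are hypothetical illustrations, not the authors' actual mapping:

```python
import xml.etree.ElementTree as ET

DC = "http://purl.org/dc/elements/1.1/"

# Hypothetical OAI_DC record, as it might arrive in an OAI-PMH response.
oai_dc = """<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                       xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:title>Subject Gateways Survey</dc:title>
  <dc:creator>Li, Wei</dc:creator>
  <dc:subject>information gateways</dc:subject>
  <dc:date>2008</dc:date>
</oai_dc:dc>"""

# Assumed DC-element -> MediaWiki-template-parameter mapping.
FIELD_MAP = {"title": "title", "creator": "author", "subject": "keywords", "date": "year"}

def dc_to_mediawiki(xml_text, template="Cite resource"):
    """Render an OAI_DC record as a MediaWiki template call."""
    root = ET.fromstring(xml_text)
    parts = [f"{{{{{template}"]
    for dc_name, wiki_name in FIELD_MAP.items():
        for el in root.findall(f"{{{DC}}}{dc_name}"):
            parts.append(f"| {wiki_name} = {el.text}")
    parts.append("}}")
    return "\n".join(parts)
```

The reverse direction is the harder one the paper must analyze, since free-form wiki parameters do not always map cleanly onto the fifteen DC elements.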
Based on the current state of Library 2.0 research, which is intensively studied but lacking in substantial products, this paper defines the concept of a Library 2.0 product and then designs a new-style, user-oriented OPAC system. It analyzes the design philosophy and the modular architecture individually, according to the four characteristics of the system design (integration, humanization, individuality and openness). Meanwhile, it also presents a practical solution and a related algorithm example.
This paper describes the architecture design and implementation of the English website of Tsinghua University Library. The library adopts the TPI WCCM Web content collaboration management system to develop the back-end management of its English website, and realizes remote cross-platform publishing through FTP and reverse-proxy technology. The stability, security and reliability of the system are ensured, and redundant backup is provided.