Semantic linking of multi-genre, multi-type, autonomous, and distributed data has always been central to knowledge organization and discovery, and the concept of linked data provides a lightweight, incremental, scalable, and extensible mechanism for it. Based on a systematic review of the field, the paper describes the four rules and the basic technical framework of linked data, and presents the key techniques that enable linked data publishing, link parsing and browsing, linked data search engines, and link update and maintenance. A few typical applications are given, and key challenges in applying linked data in a practical domain are also explored.
An information service system that practices user-driven services has three parts: the user model, information technologies, and service design. The article reviews the literature on personalized interaction design, then discusses using personas and living laboratories to achieve the design of a user-driven personalized digital library system.
This article reviews and summarizes the literature on applied research in Formal Concept Analysis (FCA) and concept lattice theory abroad. It also analyzes frontier developments and research hotspots in the four most representative and influential domains, namely ontology research, software engineering, knowledge discovery, and Semantic Web retrieval. In addition, it offers an outlook on future research.
This paper analyzes researchers' needs for cross-domain research results and the weaknesses of current data mashup models, and designs a data mashup model. It then develops an experimental system to show how the data mashup model supports the design and development of a cross-domain search system.
This paper proposes iLibrary, a new presentation-layer mashup platform for researchers and research groups. It collects multiple resources and multiple types of widget services, constructs a library of scientific research widget resources and a mechanism for dynamic service production and consumption, builds a dashboard and portal system for providing services, and supports community and widget service resource management. The application model, design ideas, and implementation method of the platform are also presented.
Starting from an introduction to the basic collaborative filtering algorithm, the paper summarizes six kinds of techniques used to ameliorate the scalability problem: clustering, probabilistic approaches, dimensionality reduction, item-based methods, dataset reduction, and linear models. Collaborative filtering algorithms that use these techniques are reviewed in detail, and their ideas are summarized in two points: reducing the neighborhood search space without degrading recommendation quality, and periodically running user similarity computation and neighborhood search offline to reduce online recommendation computation. Finally, two future research directions on the scalability problem in collaborative filtering are discussed, namely collaborative filtering algorithms based on a distributed structure and neighborhood search based on formal concept analysis.
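The offline/online split mentioned in this abstract can be illustrated with a minimal item-based sketch. It is not the surveyed algorithms themselves: the cosine similarity, the function names `item_similarities` and `predict`, and the `{user: {item: rating}}` layout are all assumptions chosen for illustration.

```python
from collections import defaultdict
from math import sqrt


def item_similarities(ratings):
    """Offline step: cosine similarity between every pair of co-rated items.

    ratings is assumed to be {user: {item: rating}}.
    """
    dot = defaultdict(float)   # sum of rating products per item pair
    norm = defaultdict(float)  # sum of squared ratings per item
    for user_ratings in ratings.values():
        items = list(user_ratings.items())
        for item, r in items:
            norm[item] += r * r
        for i, (a, ra) in enumerate(items):
            for b, rb in items[i + 1:]:
                dot[(a, b)] += ra * rb
                dot[(b, a)] += ra * rb
    sims = {}
    for (a, b), d in dot.items():
        denom = sqrt(norm[a]) * sqrt(norm[b])
        if denom:  # skip items whose ratings are all zero
            sims[(a, b)] = d / denom
    return sims


def predict(ratings, sims, user, target):
    """Online step: similarity-weighted average of the user's own ratings."""
    num = den = 0.0
    for item, r in ratings[user].items():
        s = sims.get((item, target), 0.0)
        num += s * r
        den += abs(s)
    return num / den if den else None


# Illustrative data only.
ratings = {"u1": {"a": 5, "b": 5}, "u2": {"a": 1, "b": 1}, "u3": {"a": 4}}
sims = item_similarities(ratings)      # recomputed periodically, offline
print(predict(ratings, sims, "u3", "b"))
```

Because the similarity table is built ahead of time, the online step only touches the items the active user has already rated, which is the kind of saving the abstract attributes to offline neighborhood search.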
This article analyzes the feasibility of detecting latent burst words by tracking the evolution of word energy, and proposes a method based on word energy and energy evolution trends. First, it describes the life cycle and evolution process of words. Then, based on an analysis of energy accumulation and decay and of the energy change trend, the article presents the rationale for the model and establishes the EneTr model to detect latent burst words. In addition, it proposes corresponding solutions to the key problems of EneTr and implements the algorithm. Finally, the model is validated experimentally on two different document streams, Web news and scientific literature.
At present, most anonymization-based privacy-preserving techniques suffer from high information loss and low usability, mainly because they rely on pre-defined generalization hierarchies or a total order imposed on each attribute domain. By defining distance and cost functions, the paper presents an l-diverse anonymous privacy-preserving model based on a clustering algorithm. Experimental results show that the method improves the usability of the released data while reducing information loss.
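The two properties this abstract weighs against each other, l-diversity and information loss, can be made concrete with a small check. The abstract does not give the paper's distance or cost function, so the range-based loss below is a stand-in, and the names `is_l_diverse` and `range_loss` and the record layout are hypothetical.

```python
from collections import defaultdict


def is_l_diverse(records, cluster_of, sensitive, l):
    """True if every cluster holds at least l distinct sensitive values."""
    values = defaultdict(set)
    for rec, cid in zip(records, cluster_of):
        values[cid].add(rec[sensitive])
    return all(len(v) >= l for v in values.values())


def range_loss(records, cluster_of, attr):
    """Assumed loss measure: size-weighted average of each cluster's value
    range for a numeric attribute, normalized by the global range.
    0 means no generalization; 1 means full-domain generalization."""
    all_values = [rec[attr] for rec in records]
    span = max(all_values) - min(all_values)
    if span == 0:
        return 0.0
    groups = defaultdict(list)
    for rec, cid in zip(records, cluster_of):
        groups[cid].append(rec[attr])
    total = sum(len(v) * (max(v) - min(v)) for v in groups.values())
    return total / (len(records) * span)


# Illustrative records and cluster assignment only.
records = [
    {"age": 20, "disease": "flu"},
    {"age": 22, "disease": "cold"},
    {"age": 40, "disease": "flu"},
    {"age": 50, "disease": "asthma"},
]
clusters = [0, 0, 1, 1]
print(is_l_diverse(records, clusters, "disease", 2))
print(range_loss(records, clusters, "age"))
```

A clustering-based anonymizer would search for a cluster assignment that keeps `is_l_diverse` true while driving a loss of this kind as low as possible.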
Based on co-occurrence analysis and the characteristics of the Chinese Medical Subject Headings (CMeSH) and the Chinese BioMedical Literature Database (CBM), the paper proposes a new method for constructing a Chinese medical concept space, focusing on the algorithm for calculating the relevance of co-occurring CMeSH terms. The test results show that the method is effective and can, to some extent, reduce the cognitive burden of CMeSH on non-professional users.
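As a rough illustration of co-occurrence relevance between indexing terms, the sketch below scores term pairs by how often they are assigned to the same record. The abstract does not state the paper's relevance formula, so the Jaccard coefficient is used here as a stand-in; the function names and the sample terms are hypothetical and are not taken from CMeSH or CBM.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_relevance(docs):
    """Jaccard relevance for every term pair that co-occurs at least once.

    docs is an iterable of term collections, one per indexed record.
    Keys of the result are (a, b) tuples with a < b.
    """
    df = Counter()  # number of records containing each term
    co = Counter()  # number of records containing both terms of a pair
    for terms in docs:
        terms = set(terms)
        df.update(terms)
        co.update(combinations(sorted(terms), 2))
    return {(a, b): n / (df[a] + df[b] - n) for (a, b), n in co.items()}


def related_terms(relevance, term, top_n=5):
    """Terms most strongly associated with `term`, best first."""
    scored = []
    for (a, b), score in relevance.items():
        if a == term:
            scored.append((b, score))
        elif b == term:
            scored.append((a, score))
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))[:top_n]


# Illustrative records only.
docs = [
    {"Hypertension", "Stroke"},
    {"Hypertension", "Stroke", "Diabetes Mellitus"},
    {"Diabetes Mellitus", "Insulin"},
]
relevance = cooccurrence_relevance(docs)
print(related_terms(relevance, "Hypertension"))
```

A ranked list of this kind is one way a concept space could suggest related headings to users who do not know the controlled vocabulary.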
This paper addresses the task of Chinese personal name disambiguation with a hierarchical clustering algorithm, and identifies several effective features for the task through experiments. The authors use term frequency (TF) to calculate feature weights, and obtain better results after applying hand-crafted rules for extracting personal names from documents. Finally, an average F-value (α=0.5) of 88.15% is achieved on a test corpus containing 191 ambiguous names.
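The abstract reports an F-value with α=0.5 but does not define it. A common parameterization, assumed here, is F = 1 / (α/P + (1-α)/R), which reduces to the harmonic mean of precision P and recall R when α=0.5. The function name and the sample values below are illustrative and are not the paper's figures.

```python
def f_value(precision, recall, alpha=0.5):
    """Weighted harmonic mean of precision and recall (assumed definition)."""
    if precision <= 0 or recall <= 0:
        return 0.0
    return 1.0 / (alpha / precision + (1.0 - alpha) / recall)


# Illustrative precision and recall, not results from the paper.
print(f_value(0.9, 0.8))
```

Under this definition, α above 0.5 weights precision more heavily and α below 0.5 weights recall more heavily.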
This paper introduces the application of OAI-ORE in institutional repositories and its implementation on the DSpace platform. First, it introduces OAI-ORE itself, its data model, and its applications at home and abroad, and analyzes the significance of OAI-ORE for institutional repositories. Second, the dissemination and harvesting of ORE Resource Maps and their resources are implemented. Finally, the authors look ahead to future work on OAI-ORE and its influence on institutional repositories.
This paper describes Beijing Normal University Library’s practice of integrating self-developed digital resources into its OPAC system by developing integration interfaces. Based on this integration, the new service system gives readers “one-stop” access to full texts.
This paper analyzes the novelty search service information management systems and platforms of major Chinese libraries and organizations, and comprehensively describes the design concept, system architecture, and service functions of the CAS Novelty Search Service Platform. It highlights the platform's features, including integrated services, a reasonable workflow design, and user-friendliness. Finally, the authors offer some suggestions for improving the platform's functions.
To meet the need for full-text retrieval in document management, this paper combines open source libraries such as Lucene.net and ICTCLAS to build a document parser and store the parsed document content in a database. A Chinese analyzer is then built to index the document records, and full-text retrieval with access control is realized by filtering the retrieval results against the document control information.
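The general pattern described in this abstract, index first and then filter hits by permission, can be sketched without any external library. This is not the Lucene.net or ICTCLAS API: the `DocumentIndex` class, its role-based permission model, and the regex tokenizer (standing in for Chinese word segmentation) are all assumptions made for illustration.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Stand-in tokenizer; a real system would use a Chinese segmenter."""
    return re.findall(r"\w+", text.lower())


class DocumentIndex:
    """Tiny inverted index whose search results are filtered by role."""

    def __init__(self):
        self.postings = defaultdict(set)  # token -> ids of documents containing it
        self.allowed = {}                 # doc id -> roles permitted to read it

    def add(self, doc_id, text, allowed_roles):
        self.allowed[doc_id] = set(allowed_roles)
        for token in tokenize(text):
            self.postings[token].add(doc_id)

    def search(self, query, role):
        tokens = tokenize(query)
        if not tokens:
            return []
        # Documents must contain every query token (AND semantics).
        hits = set.intersection(*(self.postings.get(t, set()) for t in tokens))
        # Drop documents the caller's role may not read.
        return sorted(d for d in hits if role in self.allowed[d])


# Illustrative documents and roles only.
index = DocumentIndex()
index.add(1, "annual budget report", {"finance"})
index.add(2, "annual library report", {"finance", "staff"})
print(index.search("annual report", "staff"))
```

Filtering after retrieval keeps the index itself permission-agnostic, so a change to a document's control information does not require reindexing its text.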
The paper applies tagging, one of the most important Web 2.0 technologies, to libraries. In view of existing problems in discipline construction at universities, the paper proposes a tag-based discipline navigation system and describes it comprehensively in terms of design ideas, system functions, and technical implementation. The system enables the co-construction and sharing of discipline resources and expands the range of tag applications in libraries.