The paper briefly introduces the encoding rules of ISO 15511:2003 (ISIL) and the library ID data elements of ISO/FDIS 28560. It discusses the structure of the Chinese version of the ISIL, then proposes suggestions for the registration system and for the application of ISIL compaction encoding in the ISO/DIS 28560 data elements.
Based on concept lattice theory, this paper sets up a flexible rule-mining mechanism through formal concept analysis, and conducts a detailed market segmentation of digital library usage according to the association rules extracted with this mechanism, so as to better meet digital library users' individual demands.
This paper presents a concept lattice based method for mining sequential patterns from users' retrieval behaviors in digital libraries. The method discovers sequential patterns by combining a top-down strategy with divide-and-conquer on the concept lattice, exploiting the reusability of the lattice and its advantage in extracting frequent itemsets. It does not require a full scan of the original user-information database, which greatly reduces mining time and helps digital libraries speed up user retrieval and improve personalized services.
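As an illustration of why a concept lattice is convenient for frequent itemset extraction, here is a minimal sketch (not the paper's implementation) that enumerates the formal concepts of a toy context of user sessions and retrieved topics; the session and topic names are invented. The closed intents it finds are exactly the closed itemsets a lattice-based miner works with:

```python
from itertools import combinations

# Toy formal context: user sessions (objects) x retrieved topics (attributes).
context = {
    "s1": {"AI", "DB"},
    "s2": {"AI", "DB", "IR"},
    "s3": {"AI", "IR"},
}
attributes = {"AI", "DB", "IR"}

def extent(attrs):
    """Objects possessing every attribute in attrs (the prime operator)."""
    return {o for o, a in context.items() if attrs <= a}

def intent(objs):
    """Attributes shared by all objects in objs."""
    if not objs:
        return set(attributes)
    return set.intersection(*(context[o] for o in objs))

# Naive concept enumeration: the closure intent(extent(X)) of every
# attribute subset X yields the closed itemsets (concept intents).
concepts = set()
for r in range(len(attributes) + 1):
    for combo in combinations(sorted(attributes), r):
        concepts.add(frozenset(intent(extent(set(combo)))))

for c in sorted(concepts, key=len):
    print(sorted(c), "-> support", len(extent(set(c))))
```

Because every non-closed itemset has the same support as its closure, mining only these concept intents avoids rescanning the raw session database, which is the efficiency argument the paper makes.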
This paper proposes the design of a Nutch-based, domain-specific website harvest and service system under the framework of digital library systems integration. It introduces an information filtering module, a dictionary-based Chinese analyzer module, a GUI information module, a topic-knowledge-based information processing module, and Web-service-based search service modules to improve the function and performance of the system. It focuses on text-parsing filters, plugin development, and the application of automatic hierarchical clustering to search results. Finally, integration with the other subsystems of the digital library is realized through a Web service interface, which can provide comprehensive and professional services.
This paper focuses on tags in Web collaborative tagging and surveys tag-meaning disambiguation methods, which are classified into five types: data mining methods, statistical methods, knowledge-organization-tool methods, control-mechanism methods, and visualization-component methods. The five types are compared in five aspects: user participation, disambiguation occasion, disambiguation property, experiments and applications, and development prospects.
Based on the principles and publishing methods of linked data, this article introduces and analyses several technical aspects of DBpedia. DBpedia extracts structured data from Wikipedia articles and expresses it in RDF by parsing the WikiText syntax under workflow control. It also provides Web data in several ways, such as URI dereferencing, SPARQL-based querying, and RDF dumps. Finally, automatic interlinking methods based on schema or property matching are used to create links with a large number of other datasets.
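To make the RDF-dump access path concrete, the following is a minimal sketch of parsing one simplified N-Triples line of the kind found in DBpedia dumps; it is not DBpedia's actual code, and the regex ignores the full escaping rules of the real N-Triples format:

```python
import re

# One simplified N-Triples statement: <subject> <predicate> object .
line = ('<http://dbpedia.org/resource/Berlin> '
        '<http://www.w3.org/2000/01/rdf-schema#label> "Berlin"@en .')

# Two IRIs in angle brackets, then the object term, then the closing dot.
pattern = re.compile(r'<([^>]+)>\s+<([^>]+)>\s+(.+?)\s*\.\s*$')
m = pattern.match(line)
subject, predicate, obj = m.group(1), m.group(2), m.group(3)
print(subject)    # http://dbpedia.org/resource/Berlin
print(predicate)  # http://www.w3.org/2000/01/rdf-schema#label
print(obj)        # "Berlin"@en
```

The same triples are what a SPARQL endpoint queries and what URI dereferencing returns as RDF, so the three access paths the article lists all expose one underlying dataset.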
To address the problem of combining Ontology and Formal Concept Analysis (FCA) in knowledge modeling, the similarities and differences between Ontology and FCA are first compared. The conditions for their combination are then analyzed from the perspectives of philosophy, algebraic structure, knowledge processing, and knowledge management, and the mechanism for combining Ontology and FCA in the knowledge modeling process is defined. Finally, the paper concludes that Ontology and FCA can be combined in eight specific aspects of knowledge modeling, which may offer a broad view of their integration in this field.
Because user-generated tags in mass collaborative tagging are assigned freely and irregularly, which causes confusion, this paper introduces the Probabilistic Latent Semantic Analysis (PLSA) algorithm for latent semantic analysis, obtains the tag set of a specific resource under each topic, and thereby provides an effective approach to organizing network information and to user access. Using annotation data collected from the Delicious site, the paper shows that the PLSA approach achieves good results in identifying the topics of particular resources.
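A minimal pure-Python sketch of the PLSA EM iteration on a toy tag-resource count matrix may clarify the approach; the resources, tags, and counts are invented for illustration, not taken from the paper's Delicious corpus:

```python
import random

# Toy counts: n[d][w] = times tag w was applied to resource d.
docs = ["res1", "res2"]
tags = ["python", "code", "recipe", "food"]
n = [[4, 3, 0, 0],   # res1: a programming resource
     [0, 0, 5, 2]]   # res2: a cooking resource
K = 2                # number of latent topics

random.seed(0)

def normalise(v):
    s = sum(v)
    return [x / s for x in v]

# Parameters: P(z), P(d|z), P(w|z), randomly initialised.
pz = [1.0 / K] * K
pd_z = [normalise([random.random() for _ in docs]) for _ in range(K)]
pw_z = [normalise([random.random() for _ in tags]) for _ in range(K)]

for _ in range(200):  # EM iterations
    # E-step: posterior P(z | d, w) for every (resource, tag) pair.
    post = [[None] * len(tags) for _ in docs]
    for d in range(len(docs)):
        for w in range(len(tags)):
            joint = [pz[z] * pd_z[z][d] * pw_z[z][w] for z in range(K)]
            total = sum(joint) or 1.0
            post[d][w] = [j / total for j in joint]
    # M-step: re-estimate parameters from expected counts.
    for z in range(K):
        pd_z[z] = normalise([sum(n[d][w] * post[d][w][z] for w in range(len(tags)))
                             for d in range(len(docs))])
        pw_z[z] = normalise([sum(n[d][w] * post[d][w][z] for d in range(len(docs)))
                             for w in range(len(tags))])
    pz = normalise([sum(n[d][w] * post[d][w][z]
                        for d in range(len(docs)) for w in range(len(tags)))
                    for z in range(K)])

# The topic that dominates res1 should concentrate on the programming tags.
z1 = max(range(K), key=lambda z: pd_z[z][0])
top_tag = tags[max(range(len(tags)), key=lambda w: pw_z[z1][w])]
print(top_tag)
```

Ranking tags by P(w|z) for each topic is what yields "the tag set of a specific resource under each topic" described in the abstract.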
This paper reviews recent advances in Ajax data collection in five aspects: judgment of Ajax link elements, page state identification, controllable page state transformation, content extraction, and duplicate state detection. The overall processing flow and the relevant supporting technologies are summarized, and new research trends are discussed. This study should help promote further research on Ajax data collection.
This paper briefly introduces the concept and origin of Technology Readiness Levels (TRLs) and distinguishes TRLs from related concepts such as the technology life cycle. Emphasis is then placed on the widely adopted NASA TRL system: its characteristics, research framework, applied value, assessment tools, and the limitations of its application are described. Finally, two methods, latent semantic indexing and co-word analysis, are carefully examined for the feasibility of identifying TRLs.
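Of the two methods, co-word analysis is straightforward to sketch: count how often keyword pairs co-occur within document records. The records below are invented stand-ins for TRL-related bibliographic entries, not the paper's data:

```python
from itertools import combinations
from collections import Counter

# Hypothetical keyword lists of three document records.
records = [
    ["prototype", "testing", "flight"],
    ["prototype", "testing"],
    ["simulation", "model"],
]

# Co-word matrix as a Counter over sorted keyword pairs.
cooc = Counter()
for kws in records:
    for a, b in combinations(sorted(set(kws)), 2):
        cooc[(a, b)] += 1

for pair, count in cooc.most_common():
    print(pair, count)
```

Clusters of strongly co-occurring terms are then inspected for vocabulary characteristic of particular readiness levels, which is the feasibility question the paper examines.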
To address the limited ability of handheld digital devices to display literature information effectively, this paper designs a system that uses abbreviations to shorten texts while keeping them comfortable to read. Finally, it compares the computation time and compression rate for articles in different fields.
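The core idea of dictionary-based shortening, and the compression rate it is evaluated by, can be sketched as follows; the abbreviation table and sample sentence are hypothetical, not the paper's:

```python
# Hypothetical abbreviation dictionary; a real system would use a
# field-specific table tuned for readability.
abbrev = {"information": "info.", "technology": "tech.", "retrieval": "retr."}

def shorten(text):
    """Replace each known word with its abbreviation."""
    return " ".join(abbrev.get(w.lower(), w) for w in text.split())

original = "information retrieval technology for information systems"
short = shorten(original)
rate = 1 - len(short) / len(original)   # fraction of characters saved
print(short)
print(f"compression rate: {rate:.0%}")
```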
This paper analyzes the current distribution of biomedical journals in the Western Pacific Region, develops a strategy for information collection, and indexes the collected information using a concept-based method. Based on MeSH, it also designs an extended-query algorithm with the vector space model, and applies the algorithm in the design and implementation of WPRIM. Finally, experimental results show that these key algorithms can improve the recall of the system.
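The combination of thesaurus-based query expansion and vector space ranking can be sketched as follows; the expansion table and documents are made up for illustration, and the paper's actual MeSH-based algorithm is certainly richer:

```python
import math

# Hypothetical MeSH-style synonym table used to expand query terms.
mesh_expansion = {"cancer": ["neoplasms", "tumor"]}

docs = {
    "d1": "neoplasms of the lung",
    "d2": "influenza vaccine trial",
}

def vectorise(text):
    """Bag-of-words term-frequency vector."""
    v = {}
    for term in text.lower().split():
        v[term] = v.get(term, 0) + 1
    return v

def cosine(a, b):
    dot = sum(a[t] * b.get(t, 0) for t in a)
    na = math.sqrt(sum(x * x for x in a.values()))
    nb = math.sqrt(sum(x * x for x in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def expand(query):
    """Append thesaurus synonyms of each original query term."""
    terms = query.lower().split()
    for t in list(terms):
        terms.extend(mesh_expansion.get(t, []))
    return " ".join(terms)

q = "cancer"
scores = {d: cosine(vectorise(expand(q)), vectorise(text))
          for d, text in docs.items()}
best = max(scores, key=scores.get)
print(best)  # d1 matches via the expanded term "neoplasms"
```

Without expansion the query "cancer" would match neither document; with it, d1 is retrieved, which is how expansion raises recall.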
This paper first introduces Web-Harvest, an open-source tool for information extraction, in detail. With functional expansion and improvement, a Web information extraction system based on Web-Harvest is designed. The paper focuses on the system design ideas and workflow, briefly describes the design of the database tables, and finally introduces an application of the system.
This paper first studies a model for repairing digital textual material. To accurately extract each single character from English textual material, a segmentation method based on projection and character connectivity is presented. Experimental results show that the method is effective and practical, and that its extensibility allows it to be applied to extracting Chinese characters as well.
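The projection part of such a method can be illustrated with a toy binary image (hand-made for this sketch, not the paper's data): summing the ink pixels per column gives a vertical projection profile, and runs of zero columns mark the cuts between characters.

```python
# Toy binary image, 1 = ink pixel. Blank columns separate the characters.
image = [
    [1, 1, 0, 0, 1, 0, 0, 1, 1],
    [1, 0, 0, 0, 1, 0, 0, 0, 1],
    [1, 1, 0, 0, 1, 0, 0, 1, 1],
]

# Vertical projection: ink count in each column.
profile = [sum(row[c] for row in image) for c in range(len(image[0]))]

# Split into character segments at zero columns.
segments, start = [], None
for c, ink in enumerate(profile):
    if ink and start is None:
        start = c                       # a character's left edge
    elif not ink and start is not None:
        segments.append((start, c - 1)) # its right edge
        start = None
if start is not None:
    segments.append((start, len(profile) - 1))

print(segments)  # three character spans: [(0, 1), (4, 4), (7, 8)]
```

A connectivity analysis (as the paper adds) would then handle touching characters that pure projection cannot split.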
This paper introduces the general process and methods for developing library toolbars with the Conduit platform, which bring library services and resources to the client side by integrating with users' existing browsing habits and information retrieval workflows. The toolbars make library services and resources more accessible to users and also allow libraries to promote their services.
The article introduces the design and implementation of the monitoring system, with emphasis on the design ideas and framework structure. The implementation of monitoring information collection, information analysis, and real-time alarms using C/C++ and AIX operating system management technologies is also introduced.