First, this article adopts the method of Formal Concept Analysis (FCA) and studies the process of knowledge integration modeling. Second, it proposes a knowledge integration model based on FCA and gives an example to test the modeling process. Finally, the conclusion shows that introducing FCA into knowledge integration modeling may produce innovative knowledge concepts and improve the recall ratio of network resources; it may also integrate heterogeneous concepts and remove homographs, thereby improving the precision ratio of network resources.
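The core of FCA is deriving formal concepts (maximal object–attribute pairs) from a context. A minimal sketch in Python, using a hypothetical document/term context invented purely for illustration:

```python
from itertools import combinations

# Toy formal context: objects (documents) x attributes (index terms).
# The data is hypothetical, chosen only to illustrate FCA.
context = {
    "doc1": {"knowledge", "integration"},
    "doc2": {"knowledge", "ontology"},
    "doc3": {"integration", "ontology"},
}
attributes = set().union(*context.values())

def intent(objs):
    """Attributes shared by every object in `objs`."""
    return set.intersection(*(context[o] for o in objs)) if objs else set(attributes)

def extent(attrs):
    """Objects possessing all attributes in `attrs`."""
    return {o for o, a in context.items() if attrs <= a}

# Brute-force concept enumeration: a pair (A, B) is a formal concept
# when B = intent(A) and A = extent(B), i.e. the closure is stable.
concepts = set()
objs = list(context)
for r in range(len(objs) + 1):
    for combo in combinations(objs, r):
        b = intent(set(combo))
        a = extent(b)
        concepts.add((frozenset(a), frozenset(b)))

for a, b in sorted(concepts, key=lambda c: len(c[0])):
    print(sorted(a), sorted(b))
```

Ordering the resulting concepts by extent inclusion yields the concept lattice on which the integration model operates.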
Taking the transformation from information services to knowledge services in the Web 2.0 era as its background, this paper analyzes the Semantic Wiki in detail in combination with Semantic Web technologies. The characteristics of the Semantic Wiki are examined in depth, along with its annotation, navigation, and retrieval mechanisms. Finally, representative Semantic Wiki projects are compared and analyzed.
This article counts and analyzes in detail the internal and external linguistic features of Coordination with Overt Conjunctions (COC). It mainly investigates the internal linguistic features, including the distribution of Part-Of-Speech (POS) tags and phrase sequences, as well as the external linguistic features, including the distribution of syntactic functions and the features of boundary words. The statistical data both provide linguistic knowledge for identifying COC and serve as accurate evidence for further investigation of COC.
This paper analyzes and summarizes four types of digital library resources: commercial database resources, Web resources, locally built resources, and OAI metadata harvesting resources. The authors integrate these resources in different ways, and many services can be supplied on this basis. The paper focuses on the analysis and realization of a unified search service, and introduces its implementation mechanism and workflow. The unified search service achieves a good user experience by searching different resources through a data request module, extracting data with a data analyzer module, and using Ajax to display results to users dynamically.
This paper reviews research on XML retrieval from the perspective of the information retrieval process. First, it separately reviews four parts: XML query languages, XML indexing, XML retrieval ranking approaches, and the evaluation of XML retrieval. Then a number of XML retrieval hotspots are introduced. Finally, it summarizes and briefly describes some issues that need further study.
To address the lack of integration and analytic functions for spatial data in current E-government systems, this article proposes an effective method for transforming spatial data and integrates digital maps, non-spatial attributes, spatial predicates, and transaction data in E-government. A multi-dimensional model for integrating spatial data is then proposed to realize the transformation of spatial data, which enriches the analytic functions of E-government.
To address the problem that most information institutions can only provide search services for literature rather than for tables, this paper proposes a table retrieval algorithm based on the Vector Space Model (VSM). The discussion covers table feature extraction, term weight setting, and search result ranking, providing a theoretical basis for future table retrieval services.
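A VSM retrieval scheme of this kind can be sketched briefly: each table is reduced to a bag of terms (from its caption, headers, and cells), terms are weighted by TF-IDF, and results are ranked by cosine similarity to the query. The mini-corpus and weighting details below are illustrative assumptions, not the paper's actual algorithm:

```python
import math
from collections import Counter

# Hypothetical mini-corpus: each "table" reduced to a bag of terms
# drawn from its caption, headers and cells.
tables = {
    "t1": "gdp growth rate year province".split(),
    "t2": "library book circulation year".split(),
    "t3": "gdp per capita province".split(),
}

def tfidf_vector(terms, df, n_docs):
    """Weight each term by term frequency times inverse document frequency."""
    tf = Counter(terms)
    return {t: tf[t] * math.log(n_docs / df[t]) for t in tf}

def cosine(u, v):
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

# Document frequency of each term over the table collection.
df = Counter(t for terms in tables.values() for t in set(terms))
n = len(tables)
vectors = {tid: tfidf_vector(terms, df, n) for tid, terms in tables.items()}

def search(query):
    q = [t for t in query.split() if t in df]  # drop unseen terms
    qv = tfidf_vector(q, df, n)
    return sorted(((cosine(qv, v), tid) for tid, v in vectors.items()),
                  reverse=True)

print(search("gdp province"))
```

Here the query "gdp province" ranks t3 highest, since both query terms hit it and it contains fewer unrelated terms than t1.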
This paper describes the basic process and related technologies of natural language understanding, focusing on Glue Semantics and Discourse Representation Theory (DRT). The authors design and develop a semantic computing system based on Glue Semantics and DRT, and then introduce its design ideas, concrete implementation, key technologies, and remaining problems in detail.
To improve the legibility of search results in current meta-search engines, an intelligent meta-search engine framework and a results clustering method based on user behavior learning are described in detail. Using this framework and method, the system can collect user behavior information in real time for reasoning and learning, accumulate effective knowledge in a knowledge base for managing result clusters, and continually adapt and improve itself as users search. A prototype system shows that the method is feasible and effective.
This paper introduces a new method to extract administrative-domain ontology terms automatically. First, representative candidate terms are extracted through word segmentation and character merging. Second, the candidate terms are filtered with the C-value method and the TF-IDF algorithm to achieve automatic domain-specific term extraction for the administrative-domain ontology. Finally, experiments show that this method improves the accuracy of the extracted terms without hurting recall.
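The C-value measure scores a multiword candidate by its length and frequency, discounting occurrences that are only part of longer candidate terms. A minimal sketch of the standard C-value formula, over invented candidate terms and frequencies (the paper's real candidates come from Chinese segmentation and character merging):

```python
import math
from collections import Counter

# Hypothetical candidate multiword terms with corpus frequencies.
freq = Counter({
    ("administrative", "approval"): 40,
    ("administrative", "approval", "item"): 25,
    ("administrative", "approval", "process"): 10,
})

def c_value(term):
    """Standard C-value: log2(len) * (freq minus mean freq of containing terms)."""
    longer = [t for t in freq
              if len(t) > len(term)
              and any(t[i:i + len(term)] == term
                      for i in range(len(t) - len(term) + 1))]
    f = freq[term]
    if longer:  # nested term: discount occurrences inside longer candidates
        f -= sum(freq[t] for t in longer) / len(longer)
    return math.log2(len(term)) * f

for term in freq:
    print(term, round(c_value(term), 2))
```

In this toy data, "administrative approval" is heavily discounted because most of its occurrences are nested inside the two longer candidates; TF-IDF filtering would then be applied on top of such scores.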
Since hot topic detection is based on clustering, this paper introduces an improved ant colony clustering algorithm and proposes the concept of Class Attention Degree (CAD) to determine the hotness level of each class and to distinguish popular categories from unpopular ones. On this basis, the hot topic set is extracted. Experimental results show that the improved ant colony clustering algorithm is effective for hot topic detection.
First, the defects of mutual-information-based feature selection are analyzed theoretically, and an improved method is put forward. To address the problems of the vector space model, the authors use a class space model to represent text and take advantage of category information. In this way, the paper realizes a category-based text categorization algorithm, and results on Chinese text categorization show that this method achieves better precision.
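The baseline being improved on, mutual information between a term and a category, is commonly estimated from document counts as MI(t, c) = log(P(t, c) / (P(t)·P(c))). A minimal sketch over an invented labeled corpus (the data and helper names are illustrative only):

```python
import math

# Toy labeled corpus (hypothetical): (category, set of document terms).
docs = [
    ("sports", {"ball", "team", "win"}),
    ("sports", {"ball", "coach"}),
    ("finance", {"stock", "market", "win"}),
    ("finance", {"stock", "bank"}),
]
N = len(docs)

def mutual_information(term, category):
    """MI(t, c) = log(P(t, c) / (P(t) * P(c))), estimated from document counts."""
    n_t = sum(1 for _, terms in docs if term in terms)
    n_c = sum(1 for cat, _ in docs if cat == category)
    n_tc = sum(1 for cat, terms in docs if cat == category and term in terms)
    if n_tc == 0:
        return float("-inf")
    return math.log((n_tc / N) / ((n_t / N) * (n_c / N)))

print(mutual_information("ball", "sports"))   # term concentrated in one class
print(mutual_information("win", "sports"))    # term spread across both classes
```

The sketch also exposes a well-known defect of this estimate: because P(t) appears in the denominator, rare terms get inflated scores regardless of how reliably they indicate the class.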
The paper presents related standards for thesaurus development and focuses on analyzing the thesaurus data model of the ISO 25964 standard. Based on this data model, the paper designs a thesaurus development system model and implements its key functions. The system changes the traditional working mode of thesaurus development; it satisfies the requirements of the network environment and makes data processing, updating, and maintenance more convenient.
The paper first summarizes current applications of RSS technology in libraries. To address problems in building key-subject information resources in universities, the author proposes combining key-subject information in universities with library information push services. The paper then introduces an RSS-based push service system for key-subject information from the aspects of design methodology, system functions, and technical implementation. The system expands the forms of RSS application in libraries and improves the utilization rate of library resources.
The paper develops Chinese academic bibliometrics software based on open source software such as Lucene, IKAnalyzer, and Luke. It introduces the implementation framework of the software, the data preparation work, the key indexing code, and the SemicolonAnalyzer designed by the authors, and analyzes the different bibliometric results produced by the software. The goal of this practice is to lower coding complexity through open source software and to provide a feasible method for developing Chinese academic bibliometrics software.
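The SemicolonAnalyzer itself is a Lucene component written in Java; the idea behind it, however, is simple enough to sketch in a hypothetical Python analog. Bibliographic fields such as author or keyword lists are split on semicolons rather than segmented word by word, so that each author or keyword counts as one indexing unit (the sample records below are invented):

```python
from collections import Counter

def semicolon_tokens(field):
    """Tokenize a bibliographic field on semicolons, one author/keyword per token.

    Chinese bibliographic data often uses the full-width semicolon '；',
    so it is normalized to ';' before splitting.
    """
    return [tok.strip() for tok in field.replace("；", ";").split(";") if tok.strip()]

# Hypothetical author fields from two records.
records = [
    "Zhang San; Li Si",
    "Li Si；Wang Wu",   # full-width semicolon, as in real Chinese data
]

counts = Counter(tok for rec in records for tok in semicolon_tokens(rec))
print(counts.most_common())
```

Counting tokens produced this way gives per-author (or per-keyword) frequencies directly, which is the kind of bibliometric tally the software computes over a Lucene index.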
This paper introduces the development of a search plug-in for library electronic resources based on the OpenSearch protocol. By integrating with users' existing browsing habits and information retrieval workflows, the plug-in brings library resources into the search bar of the user's browser, making electronic resources more accessible and usable.
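At the heart of such a plug-in is an OpenSearch 1.1 description document, an XML file the browser reads to learn how to send queries to the catalog. A minimal sketch, where the names and the `opac.example.edu` URL template are placeholders, not the paper's actual system:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<OpenSearchDescription xmlns="http://a9.com/-/spec/opensearch/1.1/">
  <ShortName>Library Search</ShortName>
  <Description>Search the library's electronic resources</Description>
  <!-- {searchTerms} is replaced by the user's query from the search bar -->
  <Url type="text/html"
       template="http://opac.example.edu/search?q={searchTerms}"/>
</OpenSearchDescription>
```

A browser typically discovers this descriptor through a `<link rel="search" type="application/opensearchdescription+xml" ...>` element in the library site's pages, after which the catalog appears as a selectable engine in the search bar.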