This paper gives a comprehensive overview of open source digital library software abroad, including significant improvements and extensions to open source software systems, the integration of multiple open source software systems, and the integration of open source software with other technologies.
This paper briefly surveys the state of the art in the construction and evolution of domain Ontologies. It describes the process of constructing a preliminary version of an economics Ontology from an existing Chinese classified thesaurus, and the approach used to evolve that preliminary version. The key techniques of Ontology evolution include creating a dataset for Ontology learning, determining candidate keywords, and discovering the concepts and relationships of the domain Ontology.
Following an introduction to the open source DSpace software, this paper presents a number of practical steps for meeting the actual application requirements of domestic institutions and adapting to the usage habits of domestic users. The steps include localizing several language-related DSpace system files into Chinese, adjusting the system interfaces, optimizing the system functions, and improving the mail server function. All of the steps were carried out on the Xiamen University Institutional Repository, which is built with DSpace.
This paper puts forward a framework and a set of functions for OAI-PMH-based interoperation of digital archives. After introducing EAD, one of the main metadata standards for digital archives, and its mapping to DC, which is generally supported by OAI-PMH, it discusses the technical principles of converting EAD into DC and, in particular, how the contextual information between subordinate EAD components can be preserved after conversion into OAI records. Finally, the existing problems in the process are analyzed and some solutions are proposed.
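As a rough illustration of converting EAD to DC while keeping component context, the following sketch flattens a simplified EAD component tree into one DC-like record per component and records each component's parent identifier in `relation`. The crosswalk table `EAD_TO_DC`, the function names, and the sample XML are assumptions made for this example; they are not the mapping or the procedure used in the paper, and real EAD files nest components under `<dsc>`.

```python
import xml.etree.ElementTree as ET

# Hypothetical crosswalk: EAD <did> child element -> unqualified DC element.
EAD_TO_DC = {
    "unittitle": "title",
    "unitdate": "date",
    "unitid": "identifier",
    "origination": "creator",
    "physdesc": "format",
}

def is_component(tag):
    # EAD allows unnumbered <c> and numbered <c01>..<c12> components.
    return tag == "c" or (len(tag) == 3 and tag[0] == "c" and tag[1:].isdigit())

def ead_to_dc_records(component, parent_id=None):
    """Flatten a component tree into DC-like dicts, one per component."""
    record = {}
    did = component.find("did")
    if did is not None:
        for child in did:
            dc_name = EAD_TO_DC.get(child.tag)
            text = "".join(child.itertext()).strip()
            if dc_name and text:
                record.setdefault(dc_name, []).append(text)
    # Keep the hierarchy: point each record at its parent's identifier.
    if parent_id is not None:
        record["relation"] = [parent_id]
    own_id = record.get("identifier", [parent_id])[0]
    records = [record]
    for sub in component:
        if is_component(sub.tag):
            records.extend(ead_to_dc_records(sub, own_id))
    return records

SAMPLE = """
<archdesc level="fonds">
  <did><unitid>F1</unitid><unittitle>Municipal Records</unittitle></did>
  <c level="series">
    <did><unitid>F1-S2</unitid><unittitle>Minutes</unittitle><unitdate>1950-1960</unitdate></did>
    <c level="file">
      <did><unitid>F1-S2-7</unitid><unittitle>Council meeting</unittitle></did>
    </c>
  </c>
</archdesc>
"""

if __name__ == "__main__":
    for rec in ead_to_dc_records(ET.fromstring(SAMPLE.strip())):
        print(rec)
```

Storing the parent identifier in `relation` is only one possible way to keep context; a harvester could follow these links to rebuild the hierarchy from otherwise flat OAI records.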
To meet the needs of personalized recommendation services and to address the high dimensionality and sparsity of user-document access data, this paper uses a dimension reduction method based on inter-user comparison and a K-hierarchical clustering algorithm to cluster users on the basis of a collection of users' resource evaluation data. On this basis, an experimental user clustering system is designed and developed with Java open source technology.
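The idea of clustering users from sparse access data can be sketched as follows. This is a generic average-linkage agglomerative clustering that stops at k clusters and compares users directly by the Jaccard distance between their visited-document sets; it is an assumed stand-in, not the paper's K-hierarchical algorithm or its dimension reduction method, and it is written in Python rather than Java. All names and data are hypothetical.

```python
from itertools import combinations

def jaccard_distance(a, b):
    """Distance between two users' sets of visited document ids."""
    union = a | b
    return 1.0 - len(a & b) / len(union) if union else 0.0

def cluster_users(visits, k):
    """Merge the two closest clusters (average linkage) until k remain.

    visits: dict mapping user id -> set of visited document ids.
    """
    clusters = [[user] for user in sorted(visits)]

    def linkage(c1, c2):
        total = sum(jaccard_distance(visits[a], visits[b]) for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > k:
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda pair: linkage(clusters[pair[0]], clusters[pair[1]]),
        )
        clusters[i] = clusters[i] + clusters[j]  # i < j, so deleting j is safe
        del clusters[j]
    return clusters

if __name__ == "__main__":
    visits = {
        "ann": {"d1", "d2", "d3"},
        "bob": {"d1", "d2"},
        "cat": {"d8", "d9"},
        "dan": {"d9"},
    }
    print(cluster_users(visits, 2))
```

Comparing users by the documents they share avoids building the full sparse user-document matrix, which is the practical motivation for user-to-user comparison in this setting.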
Interoperability among Knowledge Organization Systems (KOS) is one of the key technologies for cross-browsing and cross-searching. The paper introduces several research projects on interoperability among KOS, summarizes the methods they adopt, analyzes three examples, and puts forward some suggestions for realizing interoperability among KOS in China.
This paper studies several issues related to intelligent information retrieval. First, it proposes a method for calculating semantic similarity and relatedness using the taxonomic and entailment relations of an Ontology, through which query expansion can be implemented. Second, using the relations in the Ontology, keyword queries are standardized and reconstructed in the form of RDF. Finally, concrete tests and analysis show that the scheme is reasonable and valid.
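To make taxonomy-based similarity and query expansion concrete, here is a minimal sketch that uses the well-known Wu-Palmer measure on a toy taxonomy. The abstract does not specify the paper's formula, so Wu-Palmer is swapped in as an assumption; the taxonomy `PARENT`, the threshold, and the function names are hypothetical.

```python
# Hypothetical toy taxonomy: child concept -> parent concept (root has None).
PARENT = {
    "entity": None,
    "document": "entity",
    "book": "document",
    "journal": "document",
    "person": "entity",
    "author": "person",
}

def ancestors(concept):
    """Path from the concept up to the root, the concept itself first."""
    path = []
    while concept is not None:
        path.append(concept)
        concept = PARENT[concept]
    return path

def depth(concept):
    return len(ancestors(concept))  # the root has depth 1

def wu_palmer(a, b):
    """2 * depth(lowest common subsumer) / (depth(a) + depth(b))."""
    ancestors_of_b = set(ancestors(b))
    lcs = next(c for c in ancestors(a) if c in ancestors_of_b)
    return 2 * depth(lcs) / (depth(a) + depth(b))

def expand_query(term, threshold=0.6):
    """Add every concept whose similarity to the term reaches the threshold."""
    return sorted(c for c in PARENT if c != term and wu_palmer(term, c) >= threshold)

if __name__ == "__main__":
    print(wu_palmer("book", "journal"))
    print(expand_query("book"))
```

Here "book" expands to its parent "document" and its sibling "journal", while concepts in the "person" branch fall below the threshold.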
The paper puts forward an approach to Ontology-based rule classification. First, the approach creates an Ontology for each subclass of the classification system. Then, Web text classification is performed using the rules and the Ontologies. Experimental results indicate that, compared with the Rocchio method, the recall of the Ontology-based approach is slightly lower, but its precision is higher.
Based on the Vector Space Model (VSM) and Naive Bayes (NB), a multilayer, multi-class text categorization system is built. Four modules are described in detail: word segmentation and frequency statistics; the calculation relating categories to documents; correcting the accuracy of the parent class through correction of the subclass; and judging whether a document is multi-class and multi-label. Text representation based on the Vector Space Model achieves a MicroF1 of 89.7% for the parent category and 77.8% for the sub-category; text representation based on Naive Bayes achieves a MicroF1 of 67.6% for the parent category and 66.5% for the sub-category.
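For readers unfamiliar with the MicroF1 figures quoted above, the following sketch shows the standard micro-averaged F1 computation for multi-label output: true positives, false positives, and false negatives are pooled over all documents before precision and recall are computed. The function name and sample labels are illustrative only and do not reproduce the paper's data.

```python
def micro_f1(gold, predicted):
    """Micro-averaged F1 over documents.

    gold, predicted: lists of label sets, one set per document.
    """
    tp = fp = fn = 0
    for g, p in zip(gold, predicted):
        tp += len(g & p)   # labels correctly assigned
        fp += len(p - g)   # labels assigned but wrong
        fn += len(g - p)   # labels missed
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

if __name__ == "__main__":
    gold = [{"econ"}, {"law", "econ"}, {"art"}]
    predicted = [{"econ"}, {"law"}, {"law"}]
    print(micro_f1(gold, predicted))
```

Because counts are pooled, frequent categories weigh more heavily than rare ones, which is one reason parent-category and sub-category scores can differ noticeably.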
This paper first summarizes the formats of Chinese time words and numerals that appear in text. Based on these formats, it sets up a rule set for recognition, proposes a rule-based method for recognizing Chinese time words and numerals, and finally discusses its application value in competitive intelligence analysis and machine translation.
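A rule set of this kind can be sketched with regular expressions. The patterns below are a small, assumed subset written for illustration (dates such as 2008年5月12日, clock times such as 下午3点, and plain numerals); they are not the rules from the paper and will misfire on words like 十分 that contain numeral characters.

```python
import re

# Arabic digits plus common Chinese numeral characters (illustrative subset).
NUM = "[0-9零〇一二三四五六七八九十百千万亿两]+"

# Hypothetical time rules, tried in order; all inner groups are non-capturing.
TIME_RULES = [
    rf"{NUM}年(?:{NUM}月)?(?:{NUM}[日号])?",            # year[-month[-day]]
    rf"{NUM}月(?:{NUM}[日号])?",                         # month[-day]
    rf"{NUM}[日号]",                                     # day only
    rf"(?:上午|下午|凌晨|晚上)?{NUM}[点时](?:{NUM}分)?",  # clock time
]

# Time expressions take priority; leftover numeral runs are tagged NUM.
PATTERN = re.compile(rf"(?P<TIME>{'|'.join(TIME_RULES)})|(?P<NUM>{NUM})")

def recognize(text):
    """Return (label, matched text) pairs in order of appearance."""
    return [(m.lastgroup, m.group()) for m in PATTERN.finditer(text)]

if __name__ == "__main__":
    for label, span in recognize("2008年5月12日下午3点，公司售出三百二十台设备。"):
        print(label, span)
```

Placing the time rules before the bare numeral rule lets a number followed by a time unit be read as a time expression rather than as a standalone numeral.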
Formal Concept Analysis (FCA) and domain Ontologies are both knowledge representation formalisms that aim at modeling concepts. This paper proposes a method to compute the similarity between concepts in FCA. The experimental results show that this method is effective for concept similarity computation.
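In FCA a formal concept is a pair of an extent (its objects) and an intent (the attributes they share), so one simple way to compare two concepts is to combine the overlap of their extents with the overlap of their intents. The weighted Jaccard measure below is an assumed illustration of that idea, not the method proposed in the paper; the weight `w` and the sample concepts are hypothetical.

```python
def jaccard(a, b):
    """Overlap of two sets; two empty sets count as identical."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0

def concept_similarity(c1, c2, w=0.5):
    """Weighted mix of extent overlap and intent overlap.

    Each concept is an (extent, intent) pair of sets; w weights the extent.
    """
    (extent1, intent1), (extent2, intent2) = c1, c2
    return w * jaccard(extent1, extent2) + (1 - w) * jaccard(intent1, intent2)

if __name__ == "__main__":
    flying_birds = ({"sparrow", "eagle"}, {"flies", "has_feathers"})
    birds = ({"sparrow", "eagle", "penguin"}, {"has_feathers"})
    print(concept_similarity(flying_birds, birds))
```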
This paper presents a method for constructing a semantic distribution dictionary based on WordNet. After introducing the WordNet system and the SemCor corpus, it designs the structure of the semantic distribution dictionary. The contents of the sense.idx file and the taglist file are analyzed, and the procedure for constructing the semantic distribution dictionary from them is described in detail.
Following an introduction to information architecture (IA) and usability, this paper analyzes the relationship between the two concepts. It investigates 16 domestic public library Websites and tentatively develops an evaluation system suitable for public library Websites, then uses this evaluation system to test and evaluate the IA usability of the Shanghai public library Website.
This paper summarizes the definitions and characteristics of Ajax and RSS, and mainly explains their application in building the personalized portal site of the Tsinghua University Library.
After analyzing Web data, a statistical processing workflow is designed in accordance with the characteristics of the data. Each stage is tested with several different algorithms in order to find the best solution. The experiments show that efficiency and effectiveness can be improved by reducing I/O operations, increasing processing granularity, and using a lexicon.
Because current OPAC (Online Public Access Catalogue) search machines lack a dedicated operating system environment, this paper puts forward an OPAC machine solution that uses open source software. Unlike traditional machines based on Windows solutions, the OPAC machine uses free and open source software to build an open, standards-based platform, which reduces investment and is more efficient, stable, secure, and easy to maintain.
This paper presents a new method of music melody extraction, which discovers note boundaries and segments notes by sequentially scanning the pitch list and detecting pitch movement. Around 1,000 pieces of Chinese folk music played on an electronic keyboard have been processed, with a success rate of over 90%.
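The scanning idea can be sketched as follows: walk through a per-frame pitch list, and close the current note whenever the pitch moves away from the note's starting pitch or becomes unvoiced. The tolerance, the minimum note length, and the representation of pitch as MIDI semitone values are assumptions made for this sketch, not details taken from the paper.

```python
def segment_notes(pitches, tolerance=0.5, min_frames=3):
    """Split a per-frame pitch list into notes.

    pitches: list of pitch values in semitones, or None for unvoiced frames.
    Returns (start_frame, end_frame_exclusive, rounded_mean_pitch) tuples.
    """
    notes = []
    start = None
    for i, pitch in enumerate(pitches + [None]):  # trailing None closes the last note
        if start is not None:
            moved = pitch is None or abs(pitch - pitches[start]) > tolerance
            if moved:
                if i - start >= min_frames:  # drop very short glitches
                    segment = pitches[start:i]
                    notes.append((start, i, round(sum(segment) / len(segment))))
                start = None
        if start is None and pitch is not None:
            start = i  # a new note begins at this frame
    return notes

if __name__ == "__main__":
    track = [60.0, 60.1, 59.9, 60.0, 62.0, 62.1, 61.9, None, None,
             64.0, 64.0, 64.1, 64.0]
    print(segment_notes(track))
```

Comparing each frame with the note's first frame rather than with the previous frame keeps a slow drift from being absorbed into a single long note.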
To address problems such as differing data structures, inconsistent data formats, and non-standard data in the construction of dissertation databases, and drawing on our library's experience of merging two TRS doctoral and master's dissertation databases with different data structures, the paper introduces how to resolve such problems with VBA in Word and presents the actual program code.
This paper introduces the principles and characteristics of fingerprint technology, designs a fingerprint-based reader credential system, and discusses the cost, privacy, and fingerprint collection and comparison issues of applying fingerprint technology in the library.
By describing and analyzing the application of ALEPH 500, this paper introduces its main functions and characteristics as well as its localization development at Sichuan University Library. In addition, the paper offers an in-depth discussion of its existing problems.
Using content management technology, this paper proposes a design for a university student archive management system that aims to make remote use more convenient. It also describes how to implement a prototype system with IBM Content Manager v8.3, a middleware product of IBM Corp.