Highlights

  • He Guoxiu, Ren Jiayu, Li Zongyao, Lin Chenxi, Yu Haiyan
    Data Analysis and Knowledge Discovery. 2024, 8(4): 1-13. https://doi.org/10.11925/infotech.2096-3467.2023.0684

    [Objective] This study explores whether content-based deep detection models can identify the semantics of rumors. [Methods] First, we used the BERT model to identify the key features of rumors from benchmark datasets in Chinese and English. Then, we utilized two interpretable tools, LIME, based on local surrogate models, and SHAP, based on cooperative game theory, to analyze whether these features can reflect the nature of rumors. [Results] The key features calculated by the interpretable tools on different models and datasets showed significant differences, and it is challenging to decide the semantic relationship between the features and rumors. [Limitations] The datasets and models examined in this study need to be expanded. [Conclusion] Deep learning-based rumor detection models only work with the features of the training set and lack sufficient generalization and interpretability for diverse real-world scenarios.

  • Qi Xiaoying, Li Hanyu, Yang Haiping
    Data Analysis and Knowledge Discovery. 2024, 8(4): 76-87. https://doi.org/10.11925/infotech.2096-3467.2023.0081

    [Objective] This paper aims to achieve multi-semantic classification of maps and meet the needs for precise map retrieval and intelligence analysis. [Methods] We designed a map category system and proposed a multi-label map classification strategy. It realized the automatic classification of South China Sea maps based on the AlexNet convolutional neural network classification model. [Results] The F1 value of the proposed model is 0.979. This model can effectively realize the multi-label automatic classification of the South China Sea maps. [Limitations] The deep categories of multi-label annotated datasets need to be supplemented. [Conclusions] This paper provides a reference for the semantic-based scientific classification of maps, precise retrieval, and cross-category association.

  • Huang Taifeng, Ma Jing
    Data Analysis and Knowledge Discovery. 2024, 8(3): 77-84. https://doi.org/10.11925/infotech.2096-3467.2023.0004

    [Objective] This paper aims to improve the low accuracy of sentiment classification using the pre-trained model with insufficient samples. [Methods] We proposed a prompt learning enhanced sentiment classification algorithm, Pe(prompt ensemble)-RoBERTa. Unlike traditional fine-tuning methods, it modified the RoBERTa model with integrated prompts. The new model could understand the downstream tasks and extract the text’s sentiment features. [Results] We examined the model on several publicly accessible Chinese and English datasets. The average sentiment classification accuracy of the model reached 93.2% with fewer samples. Compared with fine-tuned and discrete prompts, our new model’s accuracy improved by 13.8% and 8.1%, respectively. [Limitations] The proposed model only processes texts for binary sentiment classification tasks. It did not involve the more fine-grained sentiment classification tasks. [Conclusions] The Pe-RoBERTa model can extract text sentiment features and achieve high accuracy in sentiment classification tasks.

  • Li Xuesi, Zhang Zhixiong, Wang Yufei, Liu Yi
    Data Analysis and Knowledge Discovery. 2024, 8(1): 1-15. https://doi.org/10.11925/infotech.2096-3467.2023.1280

    [Objective] Domain knowledge evolution analysis has been a long-standing research topic in the field of Library and Information Science. This paper provides a comprehensive review of the research methods related to domain knowledge evolution analysis, both nationally and internationally, aiming to offer valuable references for future studies in this area. [Coverage] We conducted searches in CNKI and Web of Science using keywords related to domain knowledge evolution. The search results were manually evaluated and analyzed, and a total of 84 key publications closely related to the methods of domain knowledge evolution analysis were selected for review. [Methods] By reviewing the research literature, we clarified the relevant concepts of domain knowledge evolution. Based on this, we classified the existing domain knowledge evolution analysis methods into three categories: citation-based, structure-based and content-based. For each category, we first elucidated the theoretical basis, then explained the basic analytical framework and highlighted the relevant advances. Finally, we summarized the existing methods of domain knowledge evolution analysis and provided perspectives. [Results] The three categories of existing methods for domain knowledge evolution analysis rely on their respective scientific theories. With the advancement of technology and the improvement of data resources, these methods are continuously deepening and improving the analytical framework for the study of evolution. Although significant research achievements have been made, there has been no breakthrough in the research perspective of knowledge evolution analysis, and the limitations within the current research paradigm remain unresolved. [Limitations] The review analysis was based on selected literature, which may not have comprehensively covered all relevant research. [Conclusions] Based on the summary and analysis of the current research, we believe that two directions are worth focusing on in future research on domain knowledge evolution analysis: first, exploring new entry points for domain knowledge evolution analysis, and second, attempting to integrate existing research methods to overcome the limitations of current analytical approaches.

  • Fu Yun, Zhu Liya, Li Dan, Sun Mengge, Zhang Jianfeng, Liu Xiwen
    Data Analysis and Knowledge Discovery. 2024, 8(1): 30-39. https://doi.org/10.11925/infotech.2096-3467.2023.0867

    [Objective] This study addresses the unified representation issue of experimental operation verbs in synthetic experiment protocols, which provides high-quality experimental protocol data for science intelligence and robotics. [Methods] We utilized a collaborative approach driven by data and expert knowledge to identify and standardize experimental operation verbs from literature and patent texts related to synthesis. First, we used advanced open-source large models like ChatGLM2-6B to identify experimental operation verbs. Then, we combined Wu-Palmer and cosine similarity to standardize these verbs. Finally, we assessed their classification accuracy with expert knowledge. [Results] The study identified 149 operation verbs for inorganic synthetic experiments and 141 operation verbs for organic synthetic experiments. Expert judgment revealed that many of the 124 operation terms appearing in both groups do not possess distinct category characteristics. Therefore, we merged the two categories to have 166 experimental operation verbs representing the operations in organic, inorganic, and hybrid synthesis experiments. [Limitations] The study only employed basic prompt engineering techniques to direct the large model to recognize experimental operation verbs from publicly accessible datasets. This study focused on operation terms involved in synthesis, engineering, and basic steps without considering operation terms in dynamic, analytical, and name reactions. [Conclusions] This study establishes a unified language for representing experimental operations in synthesis, applicable to organic, inorganic, and hybrid synthesis reactions. It could inform the future development of scientific robotics experiments.

  • Bao Tong, Zhang Chengzhi
    Data Analysis and Knowledge Discovery. 2023, 7(9): 1-11. https://doi.org/10.11925/infotech.2096-3467.2023.0473

    [Objective] This paper evaluates the performance of ChatGPT on typical Chinese information extraction tasks such as named entity recognition, relationship extraction, and event extraction. It also analyzes the performance differences of ChatGPT in different tasks and domains, which provides recommendations for ChatGPT in Chinese contexts. [Methods] We used manual prompts to evaluate the test results with exact matching or loose matching on three typical information extraction tasks across seven datasets. We evaluated the named entity recognition of ChatGPT on MSRA, Weibo, Resume, and CCKS2019 datasets and compared it with GlyceBERT and ERNIE3.0 models. We extracted the relationships with ChatGPT and ERNIE3.0 Titan on FinRE and SanWen datasets. We ran the event extraction of ChatGPT and ERNIE3.0 on the CCKS2020 dataset. [Results] In the named entity recognition task, ChatGPT was outperformed by GlyceBERT and ERNIE3.0 models. ERNIE3.0 Titan also significantly outperformed ChatGPT in the relationship extraction task. In the event extraction task, ChatGPT’s performance was slightly better than ERNIE3.0 under loose matching. [Limitations] The evaluation of ChatGPT’s performance using prompts is subjective, and different prompts may lead to different results. [Conclusions] ChatGPT needs to improve its performance on typical Chinese information extraction tasks, and users should choose appropriate prompts for better results.

  • Original Article
    Zhu Peng, Zhao Xiaoxiao, Wu Wei
    Data Analysis and Knowledge Discovery. 2017, 1(3): 1-9. https://doi.org/10.11925/infotech.2096-3467.2017.03.01

    [Objective] This paper explores the impacts of motivation, product types, and marketing strategy, as well as their interactions, on the shopping preferences of mobile e-commerce consumers. [Methods] We used a scenario-based questionnaire to collect the needed data. [Results] We found that the interaction between product types and marketing strategy had significant effects on mobile e-commerce consumers’ purchase preferences. [Limitations] We did not include other influencing factors such as product involvement, individual cognitive demand and perceived risks in this study. [Conclusions] This paper provides advice to mobile e-commerce product vendors from the perspectives of consumers, products and marketing strategies.

  • Original Article
    Ye Guanghui, Xia Lixin
    Data Analysis and Knowledge Discovery. 2017, 1(2): 1-10. https://doi.org/10.11925/infotech.2096-3467.2017.02.01

    [Objective] This paper reviews the expert retrieval and expert ranking literature to provide theoretical foundations for future studies. [Coverage] 65 papers were retrieved from the Web of Science (WOS), CNKI and other databases using the keywords of “expert retrieval”, “expert ranking”, and “ranking fusion”. [Methods] We analyzed research evaluating expert retrieval and fusion rankings, aiming to solve the issues of insufficiency of expert coverage and heavy computation of expert features. [Results] We found that most expert retrieval systems adopted the relationship attribute fusion method, and the credibility of search results was decided by the users’ satisfaction and the quality of the retrieved documents. Expert ranking was established by FRM, PageRank, D-S theory, social network and complex network analysis. Empirical research showed that the fusion ranking results were generally better than the baseline ones. [Limitations] More comparisons among different ranking methods are needed. [Conclusions] Related studies help us build expert consulting platforms from the perspective of expert information organization, expert selection and expert opinion fusion.

  • Original Article
    New Technology of Library and Information Service. 2016, 32(1): 1-2. https://doi.org/10.11925/infotech.1003-3513.2016.01.01
  • Original Article
    Feng Liu, Xiaolin Zhang
    New Technology of Library and Information Service. 2016, 32(1): 11-16. https://doi.org/10.11925/infotech.1003-3513.2016.01.03

    [Objective] This paper proposes a detailed structure specification for scientific data management plans and, in accordance with it, constructs a data curation model from the operational perspective. [Methods] We surveyed and compiled statistics on the scientific data management plan specifications of major research and management agencies worldwide, and supplemented them according to the requirements and characteristics of current scientific research data management. [Results] This paper forms a detailed structure specification of the data management plan with 8 major basic elements and 39 sub-elements, and constructs a data curation model that takes the data management plan as the core driver. [Conclusions] The detailed structure specification of the data management plan can regulate and guide scientific data management activities completely and accurately. It can also effectively control and constrain the data curation process across the whole life cycle of scientific research at the operational level.

  • Original Article
    Heng Ding, Wei Lu
    New Technology of Library and Information Service. 2016, 32(1): 17-23. https://doi.org/10.11925/infotech.1003-3513.2016.01.04

    [Objective] This paper summarizes the fundamental strategies and core issues in correlation-based Cross-Modal Information Retrieval (CMIR), and examines the pros and cons of using partial least squares for feature subspace projection in order to improve retrieval performance. [Methods] Based on the Wikipedia CMIR dataset, LDA and BOW models are used as feature representations of text and image resources, cosine distance is used as the similarity measure, and the partial least squares method is used to learn the subspace projection function in place of canonical correlation analysis. [Results] We compared the influence of three feature subspace projection methods (canonical correlation analysis, partial least squares regression, and partial least squares correlation) on CMIR results using three retrieval evaluation indicators: P@K, MAP and NDCG. The results show that partial least squares correlation obtains the best results. [Limitations] In dealing with data, the partial least squares method assumes a linear relationship between the data and an orthogonal relationship between the data base vectors; therefore, non-linear and non-orthogonal problems cannot be solved. [Conclusions] Feature subspace projection learned with partial least squares correlation is more consistent with the original spatial information, and the CMIR results are more stable.

  • article
    Zhou Ning, He Jian
    New Technology of Library and Information Service. 2010, 26(7/8): 3-8. https://doi.org/10.11925/infotech.1003-3513.2010.07-08.02

    This article discusses the theoretical methods and technologies of an information visualization prototype system. It specifically discusses the construction strategy of the visualization model, and the environmental configuration, functional modules and operation method of the prototype system. It also discusses the construction of text, voice (audio) and image information visualization models, data preparation and data scale, and the operation interface and running results. The research on this prototype system is not only a helpful attempt at studying visualization models for general information resources management, with certain experience gained, but also a successful exploration of the visualization of Chinese information.

  • article
    Wu Jiaxin, Wang Jianhai
    New Technology of Library and Information Service. 2010, 26(7/8): 9-14. https://doi.org/10.11925/infotech.1003-3513.2010.07-08.03

    Based on situation awareness theory, this paper discusses the relationships between situation awareness and visualization, and then constructs a visualization awareness model. The model consists of 5 stages: situation awareness requirements analysis, extraction of data and knowledge, situation visualization & interaction, situation awareness, and decision making & enforcement. Finally, the key problems of visualization awareness are presented.

  • article
    Xu Jian, Zhang Zhixiong, Xiao Zhuo, Deng Zhaojun
    New Technology of Library and Information Service. 2010, 26(7/8): 51-57. https://doi.org/10.11925/infotech.1003-3513.2010.07-08.10

    Based on an analysis of recent related literature and projects, the paper groups term semantic similarity measure methods into two types: methods based on corpus characteristics and methods based on open knowledge resources. It then reviews methods for integrating multiple measures. It also summarizes the applications of term semantic similarity measure methods in Natural Language Processing (NLP) and Knowledge Mining (KM). Finally, the future development of research on term similarity measures is discussed to help build more efficient term similarity calculation systems.

  • article
    Zeng Xinhong, Huang Huajun, Lin Weiming
    New Technology of Library and Information Service. 2010, 26(7/8): 58-65. https://doi.org/10.11925/infotech.1003-3513.2010.07-08.11

    This paper studies the implementation of network-based retrieval and reasoning for an Ultra-Large-Scale OntoThesaurus. The proposed solution has been successfully applied in CCT1_OTCSS, a co-construction and sharing system for an Ultra-Large-Scale Ontology named CCT1_OntoThesaurus. This paper proposes a Lucene index structure based on the RDF triple “subject, predicate, object”, and validates the feasibility of implementing efficient retrieval, terminology services and reasoning based on the Lucene index of the Ultra-Large-Scale OntoThesaurus. The solution can be reused for several Ultra-Large-Scale Chinese thesauri that are currently the most widely used in China, quickly implementing Ontology-oriented upgrading, networked co-construction, sharing and dynamic updating for them. It also has reference value for other large-scale knowledge organization systems (thesauri, Ontology, etc.) in the form of XML, RDF or OWL at home and abroad.

  • article
    Dong Xijing
    New Technology of Library and Information Service. 2010, 26(3): 1-7. https://doi.org/10.11925/infotech.1003-3513.2010.03.01

    The paper briefly introduces the ISO 15511-ISIL:2003 encoding rules and the library ID data elements from ISO/FDIS 28560. It discusses the structure of the Chinese version of ISIL, then offers suggestions for the registration system and for applying ISIL compaction encoding in the ISO/DIS 28560 data elements.

  • article
    Teng Guangqing, Bi Qiang
    New Technology of Library and Information Service. 2010, 26(3): 8-12. https://doi.org/10.11925/infotech.1003-3513.2010.03.02

    Based on concept lattice theory, this paper attempts to set up a flexible rule mining mechanism through formal concept analysis. Using the association rules extracted with this mechanism, it conducts a detailed market segmentation of digital library users’ usage to better meet users’ individual demands.

  • article
    Chang Zhirong, Ma Ziwei, Li Gaohu
    New Technology of Library and Information Service. 2010, 26(3): 19-26. https://doi.org/10.11925/infotech.1003-3513.2010.03.04

    This paper proposes the design of a Nutch-based website harvest and service system for special fields under the framework of digital library systems integration. It introduces an information filtering module, a dictionary-based Chinese analyzer module, a GUI information module, a topic-knowledge based information processing module, and Webservice-based search service modules to improve the function and performance of the system. It focuses on text parsing filters, plugin development and applications of the level-automatic clustering of the search results. Finally, integration with other subsystems in the digital library is realized through the Webservice interface, which can provide comprehensive and professional services.

  • article
    Dou Yumeng
    New Technology of Library and Information Service. 2010, 26(3): 27-32. https://doi.org/10.11925/infotech.1003-3513.2010.03.05

    This paper focuses on the tags used in Web collaborative tagging and mainly studies tag meaning disambiguation methods, which are classified into five types: data mining methods, statistical methods, knowledge organization tool methods, control mechanism methods and visualization component methods. The five methods are compared in five aspects: users’ participation, disambiguation occasion, disambiguation property, experiment and application, and development prospects.

  • article
    Bai Haiyan
    New Technology of Library and Information Service. 2010, 26(3): 33-39. https://doi.org/10.11925/infotech.1003-3513.2010.03.06

    Based on the principles and publishing methods of linked data, this article introduces and analyzes some technical issues of DBpedia. DBpedia extracts structured data from Wikipedia’s free-text articles and expresses the data in RDF through syntax parsing of WikiText and workflow control. It also provides Web data in several ways, such as URI dereferencing, SPARQL-based searching and RDF dumps. Finally, the paper uses automatic interlinking methods based on schema or property algorithms to create links with a large number of datasets.

  • article
    Guo Wenli, Zhang Xiaolin
    New Technology of Library and Information Service. 2010, 26(2): 1-6. https://doi.org/10.11925/infotech.1003-3513.2010.02.01

    To satisfy the various Ontology demands from different perspectives and hierarchies, this paper proposes a new description of Ontology modules based on granularity theory, by which the users can easily extract the modules from existing large Ontologies. Through combining granular computing and faceted classification, this paper defines the granular properties of the Ontology and gives the definition of Ontology granular partitioning as well as its semantic explanation.

  • article
    Teng Guangqing, Bi Qiang
    New Technology of Library and Information Service. 2010, 26(2): 7-11. https://doi.org/10.11925/infotech.1003-3513.2010.02.02

    Based on concept lattice theory and drawing on the market segmentation variables used in marketing, this article develops a market segmentation of digital library users by means of the conceptual clustering of formal concept analysis. The authors also investigate the construction of an elastic segmentation mechanism that breaks through the traditional statistical treatment of digital library users.

  • article
    Yao Fei, Jiang Airong
    New Technology of Library and Information Service. 2010, 26(2): 12-16. https://doi.org/10.11925/infotech.1003-3513.2010.02.03

    This paper briefly introduces the Planets project and describes in detail its preservation planning, content characterisation, preservation action, interoperability framework and testbed. The authors believe that Planets provides various tools and services needed for long-term preservation, facilitates the development of the long-term preservation of digital resources, and is a good reference for China.

  • article
    Zhang Yunzhong, Xu Baoxiang
    New Technology of Library and Information Service. 2010, 26(2): 17-23. https://doi.org/10.11925/infotech.1003-3513.2010.02.04

    To improve information system modeling theory based on FCA, this paper defines the core issues that information system modeling theory should solve, analyzes the advantages of using FCA to resolve these issues, and points out the directions for applying FCA in the process of information system modeling. It then presents an FCA-based information system modeling theory and elaborates, with examples, the method of dividing a system into subsystems and the principles for constructing static, dynamic and functional models using FCA.

  • article
    Bai Haiyan, Zhu Lijun
    New Technology of Library and Information Service. 2010, 26(2): 44-49. https://doi.org/10.11925/infotech.1003-3513.2010.02.08

    This paper introduces three automatic interlinking approaches: mapping based on entity text, mapping based on graph similarity, and mapping based on rules. Mapping based on entity text is the basic approach, and mapping based on graph similarity is an extension of single-triple comparison. These two approaches are general and common methods, but the relationship types they can create are very limited. Rule-based interlinking can create richer and more complex relationship types, but it depends on specific data models and related rules.

  • article
    Zhang Chengzhi
    New Technology of Library and Information Service. 2009, 3(2): 1-8. https://doi.org/10.11925/infotech.1003-3513.2009.02.01

    This paper presents the research background and related work on Document Clustering Description (DCD). The relationship between DCD and automatic indexing, automatic summarization, and conceptual clustering is explained, and the research content of DCD is defined. According to its requirements, the tasks of DCD are formalized. The evaluation methods of DCD are also described in this paper.

  • article
    Dou Yumeng, Zhao Danqun
    New Technology of Library and Information Service. 2009, 3(2): 9-17. https://doi.org/10.11925/infotech.1003-3513.2009.02.02

    This paper focuses on collaborative tagging systems and reviews the articles in this field at three levels: research on theories, empirical studies, and research on experiments and applications. Finally, the work done in this article is summarized, and the future development of research on collaborative tagging systems is discussed.

  • article
    Lu Shengjun, Li Fayong, Qian Jianjun, Zhen Zhen
    New Technology of Library and Information Service. 2009, 3(2): 18-22. https://doi.org/10.11925/infotech.1003-3513.2009.02.03

    This paper introduces an Ontology integration approach, WCONS+, which consists of preparation, mapping, integration and checking phases. An experiment integrating a military aircraft Ontology and an electronic warfare equipment Ontology is presented, and the results show the effectiveness of WCONS+.

  • article
    Jiang Caihong, Qiao Xiaodong, Zhu Lijun
    New Technology of Library and Information Service. 2009, 3(2): 23-28. https://doi.org/10.11925/infotech.1003-3513.2009.02.04

    This paper analyzes Chinese patent abstracts about alternative energy vehicles using a knowledge engineering method, and puts forward an Ontology-based knowledge extraction model for Chinese patent abstracts. The main stages in building the model are constructing a corresponding Ontology, collecting a related word list, and writing corresponding rules. These rules are used to extract the underlying knowledge in patent abstracts. The result aids in the automatic construction of a patent knowledge base. This paper is an attempt to organize unstructured information and to automatically construct a knowledge base, and it verifies the feasibility of Ontology-based knowledge extraction from patent abstracts.

  • article
    Li Hua, Wu Zhenxin, Guo Jiayi, Xiang Jing
    New Technology of Library and Information Service. 2009, 3(1): 2-9. https://doi.org/10.11925/infotech.1003-3513.2009.01.02

    This paper reviews the development history of Web Archive and analyzes the progress and characteristics of its three stages: initial experiment, deployment, and application. By summing up Web Archive research and international practice in recent years, the authors give an initial view of the future development trend of Web Archive, hoping to provide a valuable reference for Chinese Web archive research.

  • article
    Liu Lan, Wu Zhenxin, Zhang Zhixiong, Xu Lin
    New Technology of Library and Information Service. 2009, 3(1): 10-15. https://doi.org/10.11925/infotech.1003-3513.2009.01.03

    This paper summarizes three commonly used harvest strategies in Web Archive: the integrity harvest, selective harvest and hybrid harvest. It then comparatively analyzes the characteristics, key issues and representative projects of the various harvest strategies. Finally, some key factors to consider in choosing a harvest strategy are analyzed and general recommendations are made.

  • article
    Wu Zhenxin, Zhang Zhixiong, Sun Zhiru
    New Technology of Library and Information Service. 2009, 3(1): 28-33. https://doi.org/10.11925/infotech.1003-3513.2009.01.06

    This article introduces current applications of web archive resources, and then, from the perspective of data mining, analyzes and sums up the in-depth applications of web archive resources.

  • article
    Li Feng, Li Chunwang
    New Technology of Library and Information Service. 2009, 3(1): 44-49. https://doi.org/10.11925/infotech.1003-3513.2009.01.07

    This paper summarizes and discusses Mashup technology, including system framework, resource acquiring technology, representation component technology, server technology and merging technology. The resource acquiring technology includes Web Feed, API, REST protocol and screen scraping. The representation component is classified into Portlet and Widget. The server technology is illustrated by Kapow Mashup Server. The discussion of merging technology emphasizes merging schema, programming language and Mashup tools. Finally, it points out the existing problems and future research directions.

  • article
    Zeng Su, Ma Jianxia, Tang Tianbo, Han Ke
    New Technology of Library and Information Service. 2009, 3(1): 50-57. https://doi.org/10.11925/infotech.1003-3513.2009.01.08

    Based on a survey of researchers, librarians and decision makers in the Chinese Academy of Sciences and some domestic universities, the authors compare how the different roles differ in their cognition of and requirements for IR. The paper elaborates the problems of planning and constructing IR in China to provide a reference for the implementation of IR in domestic scientific research institutions and universities.

  • article
    Xu Jian, Zhang Zhixiong
    New Technology of Library and Information Service. 2009, 25(4): 1-6. https://doi.org/10.11925/infotech.1003-3513.2009.04.01
    Abstract (1832) PDF (1680) HTML (3)   Knowledge map   Save

    The paper analyzes typical open source Web crawling software, such as Nutch, Heritrix, WCT, and Web-Harvest. Based on this analysis, it proposes a targeted website harvest system built on Nutch. Four key issues of the system are discussed in detail: initial seed website selection, harvest process management, web page content denoising, and discovery of new seed websites.

  • article
    Bai Haiyan, Jiang Bo
    New Technology of Library and Information Service. 2009, 25(4): 7-13. https://doi.org/10.11925/infotech.1003-3513.2009.04.02
    Abstract (2121) PDF (982) HTML (3)   Knowledge map   Save

    This article analyzes the levels and structure of knowledge organization systems (KOS) in digital libraries, emphasizing four components: KOS building and management, KOS interoperation, KOS storage and administration, and semantic metadata generation. Related open source software is selected, and the application of each component in the digital library knowledge organization process is introduced. Finally, the article presents a practical example of building a knowledge organization system in a digital library.

  • article
    Ma Jianxia
    New Technology of Library and Information Service. 2009, 25(4): 33-39. https://doi.org/10.11925/infotech.1003-3513.2009.04.07
    Abstract (1428) PDF (623) HTML (4)   Knowledge map   Save

    This paper introduces several standards for compound digital objects: METS, MPEG-21 DIDL, and OAI-ORE. It analyzes the basic data models, applications, and characteristics of these standards and compares how they process digital objects.

  • article
    Lai Maosheng, Qu Peng
    New Technology of Library and Information Service. 2009, 25(4): 50-56. https://doi.org/10.11925/infotech.1003-3513.2009.04.10
    Abstract (2069) PDF (1036) HTML (3)   Knowledge map   Save

    The paper analyzes query logs from the Sogou search engine for March 2007. POS tagging is used to characterize the high-frequency POS results. Web users rely primarily on nouns, supplemented by verbs, in Web queries; other parts of speech seldom appear. Function words in natural language, such as “的”, seldom appear in the high-frequency POS results. Web search queries differ from natural language in syntax to a certain degree, while also sharing some characteristics with it. Web users use nouns for concept-focused retrieval, and keywords remain the primary method of searching the Web. The high-frequency POS tagging results partially obey Zipf’s law.

  • article
    Li Guangjian
    New Technology of Library and Information Service. 2009, 25(6): 2-7. https://doi.org/10.11925/infotech.1003-3513.2009.06.02
    Abstract (1592) PDF (639) HTML (7)   Knowledge map   Save

    After introducing the background of the NSTL Service Integration System Embedded in Information Institutions, this paper proposes the design scheme, architecture, and main functions of the system, and then discusses the key technologies for implementing it with respect to resource integration, service integration, and distributed knowledge base management. Finally, the paper briefly reports on the application of the system in a library.

  • article
    Qi Huiying, Mu Qiujiang, Li Yazi
    New Technology of Library and Information Service. 2009, 25(6): 8-13. https://doi.org/10.11925/infotech.1003-3513.2009.06.03
    Abstract (1523) PDF (716) HTML (7)   Knowledge map   Save

    Based on the implementation of the NSTL Service Integration System Embedded in Information Institutions, this paper introduces the methods of system interoperability and result merging used during integration and discusses the related technologies. Finally, the paper presents the test results of system performance. Practice shows that the methods are feasible.