Data Analysis and Knowledge Discovery

Characteristics and Development Trends of Papers from “New Technology of Library and Information Service”

Wang Yuefen, Jin Jialin

New Technology of Library and Information Service. 2016, 32(9): 1-16. https://doi.org/10.11925/infotech.1003-3513.2016.09.01

[Objective] This study analyses the characteristics and development trends of papers published in the journal “New Technology of Library and Information Service” over the past 10 years, and offers suggestions for the journal’s future development. [Methods] First, we retrieved papers from “New Technology of Library and Information Service” and similar journals indexed by the CNKI, Wanfang Data and WOS databases. Second, we compared the characteristics of these papers. [Results] Technology-oriented papers published by “New Technology of Library and Information Service” showed strong support for library and information services. [Limitations] We determined the papers’ themes from their keywords rather than their full text. [Conclusions] “New Technology of Library and Information Service” should keep its characteristics and promote the development of research and practice in library and information technologies.

Review of Digital Documents Automatic Classification Research

Li Xiangdong, Ba Zhichao, Gao Fan

New Technology of Library and Information Service. 2016, 32(9): 17-26. https://doi.org/10.11925/infotech.1003-3513.2016.09.02

[Objective] This paper discusses the existing issues in, and possible solutions for, the automatic classification of digital documents (e.g. library bibliographies, news pages and social media posts). [Coverage] We reviewed literature on feature semantics conversion, feature expansion and weighting strategies from the field of automatic classification based on machine learning. [Methods] We analyzed the leading studies, key technologies, current achievements, and future directions in the published articles. [Results] We identified limitations of previous studies in the semantic representation of texts and the utilization of knowledge bases. [Limitations] We did not discuss the classification algorithms. [Conclusions] To improve the effectiveness of automatic classification of digital documents, future research could combine the Vector Space Model with the Probabilistic Topic Model, use knowledge bases to improve concept similarity computing, and construct a composite weighting strategy.

A Knowledge Supply-Demand Simulation System for Collaborative Innovation

Wu Jiang, Chen Jun, Zhang Jinfan

New Technology of Library and Information Service. 2016, 32(9): 27-33. https://doi.org/10.11925/infotech.1003-3513.2016.09.03

[Objective] This paper investigates the network environments facing knowledge-based teams and their impacts on job performance. [Methods] First, we constructed a Knowledge Supply & Demand System from the perspective of micro-level knowledge management, using multi-agent based simulation technology. Second, we added time and financial costs as the criteria for performance evaluation. We developed this new system with Python NetworkX. [Results] We found that large organizations reduced innovation costs more than their small counterparts. Increasing the number of nodes in the neighborhood of individuals did not improve innovation efficiency. Once the average number of fields exceeded a certain threshold, the cost of innovation began to rise. [Limitations] The study did not optimize interactions among individuals for collaborative innovation. [Conclusions] The proposed Knowledge Supply & Demand System simulates the knowledge integration process of an organization at the micro level. The new system helps us understand knowledge management, improve the efficiency of knowledge utilization, and reduce the cost of innovation.

Using Semantic Model to Build Lexical Chains

Qu Yunpeng, Wang Wenling

New Technology of Library and Information Service. 2016, 32(9): 34-41. https://doi.org/10.11925/infotech.1003-3513.2016.09.04

[Objective] This paper uses Distributional Semantics to build high-quality lexical chains. [Methods] First, we built an algorithm using the WordNet thesaurus to compute the semantic relations among language units of the texts. Second, we adopted the Distributional Memory Model to compute their latent semantic relations. Finally, we combined these relations to build the lexical chains, which were examined with papers from medical science. [Results] The proposed algorithm described the papers’ topics better than the non-greedy methods. [Limitations] The efficiency of the algorithm needs to be improved. It should also be examined with papers from other fields. [Conclusions] The proposed model can detect latent semantic relations and thereby improve the quality of lexical chains built with phrases.

Identifying Optimal Topic Numbers from Sci-Tech Information with LDA Model

Guan Peng, Wang Yuefen

New Technology of Library and Information Service. 2016, 32(9): 42-50. https://doi.org/10.11925/infotech.1003-3513.2016.09.05

[Objective] This paper tries to identify the optimal number of topics for the Latent Dirichlet Allocation (LDA) model to analyze scientific and technical information. [Methods] First, we used topic similarity to measure the differences among the latent topics. Second, we proposed a method for determining the optimal number of topics and applied it to Chinese literature in the field of new energy. [Results] The proposed method achieved a higher precision ratio and a higher F-score in topic extraction, which improved the performance of literature recommendation systems. [Limitations] We did not examine the new method with other datasets, such as microblog posts and XML documents. [Conclusions] The proposed method could identify more recognizable topics and improve the performance of scientific and technical literature recommendation systems.
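
To illustrate how topic similarity can guide the choice of topic number, the sketch below scores each candidate number of topics by the average pairwise cosine similarity of its topic-word distributions and picks the candidate whose topics overlap least. This is one common heuristic and is assumed here for illustration only; the abstract does not specify the authors' exact criterion. The function names and the toy distributions are hypothetical.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def average_topic_similarity(topics):
    """Mean cosine similarity over all pairs of topic-word distributions."""
    pairs = list(combinations(topics, 2))
    if not pairs:
        return 0.0
    return sum(cosine(u, v) for u, v in pairs) / len(pairs)


def pick_topic_number(models):
    """models maps a candidate topic number K to its topic-word distributions.

    Returns the K whose topics are, on average, least similar to each other
    (an assumed selection rule, not necessarily the paper's).
    """
    return min(models, key=lambda k: average_topic_similarity(models[k]))


# Toy, hand-made distributions over a two-word vocabulary (hypothetical data).
candidates = {
    2: [[1.0, 0.0], [0.0, 1.0]],
    3: [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]],
}
best_k = pick_topic_number(candidates)
```

In practice the distributions would come from LDA models trained with each candidate K, and a single-topic model should be excluded because it trivially has no overlapping pairs.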

Analyzing Travelers’ Preferences for Hotels Based on Structural Topic Model

Yang Haixia, Wu Weifang, Sun Hanlin

New Technology of Library and Information Service. 2016, 32(9): 51-57. https://doi.org/10.11925/infotech.1003-3513.2016.09.06

[Objective] This paper aims to identify various types of travelers’ preferences for hotel services. [Methods] First, we classified the hotels as luxury and budget ones, and then divided the travelers into five categories. Second, we analyzed individual travelers’ rating behaviors on the hotel review website TripAdvisor. Finally, we analyzed the latent topics of hotel reviews with the help of the Structural Topic Model (STM) to identify travelers’ preferences for hotel services. [Results] We found that the average rating scores of luxury hotels were higher than those of budget ones, and that travelers did have different preferences for hotel services. [Limitations] The dataset for our study was not large enough. We did not consider the impacts of gender and age on hotel ratings and online review contents. [Conclusions] Analyzing travelers’ preferences for hotels could help both managers and travelers make the right decisions.

Identifying Core Users in Social Resource Recommendation System with K-shell Collapse Sequences

Wu Huijuan, Jia Tina Du, Sun Hongfei, Jannatul Fardous

New Technology of Library and Information Service. 2016, 32(9): 58-64. https://doi.org/10.11925/infotech.1003-3513.2016.09.07

[Objective] This study aims to identify the core users in social minority groups with the help of social network behavior analysis techniques, and then improve social resource recommendation services. [Methods] First, we collected 1,208 user tags from the website of Douban Reading, and built a co-occurrence matrix for the top 100 tags. Second, we analyzed these users’ K-shell network structure and then investigated the volatility of its collapse sequences. [Results] We found new core users in the social minority group using the proposed method. [Limitations] The sample was relatively small and came from only one specific field. The K-shell analysis method needs to be modified to improve the result ranking. [Conclusions] The proposed method could help social media administrators develop new resource recommendation strategies, and promote the development of social networking systems.
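
For readers unfamiliar with K-shell analysis, the following is a minimal sketch of standard K-shell decomposition on an undirected graph: nodes are peeled away in order of increasing degree, and each node is labeled with the shell at which it is removed. It covers only the basic decomposition, not the collapse-sequence volatility analysis described in the abstract. The function name and the toy edge list are hypothetical.

```python
from collections import defaultdict


def k_shell(edges):
    """Return {node: shell index} by repeatedly peeling lowest-degree nodes."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)

    degree = {node: len(neighbors) for node, neighbors in adj.items()}
    remaining = set(adj)
    shell = {}
    k = 0
    while remaining:
        # The shell index never decreases as peeling proceeds.
        k = max(k, min(degree[n] for n in remaining))
        queue = [n for n in remaining if degree[n] <= k]
        while queue:
            node = queue.pop()
            if node not in remaining:
                continue  # already peeled via an earlier queue entry
            remaining.discard(node)
            shell[node] = k
            # Removing a node lowers the degree of its surviving neighbors.
            for nb in adj[node]:
                if nb in remaining:
                    degree[nb] -= 1
                    if degree[nb] <= k:
                        queue.append(nb)
    return shell


# Toy graph: a triangle (a, b, c) with one pendant node d attached to a.
shells = k_shell([("a", "b"), ("b", "c"), ("a", "c"), ("a", "d")])
```

Nodes in the highest shell form the densely connected core; in the toy graph the triangle sits in shell 2 and the pendant node in shell 1.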

New Collaborative Filtering Recommendation Algorithm Based on User Rating Time

Li Daoguo, Li Lianjie, Shen Enping

New Technology of Library and Information Service. 2016, 32(9): 65-69. https://doi.org/10.11925/infotech.1003-3513.2016.09.08

[Objective] This paper tries to solve the problems that sparse data and few co-rated items among users cause for traditional collaborative filtering algorithms, and thereby improve the accuracy of score prediction systems. [Methods] First, we identified users with similar scoring behaviors based on their scoring time. Second, we integrated the similarity of user score variance into the similarity calculation. [Results] The new algorithm reduced the MAE by 2% compared to the traditional algorithm and improved the performance of the recommendation system. [Limitations] The proposed algorithm was only examined with the MovieLens dataset and needs to be tested on other datasets. [Conclusions] The proposed algorithm can improve the effectiveness of recommendation systems.
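
As a reminder of how the reported metric works, the sketch below computes the mean absolute error (MAE) between predicted and actual ratings and the relative reduction between two MAE values. Reading the reported 2% as a relative reduction is an assumption; the abstract does not say whether the figure is relative or absolute. The function names and the sample ratings are hypothetical.

```python
def mean_absolute_error(predicted, actual):
    """Average absolute difference between predicted and actual ratings."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


def relative_reduction(baseline_mae, new_mae):
    """Fraction by which new_mae improves on baseline_mae (0.02 means 2%)."""
    return (baseline_mae - new_mae) / baseline_mae


# Hypothetical ratings on a 1-5 scale, not taken from the paper.
actual = [5, 3, 3]
baseline_predictions = [4, 3, 5]
baseline_mae = mean_absolute_error(baseline_predictions, actual)
```

A lower MAE means the predicted ratings sit closer to the ratings users actually gave.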

New Content Recommendation Service of Digital Literature

Liu Jian, Bi Qiang, Liu Qingxu, Wang Fu

New Technology of Library and Information Service. 2016, 32(9): 70-77. https://doi.org/10.11925/infotech.1003-3513.2016.09.09

[Objective] This paper tries to improve the traditional content recommendation service for digital literature, which cannot fully exploit the semantic information of the literature. [Methods] First, we introduced ontology reasoning rules to the recommendation system, and then semantically extended the user’s query. Second, we calculated the similarity of the literature to rank it. Finally, we recommended the top-ranked literature to the users. [Results] The proposed algorithm can calculate the semantic similarity among documents and successfully recommend documents to the users. [Limitations] We only examined the new method with relatively small data sets. [Conclusions] The proposed algorithm could effectively exploit the semantic information of target literature and offer a new way to recommend digital resources to the users.

Detecting Disease Associations with Word2Vec from Consumer Health Information

Luo Wenxin, Chen Chong, Deng Siyi

New Technology of Library and Information Service. 2016, 32(9): 78-87. https://doi.org/10.11925/infotech.1003-3513.2016.09.10

[Objective] Laypeople usually do not know the complex associations among diseases, which negatively affects their health information seeking experience. This study tries to detect the associations among diseases in popular medical information with the help of deep learning technology (Word2Vec), aiming to improve personalized information services. [Methods] First, we identified 30 common disease topics with the help of medical professionals, and then collected related reports from Medical News Today. Second, we built a word vector for each document with Word2Vec to calculate the semantic similarities among them. Finally, we compared the machine training results with experts’ scores to evaluate the performance of the proposed method. We also investigated the impacts of different models, optimization methods, data sizes and important parameters on the results. [Results] The correlation coefficient between the Word2Vec results and the experts’ scores reached 0.635 under optimal conditions. We found that the Skip-Gram model with fewer than 20 negative samples on a large-scale dataset yielded the best results. [Limitations] The precision of the Word2Vec judgment was affected by the number of disease topics. The granularity of the disease topics needs to be improved. [Conclusions] Word2Vec could be used to identify disease associations from consumer health information sources. It could also be used to improve personalized health information services.
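
The evaluation step described above can be sketched as follows: compute the cosine similarity between document vectors, then correlate the machine similarities with expert scores. The choice of Pearson correlation is an assumption, since the abstract only says "correlation coefficient", and the vectors and scores below are hypothetical toy values rather than Word2Vec output.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    std_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    std_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (std_x * std_y)


# Hypothetical document vectors for three disease topics.
vectors = {
    "topic_a": [0.9, 0.1, 0.0],
    "topic_b": [0.8, 0.2, 0.1],
    "topic_c": [0.0, 0.1, 0.9],
}
pairs = [("topic_a", "topic_b"), ("topic_a", "topic_c"), ("topic_b", "topic_c")]
machine_scores = [cosine(vectors[p], vectors[q]) for p, q in pairs]

# Hypothetical expert relatedness scores for the same pairs.
expert_scores = [4.5, 1.0, 1.5]
agreement = pearson(machine_scores, expert_scores)
```

A coefficient near 1 would mean the vector-based similarities rank disease pairs much as the experts do.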

Using Bidirectional Pattern Matching Model to Pre-Process Yearbook Data

Shi Liting, Zhang Qian, Zhong Yongheng, Hu Sisi, Li Zhenzhen

New Technology of Library and Information Service. 2016, 32(9): 88-94. https://doi.org/10.11925/infotech.1003-3513.2016.09.11

[Objective] This study aims to store yearbook records as structured data that can be updated regularly. [Context] The yearbook data pre-processing system is a C/S tool platform for collecting, auditing and uploading data. It was developed with VC++ and generates contents for the yearbook database. [Methods] We first modified the classic WM algorithm to build a new bidirectional pattern matching model. With the help of word segmentation technology, the new model could extract the metadata of the original records. Then, we reduced the number of pattern sets with a data storing procedure and matched the records bidirectionally to ensure the effectiveness and efficiency of the system. [Results] The proposed algorithm achieved a high matching rate and accuracy. [Conclusions] The bidirectional matching algorithm can meet the needs of yearbook data entry and improve the efficiency of the data pre-processing system.

Building Library APP with Mobile Platform on Campus: Case Study of Whistle Platform

Sun Rong

New Technology of Library and Information Service. 2016, 32(9): 95-101. https://doi.org/10.11925/infotech.1003-3513.2016.09.12

[Objective] This study develops an APP for the academic library based on the mobile campus platform, which helps patrons obtain library information and services conveniently and substantially expands those services. [Context] With the progress of smart-campus technologies, higher education institutions have started to build their own mobile campus platforms. However, few libraries have used these platforms to serve their readers. [Methods] With the help of the Whistle platform, the proposed APP uses the open interfaces and interface extensions of the existing library system to provide library services to mobile patrons. [Results] The proposed APP established a mobile library for the users. It could authenticate library patrons, search and check out library collections, build personalized libraries, post announcements and send out information. [Conclusions] Developing library APPs based on a mobile campus platform (e.g. Whistle) mirrors the latest developments in higher education institutions. Academic libraries could become leading players in this field.

25 September 2016, Volume 32 Issue 9
