Data Analysis and Knowledge Discovery

Router Service Engine iSwitch for Open Access Articles: Articles Reception and Resolving

Shi Hongbo, Qian Li, Zhang Xiaolin, Liang Na

New Technology of Library and Information Service. 2015, 31(6): 1-6. https://doi.org/10.11925/infotech.1003-3513.2015.06.01

[Objective] This paper presents the implementation of the reception and resolving components of the iSwitch system. [Methods] Based on an investigation and analysis of the relevant technologies, standards, and key problems, this paper designs and implements the reception and resolving components of iSwitch. [Results] The reception and resolving components of iSwitch are implemented and tested on Web of Science articles. [Limitations] The problems and difficulties that may be encountered in real service are not sufficiently considered. [Conclusions] Resolving articles' affiliations is a common problem in library and information studies, and the solution in this paper has reference value for the design of similar systems.

Router Service Engine iSwitch for Open Access Articles: Pushing and Routing

Qian Li, Shi Hongbo, Zhang Xiaolin, Liang Na

New Technology of Library and Information Service. 2015, 31(6): 7-12. https://doi.org/10.11925/infotech.1003-3513.2015.06.02

[Objective] To route Open Access articles that have been received and parsed successfully to the institutional repositories of their authors and funding bodies. [Methods] The technology framework of iSwitch is analyzed, the service architecture and function interfaces for article routing are designed, and article routing is implemented with a task agent and FTP. [Results] The 34 332 articles from Web of Science were routed successfully by iSwitch. [Limitations] Article routing is currently based on only one data source, and problems that may arise in future service are not sufficiently considered. [Conclusions] The experimental results show that the workflow mechanism of pushing and routing is correct and that its efficiency can meet future service demand.

Research on Collaborative Filtering Personalized Recommendation Method Based on User Classification

Zhu Ting, Qin Chunxiu, Li Zuhai

New Technology of Library and Information Service. 2015, 31(6): 13-19. https://doi.org/10.11925/infotech.1003-3513.2015.06.03

[Objective] To solve the problem of the low efficiency of collaborative filtering algorithms as the number of users increases. [Methods] This paper proposes a collaborative filtering method based on user classification. First, the large user population is classified into several groups according to a rule-based classification method. Then, while recommendation accuracy is guaranteed, local neighbor users are discovered for each user. Finally, personalized recommendation is conducted based on the discovered local neighbors. [Results] User classification and recommendation accuracy are evaluated by F₁ and MAE, respectively. Algorithm efficiency is evaluated in terms of time complexity. Experimental results show that, with rule-based user classification, the efficiency of the collaborative filtering algorithm improves significantly while user classification accuracy and recommendation accuracy are maintained. [Limitations] Recommendation accuracy is slightly reduced. The proposed method is tested only on the MovieLens data set and needs further validation on other data sets. [Conclusions] This method reduces the computation needed to identify local neighbor users and thereby improves the efficiency of the algorithm.
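
The three steps in this abstract (classify users, search for neighbors only inside a user's own group, then predict) can be illustrated with a minimal sketch. It is not the authors' implementation: the grouping rule, the similarity measure (inverse mean absolute rating difference), and all function names are assumptions chosen for brevity.

```python
from collections import defaultdict


def group_users(profiles, rule):
    """Partition user ids by a rule function (the rule itself is hypothetical)."""
    groups = defaultdict(list)
    for user, profile in profiles.items():
        groups[rule(profile)].append(user)
    return groups


def similarity(a, b):
    """Assumed similarity: 1 / (1 + mean absolute difference on co-rated items)."""
    common = set(a) & set(b)
    if not common:
        return 0.0
    diff = sum(abs(a[i] - b[i]) for i in common) / len(common)
    return 1.0 / (1.0 + diff)


def predict(user, item, ratings, groups, group_of, k=2):
    """Predict a rating from the k most similar users in the user's own group."""
    candidates = [
        (similarity(ratings[user], ratings[v]), v)
        for v in groups[group_of[user]]
        if v != user and item in ratings[v]
    ]
    top = sorted(candidates, reverse=True)[:k]
    total = sum(s for s, _ in top)
    if total == 0:
        return None  # no usable neighbor inside the group
    return sum(s * ratings[v][item] for s, v in top) / total


def mae(predicted, actual):
    """Mean absolute error, the accuracy metric named in the abstract."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)
```

Restricting the neighbor search to one group is what cuts the cost: each user is compared with the members of a single group rather than with every other user.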

A Hybrid Recommendation Method Combining Collaborative Filtering and Content Filtering

Gao Huming, Zhao Fengyue

New Technology of Library and Information Service. 2015, 31(6): 20-26. https://doi.org/10.11925/infotech.1003-3513.2015.06.04

[Objective] This paper explores a new method that combines two basic recommendation algorithms to improve the accuracy of personalized recommendation. [Methods] Trusted neighbors are obtained by proposing a method for calculating item heat (popularity) to optimize the Pearson correlation coefficient algorithm and by establishing an interest model for the current user and its neighbors. [Results] Experiments on the MovieLens 1M movie rating data set show that the proposed hybrid recommendation method achieves better recommendation accuracy than two existing hybrid recommendation methods. [Limitations] The distinguishing characteristics of the items must be selected by people, who may hold different opinions on the number of characteristics and their weight distribution in the interest model. [Conclusions] The hybrid recommendation method proposed in this paper improves the accuracy of personalized recommendation.
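
For orientation, the standard Pearson correlation coefficient between two users over their co-rated items can be sketched as follows. This is the unmodified baseline only. The heat-based optimization described in the abstract is not specified here and is not reproduced, and the function name and the zero fallbacks are assumptions.

```python
from math import sqrt


def pearson(ratings_a, ratings_b):
    """Pearson correlation over items rated by both users.

    Returns 0.0 when fewer than two items are shared or when either
    user's ratings have zero variance (an assumed convention).
    """
    common = sorted(set(ratings_a) & set(ratings_b))
    if len(common) < 2:
        return 0.0
    xs = [ratings_a[i] for i in common]
    ys = [ratings_b[i] for i in common]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)
```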

A Hybrid Collaborative Filtering Recommender Based on Item Rating Prediction

Ying Yan, Cao Yan, Mu Xiangwei

New Technology of Library and Information Service. 2015, 31(6): 27-32. https://doi.org/10.11925/infotech.1003-3513.2015.06.05

[Objective] To improve the traditional collaborative filtering recommendation algorithm so as to alleviate the data sparseness problem and thus enhance prediction precision. [Methods] This paper proposes a hybrid collaborative filtering recommender framework and the KSUBCF algorithm, which integrates K-means clustering and the Slope One algorithm. First, the algorithm uses Slope One based on K-means clustering to predict default item ratings. Then, it makes recommendations with the user-based collaborative filtering algorithm. [Results] The experimental results show that, as the number of neighbors increases, this algorithm outperforms the original Slope One algorithm: the MAE is reduced by 8.8% to 21% and the RMSE by 17% to 28.1%. [Limitations] The algorithm still relies on the user-item rating matrix. [Conclusions] Compared with other traditional collaborative filtering algorithms, the MAE decreases by 10% and 43.8%, respectively, and the RMSE by 20.1% and 37.4%, respectively. The proposed method can improve prediction precision.
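
The Slope One building block mentioned in this abstract can be sketched in its common weighted form. This shows plain weighted Slope One only, not KSUBCF: the K-means clustering step and the user-based filtering stage are omitted, and the function names are assumptions.

```python
from collections import defaultdict


def build_deviations(ratings):
    """Average rating difference dev[j][i] = mean(r_j - r_i) over users who
    rated both items, plus the number of such users in count[j][i]."""
    diff_sum = defaultdict(lambda: defaultdict(float))
    count = defaultdict(lambda: defaultdict(int))
    for user_ratings in ratings.values():
        for j, rj in user_ratings.items():
            for i, ri in user_ratings.items():
                if i != j:
                    diff_sum[j][i] += rj - ri
                    count[j][i] += 1
    dev = {
        j: {i: diff_sum[j][i] / count[j][i] for i in diff_sum[j]}
        for j in diff_sum
    }
    return dev, count


def predict(user_ratings, target, dev, count):
    """Weighted Slope One prediction of `target` for one user."""
    num, den = 0.0, 0
    for i, ri in user_ratings.items():
        if i == target:
            continue
        c = count.get(target, {}).get(i, 0)
        if c:
            num += (dev[target][i] + ri) * c
            den += c
    return num / den if den else None


ratings = {
    "john": {"A": 5, "B": 3, "C": 2},
    "mark": {"A": 3, "B": 4},
    "lucy": {"B": 2, "C": 5},
}
dev, count = build_deviations(ratings)
lucy_a = predict(ratings["lucy"], "A", dev, count)  # 13 / 3, about 4.33
```

In a pipeline like the one described, predictions of this kind would fill in missing entries of the rating matrix before neighbors are computed, which is how the sparseness problem is eased.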

Ontology Matching for Linked Data Set

Gao Jinsong, Cheng Ya, Liang Yanqi

New Technology of Library and Information Service. 2015, 31(6): 33-40. https://doi.org/10.11925/infotech.1003-3513.2015.06.06

[Objective] This paper analyzes the characteristics of linked data sets to improve the traditional ontology matching method. [Methods] Ontology matching methods are combined into matching rules from three aspects: data transformation method, name similarity, and similarity of descriptive information. A genetic algorithm is then used to extract the best matching rules, and Jena is used for testing. [Results] An ontology matching framework for linked data sets is constructed, and interconnection between the ontologies of linked data sets is realized. [Limitations] The matching process mainly addresses heterogeneous ontologies and fails to match ontologies across different fields and languages. [Conclusions] The method can realize the correlation of linked data sets and improve the links among them.

Research on Image Semantic Mapping with Multiple-Reservoirs Echo State Network

Wang Huaqiu, Wang Bin, Nie Zhen

New Technology of Library and Information Service. 2015, 31(6): 41-48. https://doi.org/10.11925/infotech.1003-3513.2015.06.07

[Objective] A mapping between low-level visual features and high-level semantic information is built to bridge the “semantic gap” in image retrieval and improve accuracy. [Methods] Drawing on the idea of ensemble learning, a Multiple-Reservoirs Echo State Network (MESN) is applied to the semantic mapping model. The low-level visual features of images are divided by feature type and trained by different reservoirs, and the training results are then combined linearly. [Results] Compared with the BP neural network and the traditional Echo State Network, the average error rate of MESN decreases by 31.64% and 19.28%, respectively, and the precision rate increases by 4.56% and 1.86%, respectively. [Limitations] The reservoir parameters are set manually, and no parameter optimization algorithm is constructed. [Conclusions] Experimental results show that the semantic mapping model based on the Multiple-Reservoirs Echo State Network is effective.

Named Entity Recognition from Search Log

Ren Yuwei, Lv Xueqiang, Li Zhuo, Xu Liping

New Technology of Library and Information Service. 2015, 31(6): 49-56. https://doi.org/10.11925/infotech.1003-3513.2015.06.08

[Objective] Recognizing named entities in search logs is of great value for enhancing the quality of search services. [Methods] Candidate named entities are extracted using seed named entities and template matching. After the candidates are clustered, recognition features are extracted for each candidate, including its frequency, the number of distinct templates, and template weight. These features are fused into a formula for the named entity recognition weight, and the feature influence parameters are adjusted. [Results] After the extracted named entities are labeled and counted, the average P@500 reaches 75%, which is 7% higher than that of the Paşca method. [Limitations] Named entities that are only weakly sensitive to the templates cannot be extracted correctly. [Conclusions] The P@N values calculated for the extracted results show the effectiveness of this method.
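
The P@N metric used in this evaluation is the share of correct items among the top N ranked candidates. A minimal sketch, with an assumed function name and toy data that are not taken from the paper:

```python
def precision_at_n(ranked, correct, n):
    """Fraction of the top-n ranked candidates that are labeled correct.

    Divides by n even when fewer than n candidates exist, which is the
    usual convention for P@N.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    hits = sum(1 for entity in ranked[:n] if entity in correct)
    return hits / n


# Hypothetical ranked output and manual labels, for illustration only.
ranked_candidates = ["beijing", "iphone", "how to", "harry potter"]
labeled_correct = {"beijing", "iphone", "harry potter"}
score = precision_at_n(ranked_candidates, labeled_correct, 4)  # 3 of 4 correct
```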

Research on Rule-based Normalization of Institution Name

Yang Bo, Yang Junwei, Yan Sulan

New Technology of Library and Information Service. 2015, 31(6): 57-63. https://doi.org/10.11925/infotech.1003-3513.2015.06.09

[Objective] To improve data reliability in large-scale academic assessment and the performance of word-similarity-based or frequency-based techniques in institution name normalization. [Methods] A new rule-based algorithm aided by low-value word similarity is proposed, and a series of rules and statistical methods are applied jointly to map multiple institution names onto one institution entity, so that institution names are normalized. [Results] The experimental results show that the F-value of the rule-based algorithm (55.50%) is higher than those of the other two strategies. [Limitations] The ability to identify institution names with low word similarity is not good enough. [Conclusions] The proposed rule-based algorithm performs better overall than the other two techniques, although its recall needs to be improved.

A Noise Cleaning Method for Synonym Extraction Results

Liu Wei, Wang Xing, Song Peiyan

New Technology of Library and Information Service. 2015, 31(6): 64-70. https://doi.org/10.11925/infotech.1003-3513.2015.06.10

[Objective] Synonym extraction results contain a lot of noise, which reduces their usability. [Methods] This paper proposes a noise cleaning solution based on a synonym graph. The proposed method first transforms synonym extraction results into an undirected synonym graph and then detects the noise in the graph. The method is improved by incorporating distributional similarity. [Results] Terms randomly selected from the technical field are used in the experiments, which show that this method can remove noise from synonym extraction results to some extent. [Limitations] Only part of the noise is cleaned, so the accuracy of noise detection needs to be increased by improving the method. [Conclusions] This is a feasible approach to cleaning the noise in synonym extraction results and is worth further study.

Feature Recognition of Niche Expert: Empirical Analysis Based on MetaFilter Dataset

Li Gang, Ye Guanghui, Zhang Yan

New Technology of Library and Information Service. 2015, 31(6): 71-77. https://doi.org/10.11925/infotech.1003-3513.2015.06.11

[Objective] To make full use of expert resources, this paper explores a method for recognizing the features of niche experts. [Methods] First, user activity data from MetaFilter, a well-known community weblog, are used to construct a user interaction network. Second, network structure indexes of the nodes, such as betweenness centrality and clustering coefficient, are computed. Finally, the features and roles of nodes in different periods are distinguished by combining cluster analysis and time series analysis. [Results] The set of niche experts is obtained through comparative analysis of the network statistics of different clusters, and the classification of niche experts is further refined based on temporal changes in the set. [Limitations] Role identification and migration analysis should be expanded beyond the music section to more sections, so that the “stability-change” feature of niche experts in different semantic contexts can be further discussed. [Conclusions] Niche experts are an effective supplement to existing expert collections. The proposed method can be applied to tasks such as building expert teams and recommending and retrieving experts.
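
One of the node-level indexes named in this abstract, the local clustering coefficient, can be computed directly from an undirected interaction network. The sketch below assumes the network is given as a list of user pairs. The edge list and function names are illustrative, and betweenness centrality is left out because it needs a longer shortest-path routine.

```python
from itertools import combinations


def build_graph(edges):
    """Build an undirected adjacency map from (user, user) interaction pairs."""
    adj = {}
    for u, v in edges:
        if u == v:
            continue  # ignore self-interactions
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def clustering_coefficient(adj, node):
    """Share of a node's neighbor pairs that are themselves connected."""
    neighbors = adj.get(node, set())
    k = len(neighbors)
    if k < 2:
        return 0.0
    links = sum(1 for a, b in combinations(neighbors, 2) if b in adj[a])
    return 2.0 * links / (k * (k - 1))


# Hypothetical interactions: a, b and c reply to each other; d only talks to a.
graph = build_graph([("a", "b"), ("a", "c"), ("b", "c"), ("a", "d")])
coefficient_a = clustering_coefficient(graph, "a")  # 1 linked pair out of 3
```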

Detect of Internet Fake Public Opinion Based on Decision Tree

Zhao Jingxian

New Technology of Library and Information Service. 2015, 31(6): 78-84. https://doi.org/10.11925/infotech.1003-3513.2015.06.12

[Objective] This paper proposes a method to detect fake public opinion on the Internet based on a combined optimization decision tree. [Methods] Three types of fake public opinion, namely A, B and C, are defined based on an analysis of their characteristics. An evaluation index is constructed, and the decision tree is built through discretization and attribute selection based on normalized input-output correlation values. [Results] Tests in Matlab show that the model based on the combined optimization decision tree has higher prediction accuracy. [Limitations] The model and data focus on network media. The rise of mobile social software may change the features of fake public opinion, which would require further improvement of the method. [Conclusions] The paper proposes a new method for intelligent multi-class classification of fake public opinion.

Simulation Research on WeChat Information Diffusion Based on Intelligent Multi-agent Networks

Wang Xiaoli

New Technology of Library and Information Service. 2015, 31(6): 85-92. https://doi.org/10.11925/infotech.1003-3513.2015.06.13

[Objective] Based on an analysis of the characteristics that distinguish WeChat from other social media platforms, this paper studies the mechanism of WeChat information diffusion by simulation. [Methods] A complex network is established by analyzing the rules of information interaction, an agent model is built by identifying the relevant variables, and three evolution rules based on these variables are presented for information interaction between agents. [Results] The simulation results coincide with the macroscopic features of WeChat information diffusion, and the proposed primary variables offer insight into both controlling and using WeChat information diffusion. [Limitations] Not all variables that affect information diffusion can be introduced, and the complex network differs from the real-world social network owing to a lack of data. [Conclusions] This study reveals the mechanism of WeChat information diffusion and contributes to controlling and making better use of WeChat.

“State-Behavior” Modeling and Its Application in Analyzing Product Information Seeking Behavior of E-commerce Websites Users

Yuan Xingfu, Zhang Pengyi, Wang Jun

New Technology of Library and Information Service. 2015, 31(6): 93-100. https://doi.org/10.11925/infotech.1003-3513.2015.06.14

[Objective] This research aims to develop an approach to model and describe user information behaviors during information seeking, product comparison, and decision-making more systematically and precisely. [Methods] This paper proposes a user “state-behavior” model that includes sequential, temporal, and content features. The test data set consists of click-through log data of 4 710 users from taobao.com. User behavior sequences are established by mapping page types to user behaviors and are then used as features to model users' “state-behavior” at the session level. [Results] Classification using the “state-behavior” model yielded 8 user groups with significant features: swift searchers, serendipitous browsers, promotion-driven users, personal information maintainers, weekday-active users, weekend-active users, night-active users, and irregular users. [Limitations] Adding a session layer between logs and user behavior may cause classification errors at the session level to accumulate at the behavior level. [Conclusions] The results show that this model is able to capture behavior sequences more precisely. The classification of users may be used to guide personalized recommendation and marketing plans for e-commerce websites.

25 June 2015, Volume 31 Issue 6
