Data Analysis and Knowledge Discovery

Factors Influencing Mobile E-commerce Consumers’ Preferences: An Empirical Study

Zhu Peng,Zhao Xiaoxiao,Wu Wei

Data Analysis and Knowledge Discovery. 2017, 1(3): 1-9. https://doi.org/10.11925/infotech.2096-3467.2017.03.01

[Objective] This paper explores the impacts of motivation, product type, and marketing strategy, as well as their interactions, on the shopping preferences of mobile e-commerce consumers. [Methods] We used a scenario-based questionnaire to collect the data. [Results] We found that the interaction between product type and marketing strategy had a significant effect on mobile e-commerce consumers’ purchase preferences. [Limitations] We did not include other influencing factors such as product involvement, individual cognitive demand and perceived risks in this study. [Conclusions] This paper provides advice to mobile e-commerce product vendors from the perspectives of consumers, products and marketing strategies.

Recommending Potential R&D Partners Based on Patents

Zhai Dongsheng,Guo Cheng,Zhang Jie,Xia Jun

Data Analysis and Knowledge Discovery. 2017, 1(3): 10-20. https://doi.org/10.11925/infotech.2096-3467.2017.03.02

[Objective] This study presents a patent-based recommendation method to accurately identify potential R&D partners. [Methods] First, we extracted the functions, scientific impacts and functional effects of the related patents based on the TRIZ theory. Second, we constructed a patent technology tree, which was mapped to key information from the enterprise's needs. Finally, we identified and evaluated the potential R&D partners based on the patentees. [Results] We successfully assessed the retrieved R&D partners with the proposed method, using patents related to water heaters. [Limitations] The accuracy of semantic feature extraction needs to be improved. [Conclusions] The proposed method could effectively find and evaluate potential R&D partners for enterprises.

Visualization of Coalition Data Based on Multi View Cooperation

Shen Xuefeng,Ke Yongzhen,Yao Nan

Data Analysis and Knowledge Discovery. 2017, 1(3): 21-28. https://doi.org/10.11925/infotech.2096-3467.2017.03.03

[Objective] This paper proposes a data visualization model to retrieve, analyze and present historical records from a data coalition, aiming to improve knowledge discovery. [Methods] We constructed a model for the visual data analysis system, and then used a big data platform to examine its feasibility. [Results] The proposed system could analyze massive historical data and support decision making. [Limitations] The current views of the visual analysis results could be improved by adding more chart templates. [Conclusions] The proposed system could analyze historical data from the library alliance and provide valuable information for decision makers.

Extracting and Visualizing Knowledge Graph Schema from Linked Data with Cytoscape Platform

Jiang Ying,Zhang Jing,Zhu Lingxuan

Data Analysis and Knowledge Discovery. 2017, 1(3): 29-37. https://doi.org/10.11925/infotech.2096-3467.2017.03.04

[Objective] This paper proposes a new method to generate knowledge graph schema, aiming to help users understand the data structure before submitting a query and to improve the performance of linked data retrieval. [Methods] First, we searched the knowledge relations of the linked data through SPARQL. Second, we constructed knowledge graph schema triples for each identified relation. Finally, we extracted graph schema triples from every knowledge class and merged them with those of the relations. [Results] A Cytoscape plugin was developed based on the proposed method to visualize the knowledge graph schema. [Limitations] Our method could not extract knowledge from complex classifications, such as anonymous nodes. [Conclusions] The proposed method was examined with biomedical data for single, inclusive, and bridge extractions. It could retrieve information effectively and does not need additional crawling or indexing effort.
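The idea of deriving schema triples from instance-level relations can be illustrated with a minimal in-memory sketch. It is not the authors' Cytoscape plugin and does not issue SPARQL queries: the function `extract_schema`, the `rdf:type` string constant and the `ex:` toy data are all hypothetical, and the sketch assumes that every resource of interest carries an explicit type statement.

```python
from collections import defaultdict

RDF_TYPE = "rdf:type"  # assumed predicate name for class membership


def extract_schema(triples):
    """Derive (subject class, property, object class) schema triples."""
    # Map every typed resource to the set of classes it belongs to.
    classes = defaultdict(set)
    for s, p, o in triples:
        if p == RDF_TYPE:
            classes[s].add(o)

    schema = set()
    for s, p, o in triples:
        if p == RDF_TYPE:
            continue
        # Only relations between two typed resources yield a schema triple.
        for s_cls in classes.get(s, ()):
            for o_cls in classes.get(o, ()):
                schema.add((s_cls, p, o_cls))
    return schema


# Hypothetical toy data, not taken from the paper.
triples = [
    ("ex:aspirin", RDF_TYPE, "ex:Drug"),
    ("ex:cox1", RDF_TYPE, "ex:Protein"),
    ("ex:aspirin", "ex:targets", "ex:cox1"),
]
schema = extract_schema(triples)
```

Untyped resources, such as the anonymous nodes mentioned under Limitations, produce no schema triple in this sketch.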

Personalized Recommendation Algorithm Based on Modified Tensor Decomposition Model

Chen Meimei,Xue Kangjie

Data Analysis and Knowledge Discovery. 2017, 1(3): 38-45. https://doi.org/10.11925/infotech.2096-3467.2017.03.05

[Objective] This paper tries to improve the prediction accuracy of a personalized recommendation algorithm based on the tensor decomposition model. [Methods] First, we proposed a new tensor model using a spectral clustering technique based on combined tag co-occurrence. Second, we established a penalty scheme on the co-occurrence of popular tags and resources with the help of the IDF component of TF-IDF. Finally, we re-defined the initial tensor on the triplets of user, tag cluster, and resource. [Results] We examined the proposed model with a dataset from Last.fm and found that its precision, recall and F1 measure outperformed those of other algorithms. The F1 measures were increased by 5.91% and 1.29% thanks to the two proposed modifications based on clustering and IDF. [Limitations] The proposed algorithm should be further evaluated with datasets from Weibo, Delicious, and other sources. [Conclusions] The new algorithm based on the advanced tensor decomposition model could significantly improve the accuracy of resource recommendation and satisfy the information needs of social network system users.
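As a rough illustration of how an IDF-style weight penalizes popular tags, the following sketch computes one weight per tag from (user, tag, resource) triples. The abstract does not give the exact penalty formula, so the standard definition log(N / df) is assumed here, with N the number of resources and df the number of distinct resources carrying the tag; the function name `tag_idf` and the sample annotations are hypothetical.

```python
import math
from collections import Counter


def tag_idf(annotations):
    """Return an IDF weight per tag from (user, tag, resource) triples."""
    resources = {r for _, _, r in annotations}
    # Document frequency: number of distinct resources carrying each tag.
    df = Counter()
    for tag, _ in {(t, r) for _, t, r in annotations}:
        df[tag] += 1
    n = len(resources)
    # Assumed standard IDF; tags attached to every resource get weight 0.
    return {tag: math.log(n / count) for tag, count in df.items()}


# Hypothetical annotations, not taken from the Last.fm dataset.
annotations = [
    ("u1", "rock", "r1"),
    ("u2", "rock", "r2"),
    ("u1", "rock", "r3"),
    ("u3", "shoegaze", "r3"),
]
weights = tag_idf(annotations)
```

In this toy example the ubiquitous tag receives a lower weight than the rare one, which is the intended effect of such a penalty on popular tags.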

Sentiment Analysis of Trending Topics Based on Relevance

He Yue,Xiao Min,Zhang Yue

Data Analysis and Knowledge Discovery. 2017, 1(3): 46-53. https://doi.org/10.11925/infotech.2096-3467.2017.03.06

[Objective] This paper aims to analyze the sentiment of trending topics effectively with machine learning techniques. [Methods] First, we proposed a new classification model based on trending topic relevance to extract subjective microblog posts. Second, we analyzed sentiment tendency with an improved machine learning method. [Results] We found that the modified model improved the subjective-objective classification of trending topics. The F-measures were increased by 7.4% and 2.2% respectively. [Limitations] More research is needed to study the distribution of data, the granularity of emotion and the changes in sentiment trends. [Conclusions] Adding a topic relevance factor to the model could improve the performance of sentiment analysis of microblog posts and extract the tendency of key objects from the trending topics, which provides intelligence for microblog marketing.

Extracting Events of Food Safety Emergencies with Characteristics Knowledge

Wang Dongbo,Wu Yi,Ye Wenhao,Liu Ruilun

Data Analysis and Knowledge Discovery. 2017, 1(3): 54-61. https://doi.org/10.11925/infotech.2096-3467.2017.03.07

[Objective] This paper aims to extract events from a large corpus of food safety emergencies. [Methods] First, we built the food safety emergency corpus based on past events, using the data acquisition, labeling, and organization methods of information science. Then, we extracted the corresponding events with a conditional random field model and knowledge of the distribution characteristics of food safety emergencies. [Results] We examined the proposed model with a food safety emergency corpus of 15 million Chinese words, and the F-value of the model reached 91.94%. [Limitations] The feature template created in this research might not be applicable to other fields. [Conclusions] It is feasible to extract events from a food safety emergency corpus with a conditional random field model.

The Impacts of Reviews on Hotel Satisfaction: A Sentiment Analysis Method

Wu Weifang,Gao Baojun,Yang Haixia,Sun Hanlin

Data Analysis and Knowledge Discovery. 2017, 1(3): 62-71. https://doi.org/10.11925/infotech.2096-3467.2017.03.08

[Objective] This paper analyzes online hotel reviews to identify the factors influencing customer satisfaction, and then provides suggestions to hotel management. [Methods] First, we extracted features from travelers’ comments on Tripadvisor.com and reduced their dimensionality with the help of the Word2Vec technique. Second, we extracted the characteristics of each type of corresponding emotion based on sentiment analysis technology. Finally, we constructed an econometric model to analyze the correlation between hotel reviews and users’ satisfaction. [Results] We found that positive reviewers were generally satisfied with the hotel service; however, there was no linear relationship between the two factors. The more feature categories a user mentioned in the comments, the more likely he or she was to be dissatisfied. Consumers paid more attention to the staff of luxury hotels, while they cared more about the cleanliness of economy ones. Consumers’ attitudes towards luxury hotels were significantly affected by the Internet, which had a less obvious influence on attitudes towards economy ones. [Limitations] The sample was not comprehensive, and more studies are needed to analyze data from multiple cities. [Conclusions] This study lays a theoretical foundation for online word-of-mouth research from the perspective of user-generated content.

Chinese Stopwords for Text Clustering: A Comparative Study

Guan Qin,Deng Sanhong,Wang Hao

Data Analysis and Knowledge Discovery. 2017, 1(3): 72-80. https://doi.org/10.11925/infotech.2096-3467.2017.03.09

[Objective] This paper compares and analyzes the impacts of stopwords on textual data processing, aiming to improve the construction and use of stopword lists. [Methods] We obtained stopword lists from Baidu Search Engine, Harbin Institute of Technology and the Machine Learning Laboratory of Sichuan University for this study. First, we processed the texts with the stopword lists, a Chinese word segmentation technique, the TF-IDF feature evaluation function and the vector space model (VSM). Second, we clustered the texts with the K-means algorithm and calculated the P, R and F1 values. [Results] Different stopword lists had different effects on the text data processing tasks. The length of the list and the content structure of the texts directly influenced the clustering results. More importantly, two-character stopwords were the biggest factor. [Limitations] The text types and quantity were limited. More research is needed to analyze texts with different types of stopwords. [Conclusions] The stopword list has a significant impact on text clustering; thus, it is extremely important to build or choose an appropriate Chinese stopword list. However, excessively increasing the number of stopwords might not always improve the clustering results.
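The preprocessing pipeline described in the Methods (stopword filtering, TF-IDF weighting, vector space model) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the documents are assumed to be already segmented, English tokens stand in for segmented Chinese words, the stopword list is a one-word placeholder, and the K-means step is omitted. The names `tfidf_vectors` and `cosine` are hypothetical.

```python
import math
from collections import Counter


def tfidf_vectors(docs, stopwords):
    """Build sparse TF-IDF vectors (dicts) from pre-segmented documents."""
    # Drop stopwords before any weighting takes place.
    filtered = [[w for w in doc if w not in stopwords] for doc in docs]
    n = len(filtered)
    # Document frequency of each remaining term.
    df = Counter(w for doc in filtered for w in set(doc))
    vectors = []
    for doc in filtered:
        tf = Counter(doc)
        vectors.append(
            {w: (c / len(doc)) * math.log(n / df[w]) for w, c in tf.items()}
        )
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(v * b.get(w, 0.0) for w, v in a.items())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


# Hypothetical toy corpus; tokens stand in for segmented Chinese words.
docs = [
    ["the", "stock", "market", "rises"],
    ["the", "stock", "fund", "rises"],
    ["the", "team", "wins", "match"],
]
vecs = tfidf_vectors(docs, stopwords={"the"})
```

Swapping in a different stopword set changes which dimensions survive in the vectors, which is one way the choice of list could affect a downstream clustering step.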

Knowledge Search for Cloud Computing Industry Alliance: An Algorithm Based on Improved Particle Swarm Optimization

Gao Changyuan,Yu Jianping,He Xiaoyan

Data Analysis and Knowledge Discovery. 2017, 1(3): 81-89. https://doi.org/10.11925/infotech.2096-3467.2017.03.10

[Objective] This paper uses an algorithm based on improved particle swarm optimization to conduct knowledge search for a cloud computing industry alliance, aiming to improve its accuracy and efficiency. [Methods] First, we utilized the Map function of the MapReduce model to group the particles. Second, we used the Reduce function to shorten the particle search result lists and the search time. Finally, the information interaction of the particles was decided by the average of the optimal positions within each group, which avoided the premature convergence caused by using a local optimal value. [Results] We compared the performance of the improved algorithm with that of the standard ones in three rounds of simulation experiments. We found that the improved particle swarm algorithm was superior in efficiency and accuracy. [Limitations] There is some noisy data in the sample. [Conclusions] The proposed algorithm could improve the accuracy and efficiency of knowledge search for the cloud computing industry alliance.
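The grouping idea in the Methods can be sketched in a single process, without MapReduce. In this illustration each particle is attracted to the mean of the personal-best positions of its own group rather than to one global best. The objective function (`sphere`), the coefficients, the round-robin group assignment and the function name `grouped_pso` are all assumptions made for the example, not details from the paper.

```python
import random


def sphere(x):
    """Toy objective with its minimum at the origin."""
    return sum(v * v for v in x)


def grouped_pso(f, dim=2, n_particles=12, n_groups=3, iters=50, seed=0):
    rng = random.Random(seed)
    pos = [[rng.uniform(-5, 5) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [f(p) for p in pos]
    history = [min(pbest_val)]
    w, c1, c2 = 0.7, 1.5, 1.5  # assumed inertia and acceleration coefficients
    for _ in range(iters):
        # Social attractor per group: mean of the members' personal bests.
        centers = []
        for g in range(n_groups):
            members = [i for i in range(n_particles) if i % n_groups == g]
            centers.append(
                [sum(pbest[i][d] for i in members) / len(members)
                 for d in range(dim)]
            )
        for i in range(n_particles):
            center = centers[i % n_groups]
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (center[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            val = f(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
        history.append(min(pbest_val))
    best = min(range(n_particles), key=lambda i: pbest_val[i])
    return pbest[best], history


best, history = grouped_pso(sphere)
```

Here `history` records the best objective value found so far after each iteration, so it can be inspected to compare convergence behaviour across settings.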

Analyzing Website Navigation Features of Top U.S. Academic Libraries

Yin Xiangquan,Li Shuning

Data Analysis and Knowledge Discovery. 2017, 1(3): 90-95. https://doi.org/10.11925/infotech.2096-3467.2017.03.11

[Objective] This paper studies the navigation features of top academic library websites in the United States, aiming to improve the services of their Chinese counterparts. [Methods] First, we identified the library websites of the top 15 U.S. universities and downloaded their navigation texts. Second, we analyzed the similarities and differences among these texts with tag clouds and the Vector Space Model. Finally, we examined our findings against the “2016 State of America’s Libraries Report”. [Results] The proposed method was intuitive and generated analysis results quickly, and the results could be further processed with text mining techniques. [Limitations] We retrieved only the first and second levels of navigation as well as the titles of the homepages. [Conclusions] The proposed model provides useful information for academic libraries in China.

25 March 2017, Volume 1 Issue 3
