Data Analysis and Knowledge Discovery

Predicting Conversion Rate of APP Advertising with Machine Learning

Zhao Yang, Yuan Xini, Chen Yawen, Wu Liqiang

Data Analysis and Knowledge Discovery. 2018, 2(11): 2-9. https://doi.org/10.11925/infotech.2096-3467.2018.0834

[Objective] This paper tries to predict the conversion rate of APP advertisements with machine learning algorithms, aiming to improve the effectiveness of advertising and marketing activities. [Methods] First, we examined the characteristics of APP advertisements. Then, we applied four machine learning algorithms to predict their conversion rate. The proposed RF+LXFV model was built with Random Forest, Gradient Boosting Decision Tree, LightGBM, XGBoost, Vowpal Wabbit and Field-aware Factorization Machine. Finally, we evaluated the validity and accuracy of the new model with Tencent APP advertising data. [Results] The proposed model achieved higher prediction accuracy than any single algorithm. [Limitations] We did not examine the impact of advertising conversion delay on prediction. [Conclusions] The proposed RF+LXFV model could predict the conversion rate of APP advertising effectively.

Predicting Repeat Purchase Intention of New Consumers

Zhang Liyi, Li Yiran, Wen Xuan

Data Analysis and Knowledge Discovery. 2018, 2(11): 10-18. https://doi.org/10.11925/infotech.2096-3467.2018.0823

[Objective] This paper compares the prediction accuracy and efficiency of different machine learning algorithms, aiming to identify new consumers with repeat purchase intentions. It also provides a theoretical framework for customer classification. [Methods] First, we collected the server logs of a dealer on Taobao.com from 2015 to 2018, as well as its orders and consumers’ personal information. Then, we used different algorithms to train the proposed models. [Results] The SMOTE algorithm combined with the random forest algorithm obtained the highest prediction accuracy of 96%. [Limitations] The sample data size needs to be expanded. [Conclusions] The fusion algorithm based on SMOTE and random forest performs better in predicting the repurchase intentions of new consumers.
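
The abstract gives no implementation details. As a rough illustration of the oversampling idea behind SMOTE, the sketch below creates synthetic minority-class points by interpolating between a minority sample and one of its nearest minority neighbours. The function name `smote_oversample`, the parameter `k` and the toy data are assumptions made for illustration, not the authors' code; in a pipeline like the one described, the balanced data would then be passed to a random forest classifier.

```python
import math
import random


def smote_oversample(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points from a list of minority-class vectors.

    Hypothetical helper: a minimal version of the SMOTE interpolation step.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base point (excluding itself)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy minority class: four repeat buyers described by two made-up features.
repeat_buyers = [[1.0, 2.0], [1.5, 1.8], [1.2, 2.4], [0.9, 2.1]]
new_points = smote_oversample(repeat_buyers, n_new=6)
```

Every synthetic point lies on a segment between two real minority samples, so the new data stay inside the region already occupied by the minority class.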

Examining Consumer Reviews of Overseas Shopping APP with Sentiment Analysis

Zhao Yang, Li Qiqi, Chen Yuhan, Cao Wenhang

Data Analysis and Knowledge Discovery. 2018, 2(11): 19-27. https://doi.org/10.11925/infotech.2096-3467.2018.0835

[Objective] This paper analyzes the sentiment of online reviews and evaluates consumer satisfaction with overseas shopping APPs, aiming to improve their performance. [Methods] First, we collected reviews of these APPs from the APP Store. Then, we clustered the APPs’ attributes with the Canopy and K-means algorithms to define the evaluation dimensions of consumer satisfaction. Finally, we computed consumer satisfaction scores with the CNN-SVM sentiment analysis model. [Results] The most important factor affecting consumer satisfaction with overseas shopping APPs was commodities, followed by price, interaction, service, and logistics. Consumer satisfaction with vertical overseas shopping APPs was higher than with overseas buyer APPs and comprehensive overseas shopping APPs. Consumer satisfaction with logistics and services was relatively low. [Limitations] More overseas shopping APPs should be included in future research. [Conclusions] Sentiment analysis of online reviews is an effective way to assess consumer satisfaction with overseas shopping APPs.

Recommendation Algorithm for Post-Context Filtering Based on TF-IDF: Case Study of Catering O2O

Yin Cong, Zhang Liyi

Data Analysis and Knowledge Discovery. 2018, 2(11): 28-36. https://doi.org/10.11925/infotech.2096-3467.2018.0832

[Objective] This paper carries out an in-depth study on context-integrated and personalized recommendation, aiming to address the issue of information overload. [Methods] We proposed a new contextual preference prediction model for post-context filtering, based on the TF-IDF algorithm as well as the contextual association probability and universal importance. Then, we adjusted the initial scores of traditional recommendation with the help of item category preferences to generate the final list. [Results] We conducted an empirical study on the catering industry and found that the proposed algorithm yielded better results. [Limitations] The accuracy of the context association needs to be improved. [Conclusions] Context information plays an important role in user behavior and decision making. More research is needed to improve personalized recommendation based on context modeling.

Evaluating and Optimizing Supply Chains with LMBP Algorithm

Meng Hu, Liang Xiaobei, Yang Yixiong, Li Min

Data Analysis and Knowledge Discovery. 2018, 2(11): 37-45. https://doi.org/10.11925/infotech.2096-3467.2018.0833

[Objective] This paper uses the LMBP algorithm for feedback neural networks to evaluate and optimize supply chains, aiming to improve the decision-making of enterprises. [Methods] First, we built an evaluation model for supply chains. Then, we generated 21 indicators for corporate performance based on this model. Third, we used MATLAB to evaluate this algorithm. [Results] The proposed method helped enterprises obtain the results of performance analysis in time, and then improved the management of procurement, inventory, and sales. It reduced the operation costs of enterprises and improved the decision-making process. [Limitations] The new method should be examined with more cases. [Conclusions] The proposed method could improve the performance of supply chains.

Impacts of Landlords on Tenants of Short-term Rentals

Liang Xiaobei, Xu Zhen, Li Jingjing

Data Analysis and Knowledge Discovery. 2018, 2(11): 46-53. https://doi.org/10.11925/infotech.2096-3467.2018.0836

[Objective] This study explores the influences of room owners on their tenants’ Electronic Word of Mouth (eWOM). [Methods] First, we retrieved data from Airbnb with the help of a Web crawler. Then, we proposed a Poisson Regression model based on the signal theory. Finally, we studied the impacts of room owners’ service on consumers’ eWOM. [Results] The eWOM of the available rooms was positively correlated with feature introduction, after-sales interaction, instant reservation, calendar update, response time, high-quality service and identity certification. [Limitations] More samples from regions outside of Beijing should be included. [Conclusions] The proposed model could improve the service of short-term rentals.

Detecting Relationship Among WeChat Group Members with Co-occurrence of Cooperation

Li Gang, Wang Xiao, Guo Yang

Data Analysis and Knowledge Discovery. 2018, 2(11): 54-63. https://doi.org/10.11925/infotech.2096-3467.2018.0320

[Objective] This paper analyzes the implicit relationship among WeChat group members and measures its strength, which is also combined with their explicit relationship to describe the social network characteristics of WeChat groups. [Methods] First, we collected chatting records from one WeChat interest group. Then, we used co-occurrence to measure the implicit relationship and the Salton index to calculate its strength. Third, we analyzed the discussion participation to explore the implicit-relationship distribution. Finally, we compared the full-relationship network with the explicit-relationship network. [Results] We found that topic discussion clearly reflected the relationship among group members. Posting more relevant topics helps to manage and maintain membership. [Limitations] More research is needed to measure group members’ engagement. [Conclusions] The full network with implicit and explicit relationships reveals more insights on the structure of WeChat groups.
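
For readers unfamiliar with the Salton index, the sketch below shows its usual form for co-occurrence data: the number of discussions two members share, divided by the square root of the product of their individual discussion counts. The function name `salton_strengths` and the toy discussion sets are assumptions for illustration; the abstract does not say how the authors segmented chat records into discussions.

```python
import math
from collections import Counter
from itertools import combinations


def salton_strengths(discussions):
    """Return {(a, b): Salton index} for members who co-occur in discussions."""
    appearances = Counter()
    co_occurrences = Counter()
    for members in discussions:
        unique = sorted(set(members))
        appearances.update(unique)  # one appearance per discussion joined
        co_occurrences.update(combinations(unique, 2))
    return {
        pair: count / math.sqrt(appearances[pair[0]] * appearances[pair[1]])
        for pair, count in co_occurrences.items()
    }


# Toy data: each set lists the members who joined one topic discussion.
topics = [{"A", "B", "C"}, {"A", "B"}, {"B", "C"}, {"A"}]
strengths = salton_strengths(topics)
```

The index ranges from 0 to 1 and equals 1 only when two members appear in exactly the same discussions, which makes it usable as an edge weight in an implicit-relationship network.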

Analyzing Scientific Literature with Content Similarity - Topics over Time Model

He Weilin, Feng Guohe, Xie Hongling

Data Analysis and Knowledge Discovery. 2018, 2(11): 64-72. https://doi.org/10.11925/infotech.2096-3467.2018.0292

[Objective] This paper studies the topics of scientific literature and then tracks their changes. [Methods] We used the improved CSToT (Content Similarity - Topics over Time) model to analyze scholarly papers published from 2012 to 2016 in 9 information science journals in China. [Results] The CSToT model effectively revealed the subject structure of scientific literature and the evolution of topics. We also found that the majority of current information science research covers information services, online public opinion and data mining. Their evolution trends include rising, falling, stable and fluctuating patterns, which are particularly prominent in information services research. [Limitations] The training data set needs to be expanded. [Conclusions] The CSToT model could effectively identify the topics of scientific literature and their evolutionary trends, which provide new directions for future research.

Predicting Transactions Among Agents in Patent Transfer Weighted Networks for New Energy

Wu Yuying, Sun Ping, He Xijun, Jiang Guorui

Data Analysis and Knowledge Discovery. 2018, 2(11): 73-79. https://doi.org/10.11925/infotech.2096-3467.2018.0254

[Objective] This paper examines the structure of the weighted network for patent transfers as well as the characteristics of agents, aiming to predict transaction opportunities and promote the connection of technology supply and demand. [Methods] First, we constructed a weighted network for patented technology transactions based on data from 2012 to 2016. Then, we used the entropy method to combine its structure and contents. Finally, we used the BP neural network to predict transaction opportunities and weights. [Results] The proposed method, which combined the structure index RA and the content index Cosine, achieved the highest prediction accuracy. The prediction error was also reduced by using the real and structure weights of the network to predict the link weight. [Limitations] More research is needed to study the node properties and network evolution mechanism. [Conclusions] The link prediction method has a higher precision, which helps us find potential supply and demand agents for technology patent transfers.
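
The two indexes named in the results can be sketched as follows, assuming RA refers to the standard Resource Allocation link-prediction index and Cosine to ordinary cosine similarity between content vectors. The sketch uses the unweighted RA form on a toy network; the agent names, the content vectors and both function names are hypothetical, and the entropy-based combination and BP neural network steps are not shown.

```python
import math


def resource_allocation(graph, x, y):
    """RA index: sum of 1/degree(z) over the common neighbours z of x and y."""
    common = graph[x] & graph[y]
    return sum(1.0 / len(graph[z]) for z in common)


def cosine(u, v):
    """Cosine similarity between two equal-length content vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# Toy undirected transfer network: agent -> set of trading partners.
network = {
    "firm_a": {"univ_x", "firm_c"},
    "firm_b": {"univ_x", "firm_c"},
    "univ_x": {"firm_a", "firm_b", "firm_d"},
    "firm_c": {"firm_a", "firm_b"},
    "firm_d": {"univ_x"},
}
structure_score = resource_allocation(network, "firm_a", "firm_b")
content_score = cosine([1, 0, 2], [2, 0, 4])
```

RA gives less credit to common partners that trade with many agents, so a shared niche partner counts for more than a shared hub.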

Studying Knowledge Dissemination of Online Q&A Community with Social Network Analysis

Wang Zhongyi, Zhang Heming, Huang Jing, Li Chunya

Data Analysis and Knowledge Discovery. 2018, 2(11): 80-94. https://doi.org/10.11925/infotech.2096-3467.2018.0293

[Objective] This paper analyzes the social network structure and knowledge dissemination mechanism of an online Q&A community, aiming to reveal the role of network nodes and improve learning efficiency. [Methods] First, we used social network analysis and the entropy weight method to describe the opinion leader’s knowledge and influence. Then, we built a knowledge dissemination model based on the Cowan model for the Q&A community. Finally, we examined the internal knowledge learning results of the network through system simulation. [Results] Ⅰ. The nodes with less knowledge had higher learning efficiency in the target network; Ⅱ. The knowledge volumes of some nodes increased rapidly, while those of the nodes with larger knowledge stock increased slowly; Ⅲ. The knowledge dissemination rate of this network has been decreasing; Ⅳ. There is a strong correlation between knowledge increase and the index of knowledge and communication abilities. [Limitations] The dynamic random reconnection of the network was not examined in this paper. [Conclusions] This paper offers practical advice to improve users’ learning experience in the online Q&A community.
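
The entropy weight method mentioned above is commonly computed as in the sketch below: each indicator's values are turned into shares, the normalised Shannon entropy of those shares is calculated, and indicators with lower entropy (more variation across users) receive higher weight. The function name `entropy_weights` and the toy indicators are assumptions for illustration; the abstract does not list the indicators the authors used.

```python
import math


def entropy_weights(rows):
    """Weights for each indicator (column) of a non-negative score matrix.

    Assumes at least two rows, positive column sums, and at least one
    column whose values are not all equal.
    """
    n = len(rows)
    divergences = []
    for col in zip(*rows):
        total = sum(col)
        shares = [value / total for value in col]
        # Entropy scaled to [0, 1]; 0 * ln(0) is treated as 0.
        entropy = -sum(p * math.log(p) for p in shares if p > 0) / math.log(n)
        # Clamp to guard against tiny negative values from rounding.
        divergences.append(max(0.0, 1.0 - entropy))
    total_divergence = sum(divergences)
    return [d / total_divergence for d in divergences]


# Toy data: three users scored on two hypothetical indicators
# (for example, number of answers and number of followers).
scores = [[10, 5], [10, 50], [10, 500]]
weights = entropy_weights(scores)
```

In the toy data the first indicator is identical for every user, so it carries no discriminating information and nearly all the weight goes to the second.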

Choosing Stopwords for Patent Topic Analysis Based on Auxiliary Set

Yu Yan, Zhao Naixuan

Data Analysis and Knowledge Discovery. 2018, 2(11): 95-103. https://doi.org/10.11925/infotech.2096-3467.2018.0240

[Objective] This paper proposes a new method to automatically choose domain-specific stopwords, aiming to improve the performance of patent topic analysis. [Methods] First, we introduced an auxiliary set and proposed two indexes, document frequency and entropy among categories, based on this auxiliary set. Then, we measured the distribution of words from the auxiliary set to choose the domain-specific stopwords automatically. [Results] The proposed method improved the quality of identified patent topics. [Limitations] The types and members of the auxiliary set need to be further studied. [Conclusions] The proposed stopword selection method could measure the characteristics of words, which helps us find the domain-specific stopwords for patent analysis more effectively.
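
One plausible reading of the two indexes is sketched below: a word's document frequency over the whole auxiliary set, and the normalised entropy of its document counts across categories. Words that appear in most documents and are spread evenly over categories become stopword candidates. The function name `category_stats`, the thresholds and the toy corpus are assumptions for illustration; the abstract does not give the authors' exact formulas or cut-offs.

```python
import math
from collections import Counter


def category_stats(docs_by_category):
    """Return {word: (document_frequency, normalised_category_entropy)}.

    Assumes at least two categories, each a list of tokenised documents.
    """
    per_category = {
        cat: Counter(w for doc in docs for w in set(doc))
        for cat, docs in docs_by_category.items()
    }
    n_docs = sum(len(docs) for docs in docs_by_category.values())
    n_cats = len(docs_by_category)
    vocabulary = set().union(*per_category.values())
    stats = {}
    for word in vocabulary:
        counts = [per_category[cat][word] for cat in docs_by_category]
        total = sum(counts)
        entropy = -sum(
            (c / total) * math.log(c / total) for c in counts if c
        ) / math.log(n_cats)
        stats[word] = (total / n_docs, entropy)
    return stats


# Toy auxiliary set with two categories of tokenised documents.
auxiliary = {
    "battery": [["method", "lithium", "cell"], ["method", "electrode"]],
    "solar": [["method", "panel"], ["method", "silicon", "cell"]],
}
stats = category_stats(auxiliary)
# Hypothetical thresholds: frequent and evenly spread across categories.
candidates = sorted(w for w, (df, h) in stats.items() if df >= 0.75 and h >= 0.9)
```

Here "cell" is evenly spread but appears in only half of the documents, so the frequency threshold keeps it out of the candidate list, while "method" passes both tests.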

25 November 2018, Volume 2 Issue 11
