[Objective] This research aims to examine the information seeking behavior patterns and contextual factors of online shoppers’ multi-session activities. [Methods] First, we analyzed 1,409,160 logs of an online shopping Web site (generated by 4,285 users) to discover the users’ information seeking behaviors. Second, we used in-depth interviews to explore their motivations. [Results] We found that multi-session shoppers were more likely to check detailed product descriptions than to simply browse. The average interval between sessions was 3 to 4 days. Personal preferences, needs, financial ability and time might lead users to restore their previous sessions. Searching, shopping carts, bookmarks, browsing and personalized recommendation services were the major channels through which users restored previous sessions. [Limitations] Because of the limited number of participants, results from the interviews might not be generalizable to the whole population. [Conclusions] This research helps us understand complex online shopping behaviors as well as improve the services and user experience of E-commerce Web sites.
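The session-splitting step underlying this kind of log analysis can be sketched as follows; the 30-minute inactivity gap and the timestamps are illustrative assumptions, not values from the study.

```python
# Minimal sketch: split one user's log timestamps into sessions using an
# inactivity gap, then measure the interval between consecutive sessions.
# The gap threshold and log data below are hypothetical.
from datetime import datetime, timedelta

logs = [
    datetime(2016, 5, 1, 10, 0), datetime(2016, 5, 1, 10, 10),
    datetime(2016, 5, 4, 9, 0),  datetime(2016, 5, 4, 9, 5),
    datetime(2016, 5, 8, 20, 0),
]

GAP = timedelta(minutes=30)          # assumed inactivity threshold
sessions = [[logs[0]]]
for t in logs[1:]:
    if t - sessions[-1][-1] > GAP:   # long pause -> start a new session
        sessions.append([t])
    else:
        sessions[-1].append(t)

# interval = time from the end of one session to the start of the next
intervals = [s2[0] - s1[-1] for s1, s2 in zip(sessions, sessions[1:])]
avg_days = sum(i.total_seconds() for i in intervals) / 86400 / len(intervals)
```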
[Objective] This paper aims to comprehensively explore the knowledge structure and hotspot distribution of different disciplines from the perspective of subject classification, with the help of topic extraction and distribution analysis using the LDA (Latent Dirichlet Allocation) model. [Methods] We collected knowledge flow (KF) related literature in the domestic KF field from the CNKI and Wanfang databases, and then grouped these documents into 11 categories by the Chinese Library Classification. Finally, we extracted 20 hot topics from the documents in the 11 disciplines with the help of the LDA topic model. [Results] The content and knowledge points of the 11 disciplines were obtained from the analysis of the 20 extracted hot topics. [Limitations] We did not compare the proposed method with topic mining research in other fields, and the domestic KF hotspots found by our study were not compared to previous findings. [Conclusions] The proposed method can help us explore the knowledge structure and research trends more comprehensively.
[Objective] This paper aims to identify the influential users in social network systems, which could help us maximize online advertising effects. [Methods] First, we constructed basic graphs to describe the relationships among social network users from the perspective of social capital measurement. Second, we built an influence measurement model based on the newly constructed graphs. Finally, we identified the influential users by calculating the probabilities of users’ random browsing behaviors. [Results] The proposed method could identify users with strong online influence. They were more capable of affecting others in related fields than the influential users listed by Sina Weibo. [Limitations] The proposed method did not evaluate the impacts of user-generated contents in social network systems while measuring the users’ influence. [Conclusions] The proposed method could help business owners identify influential users in social network systems to improve the effectiveness of online advertisements.
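One common way to compute such random-browsing probabilities is a PageRank-style power iteration over the user graph; the follow graph and damping factor below are invented for illustration and are not necessarily the authors' exact model.

```python
# Sketch: rank users by stationary random-browsing probability.
# Adjacency matrix and damping factor are hypothetical.
import numpy as np

# A[i, j] = 1 if user i links to (follows) user j
A = np.array([
    [0, 1, 1, 0],
    [0, 0, 1, 0],
    [1, 0, 0, 0],
    [1, 1, 1, 0],
], dtype=float)

out = A.sum(axis=1, keepdims=True)
P = (A / out).T                       # column-stochastic transition matrix

d, n = 0.85, len(A)                   # damping factor, number of users
r = np.full(n, 1.0 / n)
for _ in range(100):                  # power iteration to convergence
    r = (1 - d) / n + d * P @ r

influence_rank = np.argsort(r)[::-1]  # most influential user first
```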
[Objective] This study aims to address the limitations of existing pre-release box office prediction models caused by data constraints and other factors. [Methods] We first retrieved microblog comments, and then used SVM to identify explicit consumer intention, namely strong positive comments. Second, we modified the traditional sentiment classification schemes to build a Chinese microblog sentiment dictionary based on HowNet. Finally, we defined a new user influence feature and used a BP neural network to predict box office revenue. [Results] The proposed model could forecast the opening box office more accurately. [Limitations] Due to the inadequate corpus, the sentiment dictionary may not work well for all microblog movie comments. A dynamic forecasting model was not established between the pre-release and post-release periods. [Conclusions] The proposed model can effectively predict opening box office revenue.
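The comment-filtering step (an SVM separating "strong positive" intention comments from others) can be sketched as below; the toy English comments, bag-of-words features and scikit-learn classifier are illustrative assumptions.

```python
# Sketch: train an SVM to flag comments expressing explicit viewing intention.
# Training comments and labels are invented stand-ins for microblog data.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.svm import LinearSVC

comments = [
    "can not wait must watch opening day",
    "definitely buying tickets tonight",
    "trailer looked boring to me",
    "not sure about this movie",
]
labels = [1, 1, 0, 0]   # 1 = explicit intention (strong positive)

vec = CountVectorizer()
X = vec.fit_transform(comments)
clf = LinearSVC(random_state=0).fit(X, labels)

pred_pos = clf.predict(vec.transform(["must watch buying tickets"]))[0]
pred_neg = clf.predict(vec.transform(["looked boring not sure"]))[0]
```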
[Objective] This paper aims to effectively extract multi-dimensional characteristics of online reviews and then examine the impact of text content on review quality evaluation. [Methods] First, we quantified and extracted content features based on the textual and sentimental messages of the reviews. Then, we adopted the GBDT model to evaluate the influence of feature sets on classification results, along with a greedy feature selection procedure to identify the most effective content features. Finally, we examined the influences of these features. [Results] The proposed method could improve the performance of review quality evaluation tasks, especially the recall and precision of the new system. [Limitations] Our research focused on review data from search services, and did not investigate products like movies and music. [Conclusions] The information gained from reviews and product feature words, the degree of sentimental objectiveness, and the differences among review contents all had significant effects on review quality evaluation.
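The greedy feature selection procedure paired with a GBDT classifier can be sketched as follows; the synthetic data stands in for the paper's review content features, and the forward-selection loop is a generic illustration rather than the authors' exact procedure.

```python
# Sketch: greedy forward feature selection, scoring each candidate feature set
# with cross-validated GBDT accuracy. Data is synthetic: only features 0 and 2
# actually determine the label.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))             # 5 candidate content features
y = (X[:, 0] + X[:, 2] > 0).astype(int)   # informative: features 0 and 2

selected, remaining, best = [], list(range(5)), 0.0
while remaining:
    scores = {
        f: cross_val_score(GradientBoostingClassifier(random_state=0),
                           X[:, selected + [f]], y, cv=3).mean()
        for f in remaining
    }
    f, s = max(scores.items(), key=lambda kv: kv[1])
    if s <= best:                          # stop when no feature improves CV
        break
    selected.append(f)
    remaining.remove(f)
    best = s
```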
[Objective] This paper aims to establish an automatic method to identify research area groups and outline the science map quickly. [Methods] First, we used feature words to measure topic similarity, and then divided adjacent research areas with similar or related topics into groups. Second, we designed an effectiveness evaluation index to compare different parameter combinations and select the optimal one. [Results] The proposed method could identify research area groups in science maps effectively. [Limitations] Our study was conducted with data from Mapping Science Structure 2015. More research is needed to investigate the proposed method’s compatibility with other cases. [Conclusions] The proposed method could automatically identify research area groups in the science map.
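Measuring topic similarity with feature words can be illustrated with a simple cosine-similarity check; the areas, feature words and threshold below are invented, and the grouping rule is a simplification of the paper's method.

```python
# Sketch: group research areas whose feature-word vectors exceed a cosine
# similarity threshold. Areas, word counts and THRESHOLD are hypothetical.
from collections import Counter
from math import sqrt

def cosine(a, b):
    common = set(a) & set(b)
    num = sum(a[w] * b[w] for w in common)
    den = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return num / den if den else 0.0

areas = {
    "A": Counter({"graphene": 3, "nanotube": 2}),
    "B": Counter({"graphene": 2, "nanotube": 1, "sensor": 1}),
    "C": Counter({"ontology": 3, "semantic": 2}),
}

THRESHOLD = 0.5
same_group = cosine(areas["A"], areas["B"]) > THRESHOLD   # similar topics
diff_group = cosine(areas["A"], areas["C"]) > THRESHOLD   # unrelated topics
```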
[Objective] To extract talents’ knowledge structure automatically. [Methods] We built an online knowledge structure extraction system based on Web information retrieval, webpage analysis, word segmentation and semantic Web technologies. [Results] We examined the usability of the new system. For course recognition, the overall precision rate was more than 95%. For semi-structured files, the recall rate was above 95%. For some non-structured files, the recall rate was below 90%. [Limitations] The recall rate of course recognition was restricted by the content of the dictionary. [Conclusions] The proposed method meets the requirements of constructing talents’ knowledge structure and is a useful tool for related research.
[Objective] This paper aims to identify sentiment propensity accurately with the help of a new method based on dependency parsing. [Methods] First, we extracted the sentiment stems of the sentences. Second, we defined sentiment-computing rules. Finally, we calculated the sentiment propensity of each sentence. [Results] The proposed method achieved an overall accuracy of 84.46%. The average precision rate and recall rate for the bullish class were 82.84% and 87.14% respectively, with an F-measure of 84.94%. Meanwhile, the bearish class got a precision rate of 86.28%, a recall rate of 81.74% and an F-measure of 83.95%. [Limitations] The proposed method did not consider the relevance among clauses. [Conclusions] Dependency parsing can effectively improve the accuracy of sentiment analysis of textual messages from financial forums.
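The idea of sentiment-computing rules over dependency structure can be sketched as below; the lexicons, the (head, relation, modifier) tuples and the scoring rules are illustrative inventions, not the authors' actual rules or dictionary.

```python
# Sketch: score a sentence's sentiment from dependency tuples. Negation
# modifiers flip a sentiment stem's polarity; degree modifiers scale it.
# All lexicons and rules below are hypothetical.
SENTIMENT = {"rise": 1.0, "fall": -1.0, "gain": 1.0, "loss": -1.0}
NEGATORS = {"not", "never"}
DEGREE = {"sharply": 2.0, "slightly": 0.5}

def sentence_sentiment(dependencies):
    """dependencies: list of (head, relation, modifier) tuples."""
    score = 0.0
    for head, rel, mod in dependencies:
        if head in SENTIMENT:
            s = SENTIMENT[head]
            if mod in NEGATORS:      # e.g. "not gain" -> bearish
                s = -s
            elif mod in DEGREE:      # e.g. "rise sharply" -> stronger bullish
                s *= DEGREE[mod]
            score += s
    return score                     # > 0 bullish, < 0 bearish

bullish = sentence_sentiment([("rise", "advmod", "sharply")])
bearish = sentence_sentiment([("gain", "neg", "not")])
```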
[Objective] This paper aims to improve the traditional recommendation method and the quality of the E-Learning environment by using the attributes and access orders of resources in a learning tree to predict learners’ ratings. The collaborative filtering recommendation was then carried out through similar-learner clustering. [Methods] First, “attributes of resources”, “resource access order” and “learning frequency and time” were standardized to construct users’ learning trees and then predict resource ratings. Second, learners’ similarity was calculated with the Pearson and Cosine functions respectively, based on the predicted ratings. Third, the K-means clustering method was used to group similar learners and establish a collaborative filtering system for online E-learning. [Results] Compared with traditional collaborative filtering methods, the F-measure of the proposed method was 8.22% higher than that of singular value decomposition CF and 3.75% higher than that of average score forecast CF. [Limitations] The proposed method was only tested on a dataset from one online learning platform with 52,456 students’ learning records and access logs. More research is needed to examine the method with other data sets. [Conclusions] The proposed collaborative filtering recommendation system does not rely on learners’ ratings and considers the influence of learners’ interest changes. It could help us address the cold-start and scalability issues.
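The similarity and clustering steps can be sketched on a toy predicted-rating matrix; the matrix, the Pearson correlation as similarity function and the two-cluster K-means below are illustrative assumptions.

```python
# Sketch: Pearson correlation between learners' predicted resource ratings,
# then K-means to group similar learners. The rating matrix is invented.
import numpy as np
from sklearn.cluster import KMeans

# rows = learners, columns = predicted ratings for 4 resources
ratings = np.array([
    [5, 4, 1, 1],
    [4, 5, 2, 1],
    [1, 1, 5, 4],
    [2, 1, 4, 5],
], dtype=float)

pearson = np.corrcoef(ratings)    # learner-learner similarity matrix

labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(ratings)
```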
[Objective] This study aims to identify relationships among authors of papers with similar contents but different keywords, and tries to add more semantic factors to co-occurrence analysis. [Methods] We proposed a method to gauge the similarity of research interests based on a keyword semantic network. First, all keywords were represented as word vectors and mapped into low-dimensional distributed representations with the help of a neural network language model, word2vec. Second, we calculated the semantic associations of keywords to build a semantic network. Finally, we adopted the Jensen-Shannon distance to measure the similarity of research interests. [Results] The proposed approach can accurately identify the similarities of co-occurring and non-co-occurring terms and then effectively predict potential cooperation among authors. [Limitations] The amount and accuracy of the training materials need to be increased. At present, we could only find potential cooperation between two authors. More research is needed to explore the possibilities of cooperation among multiple authors. [Conclusions] The proposed method could help improve the performance of traditional co-occurrence analysis.
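The final similarity step can be sketched with SciPy's Jensen-Shannon distance; the toy keyword distributions stand in for the outputs of the word2vec semantic-network stage, which is not reproduced here.

```python
# Sketch: compare authors' research interests as probability distributions
# over shared semantic keywords, using Jensen-Shannon distance.
# Distributions below are hypothetical.
import numpy as np
from scipy.spatial.distance import jensenshannon

author_a = np.array([0.6, 0.3, 0.1])
author_b = np.array([0.5, 0.3, 0.2])   # close to author_a's interests
author_c = np.array([0.1, 0.2, 0.7])   # different interests

# convert distance to a similarity: closer to 1 means more similar interests
sim_ab = 1 - jensenshannon(author_a, author_b)
sim_ac = 1 - jensenshannon(author_a, author_c)
```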
[Objective] This study aims to help an organization automatically download its employees’ open access papers from iSwitch, and then import these articles into its institutional repository. [Methods] We first synchronized data from iSwitch through timed task scheduling based on the FTP protocol. Second, we parsed the files and saved their metadata to the database in advance. Functions such as the import process, data management and auditing were also provided. [Results] Papers could be automatically synchronized from iSwitch and then imported into the institutional repository by the system administrator. We have successfully analyzed and imported more than 60,000 items from Web of Science. [Limitations] The accuracy and timeliness of the iSwitch service need to be improved. The data import function of the institutional repository should also be optimized for better services. [Conclusions] High quality institutional repositories built on iSwitch, which significantly relieve the burden on researchers and system administrators, should be promoted.
[Objective] This paper aims to improve the efficiency of patent value analysis and provide accurate and reliable information to enterprises. [Methods] We utilized ACO and a scientific evaluation system to analyze the value of enterprise patents systematically, and then compared our results with experts’ evaluations. [Results] The discrepancy rate between the results of the proposed system and the experts’ judgments was less than 10%. The overall accuracy rate of our system was more than 86%, which was 10 times higher than that of traditional systems. [Limitations] The new system works well for enterprises with a large number of patents. However, its performance for enterprises with fewer patents needs to be improved. [Conclusions] The proposed system can analyze the value of enterprise patents effectively and efficiently.
[Objective] This study aims to improve library services with the help of the WeChat platform, which helps readers retrieve data quickly and increases users’ loyalty to mobile library services. [Context] Most libraries’ WeChat platforms did not provide automatic or real-time services and relied heavily on human intervention. [Methods] The WeChat service platform was built with an Apache Tomcat + JSP + MySQL architecture, the WeChat API and the library business systems’ APIs. [Results] The new system’s features include reader authentication, reservation of mobile devices, Millennium data exchange and a self-service FAQ. [Conclusions] The new WeChat library service platform could be further improved, and could provide practical suggestions to other libraries.