[Objective] This paper reviews the development of social question and answer (Q&A) studies. [Coverage] We searched Google Scholar and CNKI with the keyword “Social Q&A” and, through topic screening, intensive reading and the retrospective method, obtained a total of 77 representative publications on social Q&A. [Methods] First, we introduced the development of and early research on social Q&A. Then, we surveyed the latest social Q&A studies. [Results] At present, research on social Q&A focuses on four aspects: questions, answers, users and platforms. [Limitations] More research is needed to thoroughly discuss each research topic. [Conclusions] Based on the current research, we offer some suggestions for future social Q&A studies from the perspectives of questions, answers, users, platforms, fields and applications.
[Objective] Web-based crowdfunding has become a new channel for fund-raising and has attracted increasing attention from governments and investors. However, limited research has been conducted on crowdfunding. This paper reviews the latest studies on crowdfunding and discusses its trends. [Coverage] We retrieved 157 Chinese and English papers from Web of Science and CNKI using the keywords “Crowdfunding”, “Crowdfinancing”, “Crowdinvesting” or “P2P Lending”. [Methods] Using bibliometric and data analysis methods, we introduce the definitions and classifications of crowdfunding. Then we study the factors that influence campaign success from the following aspects: the crowdfunding platform, the project description, the founders’ social relationships, geographical factors, and the quality signals of the projects. [Results] The results of crowdfunding campaigns were influenced by many factors, especially non-quality ones. There were significant differences between investors and people seeking funding, which determined the prospects of each campaign. [Limitations] More research is needed to investigate crowdfunding models. [Conclusions] There is still much to be explored in crowdfunding models, for example from the perspectives of psychology, behavioral science and finance.
[Objective] The paper studies the impact of trust on social media users’ influence in order to detect factors affecting information dissemination, which could benefit the development of social media. [Methods] We proposed a comprehensive evaluation index based on direct and indirect trust, as well as the local and global influence of each individual social media user. [Results] Simulations based on the SIR model showed that original messages from individuals with the highest comprehensive index values could reach the largest number of users. [Limitations] The collected data were not comprehensive, which might yield biased results. [Conclusions] The proposed index could effectively measure the trust level of each individual in social media.
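As an illustration of the kind of SIR simulation mentioned in the results, the following is a minimal sketch rather than the authors' implementation. It assumes a discrete-time SIR process on an undirected graph given as an adjacency dictionary; the function name `simulate_sir`, the forwarding probability `beta`, the stop-forwarding probability `gamma` and the toy path graph are all hypothetical choices made for this example.

```python
import random


def simulate_sir(adjacency, seed_node, beta=0.3, gamma=0.1, rng=None, max_steps=1000):
    """Return how many users a message reaches when it starts at seed_node.

    adjacency: dict mapping each user to a list of neighbouring users.
    beta: assumed probability that a spreader passes the message to a neighbour per step.
    gamma: assumed probability that a spreader stops forwarding per step.
    """
    rng = rng or random.Random(0)
    state = {node: "S" for node in adjacency}  # S = has not seen the message
    state[seed_node] = "I"                     # I = currently spreading
    for _ in range(max_steps):
        spreaders = [node for node, s in state.items() if s == "I"]
        if not spreaders:
            break
        new_state = dict(state)
        for node in spreaders:
            for neighbour in adjacency[node]:
                if state[neighbour] == "S" and rng.random() < beta:
                    new_state[neighbour] = "I"
            if rng.random() < gamma:
                new_state[node] = "R"          # R = saw the message, no longer spreads
        state = new_state
    # Reach counts every user who ever received the message.
    return sum(1 for s in state.values() if s != "S")


# Toy path graph 0 - 1 - 2 - 3 (hypothetical data).
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
full_reach = simulate_sir(path, seed_node=0, beta=1.0, gamma=1.0)
no_reach = simulate_sir(path, seed_node=0, beta=0.0, gamma=1.0)
```

To compare candidate seed users, one could average the reach over many runs with different random seeds and rank users by that average alongside their index values.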
[Objective] This paper reveals the evolution and regional differences of E-commerce policies for rural poverty reduction from 2008 to 2017. [Methods] First, we used the ToT (Topic over Time) model to investigate the time-topic and topic-word probability distributions of E-commerce policies for rural poverty reduction. Then, we analyzed the evolution of the policy contents by calculating the average intensity of topics in each year and extracting the top n topic words with the highest probabilities. Finally, we divided the data from each province into the eastern, central and western regions, and analyzed the regional differences of policies according to the probability distributions of topics and words. [Results] E-commerce policies for rural poverty reduction went through starting, exploring and developing stages. The eastern, central and western regions have different focuses on logistics, platforms and personnel training. [Limitations] The regional differences of E-commerce policies need more fine-grained analysis. [Conclusions] Compared with the traditional word frequency counting method, the ToT model effectively reveals the evolution of the policies and their regional differences.
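The two post-processing steps described in the methods (average topic intensity per year and top n topic words) can be sketched as follows. This is an illustrative sketch only: it assumes that a fitted topic model has already produced one topic distribution per document and one word distribution per topic, and the function names and toy numbers are hypothetical.

```python
from collections import defaultdict


def yearly_topic_intensity(doc_topics, years):
    """Average the per-document topic probabilities within each year."""
    sums = {}
    counts = defaultdict(int)
    for theta, year in zip(doc_topics, years):
        if year not in sums:
            sums[year] = [0.0] * len(theta)
        for k, p in enumerate(theta):
            sums[year][k] += p
        counts[year] += 1
    return {y: [s / counts[y] for s in sums[y]] for y in sorted(sums)}


def top_words(topic_word_probs, n=3):
    """Return the n words with the highest probability in one topic."""
    ranked = sorted(topic_word_probs.items(), key=lambda kv: -kv[1])
    return [word for word, _ in ranked[:n]]


# Hypothetical toy input: three policy documents, two topics.
doc_topics = [[0.8, 0.2], [0.6, 0.4], [0.1, 0.9]]
years = [2008, 2008, 2017]
intensity = yearly_topic_intensity(doc_topics, years)
words = top_words({"logistics": 0.5, "platform": 0.3, "training": 0.2}, n=2)
```

The same averaging could be grouped by region instead of year to compare the eastern, central and western regions.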
[Objective] This study tries to address the classic issues facing crowd participant identification tasks. [Methods] We proposed a recursive heuristic method for attribute reduction, aiming to establish a new ability-based crowd participant identification system. Then, we built a model to locate crowd participants with the help of the random forest algorithm and the proposed system. [Results] Our new method reduced the data dimension from 18 to 8, which yielded better recognition rates. [Limitations] The proposed model is simple and needs to be expanded. The data for this study were retrieved from crowdsourcing contest websites and might have integrity issues. [Conclusions] The modified machine learning method could help us effectively identify crowdsourcing participants.
[Objective] This paper introduces a term weighting method to classify the topics of Sina Weibo posts by college students, aiming to solve the high dimension and sparsity issues. [Methods] First, we calculated the probability of a term falling into specific categories and then predicted the probability of a document’s category. Then, we converted the word-based features to a class-based matrix, which was classified by a support vector machine. [Results] Our new method improved on the MicroF1/MacroF1 values of the traditional tf, tf×idf and tf×rf methods by 7.2%/7.8%, 7.5%/7.9% and 6.4%/5.7%, respectively. [Limitations] More research is needed to explore topic classification methods other than the term weighting one in this paper. [Conclusions] The proposed method could effectively reduce the dimension of the feature matrix and improve the classification efficiency for Internet public opinion studies.
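The conversion from word-based features to a class-based representation can be illustrated with a minimal sketch. It is not the paper's exact weighting scheme: it assumes that a term's class probability is estimated from raw term counts per class and that a document is represented by the mean class probability of its known terms, so its dimension equals the number of classes. All names and the toy corpus are hypothetical.

```python
from collections import Counter, defaultdict


def term_class_probabilities(docs, labels):
    """Estimate P(class | term) from term occurrences in labelled documents."""
    classes = sorted(set(labels))
    counts = defaultdict(Counter)  # term -> class -> occurrence count
    for tokens, label in zip(docs, labels):
        for term in tokens:
            counts[term][label] += 1
    probs = {}
    for term, per_class in counts.items():
        total = sum(per_class.values())
        probs[term] = [per_class[c] / total for c in classes]
    return classes, probs


def to_class_features(tokens, probs, n_classes):
    """Map a tokenized document to a vector with one entry per class."""
    known = [probs[t] for t in tokens if t in probs]
    if not known:
        # No known term: fall back to a uniform vector (an assumption of this sketch).
        return [1.0 / n_classes] * n_classes
    return [sum(p[i] for p in known) / len(known) for i in range(n_classes)]


# Hypothetical toy corpus of tokenized posts.
docs = [["exam", "study"], ["game", "match"], ["exam", "game"]]
labels = ["school", "sport", "school"]
classes, probs = term_class_probabilities(docs, labels)
features = to_class_features(["game", "match"], probs, len(classes))
```

The resulting low-dimensional vectors could then be passed to any classifier, such as a support vector machine, in place of the sparse word-based matrix.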
[Objective] This paper tries to address the challenges facing the Smart Tourism industry, such as data sparseness and cold start, with the help of collaborative recommendation technology. [Methods] First, we clustered users with the K-means algorithm and then filtered and classified them dynamically in combination with collaborative recommendation technology. Then, we assigned weights to the recommended types and proposed a new algorithm based on Improved Uncertain Neighbors Collaborative Filtering (IUNCF). Finally, we examined the proposed algorithm with real-world tourism data under different similarity thresholds and numbers of recommendations. [Results] The MAE value and F-measure reached 0.243 and 0.764, respectively, which showed the effectiveness of IUNCF in accuracy and reliability. [Limitations] The IUNCF algorithm needs to be further optimized to deal with the low-frequency consumption issue. The application of this new model could also be extended. [Conclusions] The proposed IUNCF algorithm could precisely recommend smart tourism products to consumers.
[Objective] This study aims to improve the effectiveness of merchandise recommendation based on the temporal dynamics and sequential patterns of sales. [Methods] We developed an improved personalized recommendation algorithm for electronic commerce. First, we introduced a new similarity calculation function with time and hot coefficients. Then, we proposed an algorithm with the two-item sequential pattern, which modified the recommended list based on the sequential patterns. [Results] We examined the new method with book review data from Amazon.com from 2004 to 2005, and found that its precision and F values were 1.89% and 0.73% higher, respectively, than those of the collaborative filtering algorithm with adjusted cosine similarity. [Limitations] The proposed model did not examine the violations of consumers’ review scores. [Conclusions] Both the similarity function and sequential patterns can improve the effectiveness of personalized recommendation algorithms for e-commerce.
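The idea of adjusting a recommended list with two-item sequential patterns can be sketched as follows. This is a simplified illustration under stated assumptions, not the paper's algorithm: a pattern (a, b) is counted once per user whenever item a appears before item b in that user's sequence, patterns below a hypothetical `min_support` are dropped, and candidates are reordered by the support of the pattern that starts at the user's most recent item.

```python
from collections import Counter
from itertools import combinations


def two_item_patterns(sequences, min_support=2):
    """Count ordered pairs (a, b) where a occurs before b, once per sequence."""
    support = Counter()
    for seq in sequences:
        pairs = set()
        for i, j in combinations(range(len(seq)), 2):
            if seq[i] != seq[j]:
                pairs.add((seq[i], seq[j]))
        support.update(pairs)
    return {pair: n for pair, n in support.items() if n >= min_support}


def rerank(candidates, last_item, patterns):
    """Move candidates that frequently follow last_item to the front.

    sorted() is stable, so candidates without a matching pattern keep
    their original relative order.
    """
    return sorted(candidates, key=lambda item: -patterns.get((last_item, item), 0))


# Hypothetical purchase sequences of three users.
sequences = [["A", "B", "C"], ["A", "C"], ["B", "A", "C"]]
patterns = two_item_patterns(sequences, min_support=2)
reranked = rerank(["B", "C", "D"], last_item="A", patterns=patterns)
```

In a full system the candidate list would come from the similarity-based collaborative filtering step, and the reordering would act as a second pass over it.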
[Objective] This paper tries to solve the issues facing the traditional FCM algorithm, such as the random selection of initial cluster centers, sensitivity to noise, and the ability to cluster only equally distributed samples. [Methods] We proposed a new FCM clustering algorithm based on a Huffman tree with the dissimilarity degree matrix of high-density sample sets. The new algorithm could obtain initial cluster centers and then generate the membership function of the non-normalized constraint samples. [Results] We examined the proposed algorithm with synthetic samples, images, and UCI datasets. The clustering accuracy and the computation time of the new algorithm were better than those of algorithms based on the Gaussian kernel or traditional FCM. [Limitations] The sample density adjustment factor $\beta$ was decided by experiment or experience without theoretical support. [Conclusions] The proposed algorithm could be used for clustering data sets that have a high level of noise and are unequally distributed.
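For reference, the traditional FCM baseline that the abstract sets out to improve can be sketched as below. This is the standard fuzzy c-means iteration with normalized memberships, not the Huffman-tree variant proposed in the paper; the initial centers are passed in explicitly, which is exactly the step the proposed method is meant to automate. The function names, the fuzzifier `m=2.0` and the toy data are assumptions of this sketch.

```python
import math


def memberships_for(points, centers, m=2.0):
    """Standard FCM membership: u_ik = 1 / sum_j (d_ik / d_jk)^(2 / (m - 1))."""
    exponent = 2.0 / (m - 1.0)
    rows = []
    for x in points:
        # Clamp distances to avoid division by zero when a point sits on a center.
        d = [max(math.dist(x, c), 1e-12) for c in centers]
        rows.append([1.0 / sum((di / dk) ** exponent for dk in d) for di in d])
    return rows


def update_centers(points, memberships, n_clusters, m=2.0):
    """Each center is the mean of all points weighted by membership^m."""
    dims = len(points[0])
    centers = []
    for i in range(n_clusters):
        weights = [u[i] ** m for u in memberships]
        total = sum(weights)
        centers.append(tuple(
            sum(w * x[dim] for w, x in zip(weights, points)) / total
            for dim in range(dims)
        ))
    return centers


def fcm(points, initial_centers, m=2.0, iterations=50):
    """Alternate membership and center updates for a fixed number of iterations."""
    centers = list(initial_centers)
    for _ in range(iterations):
        u = memberships_for(points, centers, m)
        centers = update_centers(points, u, len(centers), m)
    return centers, memberships_for(points, centers, m)


# Hypothetical one-dimensional data with two well-separated groups.
points = [(0.0,), (0.1,), (0.2,), (5.0,), (5.1,), (5.2,)]
centers, u = fcm(points, initial_centers=[(1.0,), (4.0,)])
```

Because every membership row is normalized to sum to one, distant noise points still receive substantial membership in some cluster, which is one source of the noise sensitivity noted in the objective.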
[Objective] This paper proposes a model to extract the names of Chinese historical events, aiming to reorganize knowledge from texts and construct an ontology for these events. [Methods] We built the proposed model with conditional random fields (CRFs) and automatic tagging technology, based on the historical texts of the Wei, Jin, Northern and Southern Dynasties. Then, we explored the influence of different Chinese characters and features on recognizing event names. [Results] We constructed the best model based on character and surname features. The F1 value of this model was as high as 98.74%. The model was also examined in two open scenarios and achieved good results. [Limitations] The size of our training corpus needs to be expanded. More research is needed to compare the results of tagging single Chinese characters and tagging phrases. [Conclusions] The CRFs model could effectively identify the names of Chinese historical events under appropriate working conditions.
[Objective] This paper aims to help library mobile apps provide content based on users’ actual locations and expectations, which is a key issue facing mobile service innovation. [Methods] First, we proposed the concept of information acceptance entropy based on the theories of information entropy and context entropy. Then, we constructed a generalized component distribution probability model for the information acceptance entropy with the help of entropy energy distribution theory. Finally, we examined our model with academic libraries in China’s Liaoning, Jilin and Henan provinces. [Results] We implemented the new algorithms in Matlab and used the Likert scale to evaluate the users’ perception and experience. The new model successfully calculated and simulated the information acceptance entropy of different scenes. We found that switching scenes in a timely manner and increasing relevant content would improve the user experience. [Limitations] The sample size needs to be expanded to improve the accuracy of the simulation. [Conclusions] The proposed model could compare and predict the multidimensional information acceptance entropy of different locations.