[Objective] This study addresses long-standing issues in crowd participant identification. [Methods] We propose a recursive heuristic method for attribute reduction and use it to establish a new ability-based system for identifying crowd participants. We then build an identification model that combines this system with the random forest algorithm. [Results] The new method reduced the number of attributes from 18 to 8 and yielded better recognition rates. [Limitations] The proposed model is simple and needs to be extended. The data were retrieved from crowdsourcing contest websites and may have integrity issues. [Conclusions] The modified machine learning method can help identify crowdsourcing participants effectively.
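The abstract does not spell out the recursive heuristic, so the following is only a minimal sketch of one possible reading: backward elimination that recursively drops the attribute whose removal hurts a quality score least, stopping once every removal lowers the score. The names `reduce_attributes`, `toy_score`, and the attribute labels are hypothetical. The scoring function here is a toy stand-in; in a setting like the one described, it would presumably be something like the cross-validated accuracy of a random forest trained on the candidate attribute subset.

```python
from typing import Callable, Sequence, Tuple


def reduce_attributes(
    attributes: Sequence[str],
    score: Callable[[Tuple[str, ...]], float],
    min_size: int = 1,
) -> Tuple[str, ...]:
    """Recursive backward elimination (illustrative, not the paper's exact method).

    Assumes attribute names are unique. At each step the attribute whose
    removal gives the highest score is dropped; recursion stops when every
    removal would lower the score or when min_size is reached.
    """
    current = tuple(attributes)
    if len(current) <= min_size:
        return current
    base = score(current)
    # Build every subset obtained by dropping exactly one attribute.
    candidates = [tuple(a for a in current if a != dropped) for dropped in current]
    best = max(candidates, key=score)
    if score(best) < base:
        return current  # every single removal hurts: keep the current subset
    return reduce_attributes(best, score, min_size)


# Toy stand-in for a model-based score (an assumption for illustration only):
# reward informative attributes and slightly penalize subset size.
INFORMATIVE = {"skill", "experience"}


def toy_score(subset: Tuple[str, ...]) -> float:
    hits = sum(1 for a in subset if a in INFORMATIVE)
    return hits - 0.01 * len(subset)


selected = reduce_attributes(
    ["skill", "experience", "noise_a", "noise_b"], toy_score
)
print(selected)  # ('skill', 'experience')
```

Keeping the score as a plain callable separates the search heuristic from the learner, so the same loop could be reused with a different classifier or evaluation metric.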