Multi-Truth Discovery Method Based on Attribute Fusion
Yang Haolin1, Dong Yongquan1,2, Chen Huafeng1, Zhang Guoxi1
1School of Computer Science and Technology, Jiangsu Normal University, Xuzhou 221008, China
2Xuzhou Engineering Research Center of Cloud Computing, Xuzhou 221100, China
[Objective] This paper incorporates the influence of auxiliary attributes into existing multi-truth discovery models to improve their F1 scores. [Methods] First, we used the auxiliary attributes to calculate source expertise and consensus degree. Second, we combined these with the activity degree of multi-truth attribute values to obtain each source's degree of support for the conflicting data. Third, we applied existing truth discovery methods to obtain pseudo-labels for the truth. Finally, we used a neural network to capture the complex relationships between the sources and the conflicting data, and identified all true values. [Results] Compared with the second-best model, our method improved the F1 score by 2.25% on the book dataset and by 5.42% on the movie dataset. [Limitations] The proposed method considers only auxiliary attributes that reflect object features; more research is needed to explore the impact of other auxiliary attributes on multi-truth discovery. [Conclusions] The proposed method can effectively discover multiple truths.
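The results above are reported as F1 values. As a minimal sketch of how F1 can be computed when an object may have several true values, the Python snippet below counts correct and incorrect (object, value) pairs. The function name `multi_truth_prf`, the micro-averaging over pairs, and the toy data are assumptions made for illustration; the abstract does not specify the exact evaluation protocol.

```python
from typing import Dict, Set, Tuple


def multi_truth_prf(
    predicted: Dict[str, Set[str]],
    gold: Dict[str, Set[str]],
) -> Tuple[float, float, float]:
    """Micro-averaged precision, recall and F1 over (object, value) pairs.

    Assumption: only objects present in `gold` are scored; an object
    missing from `predicted` is treated as having no predicted values.
    """
    tp = fp = fn = 0
    for obj, true_values in gold.items():
        pred_values = predicted.get(obj, set())
        tp += len(pred_values & true_values)   # values correctly identified
        fp += len(pred_values - true_values)   # values wrongly claimed as true
        fn += len(true_values - pred_values)   # true values that were missed
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical toy data: authors of two books.
gold = {"book1": {"A", "B"}, "book2": {"C"}}
predicted = {"book1": {"A"}, "book2": {"C", "D"}}
precision, recall, f1 = multi_truth_prf(predicted, gold)
print(f"P={precision:.3f} R={recall:.3f} F1={f1:.3f}")
```

In this toy case one true author is missed and one false author is added, so precision and recall are both 2/3. Because multi-truth discovery must return a set of values per object, F1 penalizes both missed truths and spurious ones.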