[Objective] This paper identifies relationships between pollution sources and cancer cases, addressing the problem that traditional spatial co-location pattern mining discovers too many non-pertinent patterns. [Methods] First, we combined the properties of the Voronoi diagram with the star instance model. Second, we defined the proximity relationship between spatial instances and the concept of spatial ordered pair patterns. Third, we measured the prevalence and the influence of spatial ordered pair patterns based on the distance attenuation and influence superposition effects. Finally, we proposed a basic algorithm and an optimized algorithm for mining spatial ordered pair patterns. [Results] The proposed algorithms revealed more pertinent relationships that the traditional algorithms could not identify, and they returned far fewer results than the traditional algorithms. Compared with the basic algorithm, the pruning rate of the optimized algorithm exceeded 80%. The larger the data set, the better the results. [Limitations] All spatial objects are treated as points by default; extended spatial objects merit further study. [Conclusions] Spatial ordered pair patterns can effectively identify the relationship between pollution sources and cancer cases.
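The distance attenuation and influence superposition effects named in the Methods can be illustrated with a minimal sketch. The abstract does not give the paper's formulas, so everything here is an assumption made for illustration: the linear decay inside a cut-off `radius`, the combination rule `1 - prod(1 - w_i)`, and the function names `attenuated_influence` and `superposed_influence` are all hypothetical and are not the authors' definitions.

```python
import math

def attenuated_influence(source, case, radius):
    """Influence of one pollution source on one cancer case.

    Assumed form: decays linearly with distance and is zero beyond `radius`.
    """
    d = math.dist(source, case)
    return max(0.0, 1.0 - d / radius)

def superposed_influence(sources, case, radius):
    """Combined influence of several sources on one case.

    Assumed rule: 1 - prod(1 - w_i), which increases with every
    additional nearby source but never leaves the range [0, 1].
    """
    remaining = 1.0
    for s in sources:
        remaining *= 1.0 - attenuated_influence(s, case, radius)
    return 1.0 - remaining

# Two hypothetical pollution sources on either side of one cancer case.
sources = [(0.0, 0.0), (4.0, 0.0)]
case = (2.0, 0.0)

single = attenuated_influence(sources[0], case, radius=4.0)
combined = superposed_influence(sources, case, radius=4.0)
print(single, combined)
```

In this toy setting each source alone contributes an influence of 0.5, while the two together yield 0.75, which shows why a case surrounded by several sources would weigh more in an ordered pair pattern than a case near a single source. Other decay functions (for example exponential or Gaussian) could be substituted without changing the structure of the sketch.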