Retrieving Geographic Information for Micro-blog’s City Complaints
Sun He1,2, Li Shuqin2, Lv Xueqiang1,2, Liu Kehui3,4
1 Beijing Key Laboratory of Internet Culture and Digital Dissemination Research, Beijing Information Science and Technology University, Beijing 100101, China
2 College of Computer, Beijing Information Science and Technology University, Beijing 100101, China
3 School of Management and Economics, Beijing Institute of Technology, Beijing 100081, China
4 Beijing Research Center of Urban Systems Engineering, Beijing 100035, China
[Objective] This study uses the shared, continuously updated knowledge of the question-answering community Baidu Zhidao to complete partial location information in micro-blog posts, which reduces the cost of maintaining a large geographic relationship resource. [Methods] First, we replaced the incomplete location information with approximate area names retrieved from Baidu Zhidao. Second, we extracted each area’s features and calculated scores for the related geographic entities. Finally, we constructed feature vectors for the areas from those geographic entities and used them to identify the geographic locations of the posts. [Results] The proposed method retrieved accurate geographic information for 92.51% of the city complaints from the micro-blog platform. [Limitations] The proposed method cannot analyze posts that contain no geographic location information. [Conclusions] The study offers an effective and feasible way to recover missing geographic information.
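The three steps in [Methods] can be illustrated with a minimal sketch. Everything here is an assumption made for illustration, not the authors' implementation: the retrieval from Baidu Zhidao is replaced by hard-coded answer texts, the entity score is taken to be a normalized frequency, areas are compared with cosine similarity, and the names `build_area_vector`, `cosine`, `rank_areas`, and the sample place names are hypothetical.

```python
from collections import Counter
from math import sqrt


def build_area_vector(answer_texts, gazetteer):
    """Score each known geographic entity by its relative frequency
    in the answer texts collected for one candidate area (assumed scoring)."""
    counts = Counter()
    for text in answer_texts:
        for entity in gazetteer:
            counts[entity] += text.count(entity)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {entity: c / total for entity, c in counts.items() if c > 0}


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(key, 0.0) for key, w in u.items())
    norm_u = sqrt(sum(w * w for w in u.values()))
    norm_v = sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def rank_areas(post_entities, area_vectors):
    """Rank candidate areas by similarity to the entities found in a post."""
    post_vector = {entity: 1.0 for entity in post_entities}
    scores = {area: cosine(post_vector, vec) for area, vec in area_vectors.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical stand-ins for answers retrieved from a Q&A community.
gazetteer = ["Zhongguancun", "Wudaokou", "Sanlitun", "Guomao"]
answers_by_area = {
    "Haidian": ["Zhongguancun is in Haidian near Wudaokou", "Wudaokou station"],
    "Chaoyang": ["Sanlitun and Guomao are in Chaoyang"],
}
area_vectors = {
    area: build_area_vector(texts, gazetteer)
    for area, texts in answers_by_area.items()
}

# A post that mentions only "Wudaokou" is matched to the most similar area.
print(rank_areas(["Wudaokou"], area_vectors))
```

In this toy data the post is ranked closest to "Haidian" because its only entity appears in that area's vector. A real system would need Chinese word segmentation and an actual retrieval step, neither of which is modeled here.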
Sun He, Li Shuqin, Lv Xueqiang, Liu Kehui. Retrieving Geographic Information for Micro-blog’s City Complaints[J]. New Technology of Library and Information Service (现代图书情报技术), 2016, 32(3): 58-66.