New Technology of Library and Information Service  2016, Vol. 32 Issue (3): 58-66    DOI: 10.11925/infotech.1003-3513.2016.03.08
Original Article
Retrieving Geographic Information for Micro-blog’s City Complaints
Sun He1,2, Li Shuqin2, Lv Xueqiang1,2, Liu Kehui3,4
1Beijing Key Laboratory of Internet Culture and Digital Dissemination Research, Beijing Information Science and Technology University, Beijing 100101, China
2College of Computer, Beijing Information Science and Technology University, Beijing 100101, China
3School of Management and Economics, Beijing Institute of Technology, Beijing 100081, China
4Beijing Research Center of Urban Systems Engineering, Beijing 100035, China
Abstract  

[Objective] This study exploits the knowledge sharing and continuous updating of the question answering community Baidu Zhidao to reduce the cost of maintaining a large geographic relationship resource and to recover complete location information for micro-blog city complaints. [Methods] First, we replaced the incomplete location information with approximate area names retrieved from Baidu Zhidao. Second, we extracted the features of each area and calculated scores for the related geographic entities. Finally, we constructed feature vectors for the areas from those geographic entities and used them to identify the geographic locations of the posts. [Results] The proposed method retrieved accurate geographic information for 92.51% of the city complaints from the micro-blog platform. [Limitations] The proposed method cannot analyze posts that contain no geographic location information. [Conclusions] Our study found an effective and feasible way to recover missing geographic information.
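
The three steps in the Methods can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the normalized-frequency scoring rule, and the toy area and entity names are all assumptions made for illustration, and the text retrieved from the question answering community is replaced by hard-coded strings.

```python
from collections import Counter

def build_area_vector(snippets, gazetteer):
    """Build a feature vector for one candidate area (hypothetical scoring rule).

    Each known geographic entity is weighted by its share of all entity
    mentions found in the text retrieved for that area.
    """
    counts = Counter()
    for text in snippets:
        for entity in gazetteer:
            counts[entity] += text.count(entity)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {entity: n / total for entity, n in counts.items() if n > 0}

def score_area(post_entities, area_vector):
    """Sum the weights of the entities that the post shares with the area."""
    return sum(area_vector.get(entity, 0.0) for entity in post_entities)

def locate(post_entities, area_vectors):
    """Return the best-scoring area, or None when no area shares an entity."""
    if not area_vectors:
        return None
    scores = {area: score_area(post_entities, vec)
              for area, vec in area_vectors.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None

# Toy data: every name below is invented for illustration only.
gazetteer = ["Elm Street", "Lake Park", "Mill Road", "Old Bridge"]
area_vectors = {
    "North District": build_area_vector(
        ["Elm Street is near Lake Park", "Lake Park entrance on Elm Street"],
        gazetteer,
    ),
    "South District": build_area_vector(
        ["Mill Road runs past the Old Bridge"], gazetteer
    ),
}

print(locate(["Elm Street"], area_vectors))
print(locate(["Unknown Lane"], area_vectors))
```

A post that mentions no entity known to any area yields `None`, which mirrors the stated limitation that posts without geographic location information cannot be analyzed.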

Key words: City complaints of Micro-blog; Defect location entity; Question Answering Community (QAC); Eigenvalue calculation; Integrity
Received: 22 September 2015      Published: 12 April 2016

Cite this article:

Sun He, Li Shuqin, Lv Xueqiang, Liu Kehui. Retrieving Geographic Information for Micro-blog’s City Complaints. New Technology of Library and Information Service, 2016, 32(3): 58-66.

URL:

https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/10.11925/infotech.1003-3513.2016.03.08     OR     https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/Y2016/V32/I3/58
