[Objective] This study reconstructs tourists’ itineraries from their travel notes and scenic information. [Methods] First, we combined the TF-IDF and Word2Vec models. Then, we built a named entity recognition method based on text similarity, which identifies scenic spots in travel notes. Finally, we proposed a model that uses the Markov property, prior knowledge, and spatial characteristics to reconstruct tour itineraries. [Results] The recall, precision, and F1 score of the proposed method were 90.72%, 89.65%, and 0.9018, respectively, all higher than those of methods based on Conditional Random Fields. The similarity between the reconstructed routes and the actual ones was 83.27%. [Limitations] The completeness of the scenic information may affect the performance of our model. [Conclusions] The proposed method can automatically identify scenic spots and effectively reconstruct travel itineraries.
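As a rough illustration of the recognition step described in [Methods], the sketch below weights word vectors by TF-IDF and matches a candidate phrase to a scenic spot list by cosine similarity. It is only a minimal sketch under stated assumptions, not the authors' implementation: the 3-dimensional `EMBEDDINGS` table stands in for trained Word2Vec vectors, and `GAZETTEER`, `best_match`, and the 0.8 threshold are all hypothetical.

```python
import math
from collections import Counter

# Hypothetical 3-d vectors standing in for trained Word2Vec embeddings.
EMBEDDINGS = {
    "west": [0.9, 0.1, 0.0],
    "lake": [0.1, 0.9, 0.0],
    "lingyin": [0.2, 0.0, 0.8],
    "temple": [0.0, 0.1, 0.9],
    "the": [0.3, 0.3, 0.3],
}

# Hypothetical scenic spot list, already tokenized.
GAZETTEER = [["west", "lake"], ["lingyin", "temple"]]


def idf_table(documents):
    """Smoothed inverse document frequency for every token in the documents."""
    n = len(documents)
    df = Counter(tok for doc in documents for tok in set(doc))
    return {tok: math.log((1 + n) / (1 + count)) + 1.0 for tok, count in df.items()}


def weighted_vector(tokens, idf):
    """Sum of word vectors, each scaled by its TF-IDF weight."""
    dim = len(next(iter(EMBEDDINGS.values())))
    vec = [0.0] * dim
    for tok, count in Counter(tokens).items():
        if tok not in EMBEDDINGS:
            continue  # out-of-vocabulary tokens contribute nothing
        weight = (count / len(tokens)) * idf.get(tok, 1.0)
        for i, x in enumerate(EMBEDDINGS[tok]):
            vec[i] += weight * x
    return vec


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def best_match(candidate, gazetteer, idf, threshold=0.8):
    """Return (name, score) for the most similar entry, or (None, score) below threshold."""
    cand_vec = weighted_vector(candidate, idf)
    scored = [(cosine(cand_vec, weighted_vector(name, idf)), name) for name in gazetteer]
    score, name = max(scored, key=lambda pair: pair[0])
    return (name, score) if score >= threshold else (None, score)


IDF = idf_table(GAZETTEER)
```

Calling `best_match(["the", "west", "lake"], GAZETTEER, IDF)` selects the `["west", "lake"]` entry, while a phrase made only of unknown tokens falls below the threshold and returns `None` as the name. In practice the threshold would need tuning on labeled travel notes.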