Retrieving Mathematical Expressions Based on Hesitant Fuzzy Weight
Xu Yicong, Tian Xuedong, Li Xinfu, Yang Fang, Shi Qingxuan
School of Cyber Security and Computer, Hebei University, Baoding 071002, China
Abstract [Objective] This paper proposes a retrieval method for mathematical expressions that finds the items matching a query in a large collection of mathematical expressions. [Methods] First, we extracted the characteristic subformulas of each mathematical expression and used the theory of hesitant fuzzy sets (HFSs) to compute their weights. Second, we summed the weights of all subformulas belonging to the same expression to obtain the similarity score between the indexed expression and the query. Finally, we ranked the retrieved results by their similarity scores. [Results] The proposed method achieved higher retrieval efficiency and better results than traditional methods, with the highest NDCG value reaching 0.88. [Limitations] The method does not fully address the semantics of mathematical expressions. [Conclusions] The proposed method retrieves the needed mathematical expressions more accurately.
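
The scoring and ranking steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how a hesitant fuzzy weight is derived, so the sketch assumes each subformula carries a list of membership degrees (a hesitant fuzzy element) and takes the mean of those degrees as its weight, which is one common score function for HFSs. The names `hfe_score`, `rank_expressions`, and the toy index layout are hypothetical.

```python
from statistics import mean


def hfe_score(memberships):
    """Assumed weight of a subformula: mean of its membership degrees."""
    return mean(memberships)


def rank_expressions(query_subformulas, index):
    """Rank indexed expressions against a query.

    index maps an expression id to a dict of
    {subformula: [membership degrees]} (hypothetical layout).
    The similarity score of an expression is the sum of the weights
    of its subformulas that also occur in the query.
    """
    scores = {}
    for expr_id, subformulas in index.items():
        total = sum(
            hfe_score(subformulas[s])
            for s in query_subformulas
            if s in subformulas
        )
        if total > 0:  # keep only expressions sharing a subformula
            scores[expr_id] = total
    # Highest similarity first
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy data, invented for illustration only
index = {
    "e1": {"x^2": [0.8, 0.6], "a+b": [0.4]},
    "e2": {"x^2": [0.5, 0.5]},
    "e3": {"\\sin x": [0.9]},
}
ranked = rank_expressions(["x^2", "a+b"], index)
for expr_id, score in ranked:
    print(expr_id, round(score, 3))
```

In this toy run, `e1` shares two subformulas with the query and ranks above `e2`, while `e3` shares none and is not returned.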
Received: 02 December 2019
Published: 25 July 2020
Corresponding Author: Tian Xuedong
E-mail: xuedong_tian@126.com