Recognition and Visual Analysis of Interdisciplinary Semantic Drift
Li Nan, Wang Bo
Institute of Science and Technology Information, East China University of Science and Technology, Shanghai 200237, China

Abstract [Objective] This paper uses machine learning techniques to analyze the semantic drift of domain terms. It identifies and visualizes interdisciplinary semantic drift and explores its patterns and causes. [Methods] We designed a deep learning framework for identifying and visualizing the semantic drift of domain terms. The framework combines an SBERT model, word embedding optimization, and hierarchical clustering to identify interdisciplinary semantic drift, and it uses Bokeh together with principal component analysis (PCA) to visualize the drift. [Results] The framework identified interdisciplinary semantic drift with an overall recognition accuracy (p) of 86.15% on the DT-Sentence dataset. [Limitations] The framework still needs to be verified on datasets from more disciplines. [Conclusions] This study supports data mining and visualization of semantic drift. It also lays a technical foundation for research on semantic evolution, understanding, and modeling.
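
The clustering step named in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, not the authors' implementation: the three-dimensional vectors are hand-made stand-ins for SBERT sentence embeddings, the embedding optimization step is omitted because the abstract does not specify it, and the decision rule (flag a term when its context sentences fall into more than one cluster) and the names `agglomerative`, `has_drift`, and `threshold` are hypothetical. The Bokeh and PCA visualization is not covered.

```python
import math
from itertools import combinations


def cosine_distance(u, v):
    """Return 1 - cosine similarity; zero vectors are treated as maximally distant."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return 1.0 if norm == 0 else 1.0 - dot / norm


def agglomerative(vectors, threshold):
    """Average-linkage agglomerative clustering over cosine distance.

    Merging stops once the two closest clusters are farther apart than
    `threshold` (a hypothetical cut-off, not a value from the paper).
    Returns clusters as lists of indices into `vectors`.
    """
    clusters = [[i] for i in range(len(vectors))]

    def linkage(a, b):
        total = sum(cosine_distance(vectors[i], vectors[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > 1:
        # Find the pair of clusters with the smallest average distance.
        x, y = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: linkage(clusters[p[0]], clusters[p[1]]),
        )
        if linkage(clusters[x], clusters[y]) > threshold:
            break
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]  # safe because x < y
    return clusters


def has_drift(vectors, threshold=0.5):
    """Assumed decision rule: more than one sense cluster means drift."""
    return len(agglomerative(vectors, threshold)) > 1


# Toy stand-ins for embeddings of sentences that contain the same term.
discipline_a = [[0.9, 0.1, 0.0], [0.8, 0.2, 0.1]]
discipline_b = [[0.1, 0.9, 0.2], [0.0, 0.8, 0.3]]

mixed = discipline_a + discipline_b
clusters = sorted(sorted(c) for c in agglomerative(mixed, threshold=0.5))
print(clusters)
print(has_drift(mixed), has_drift(discipline_a))
```

In this toy setting the contexts from the two disciplines separate into two clusters, while contexts from a single discipline stay together. With real data the vectors would come from an SBERT encoder and the threshold would need tuning.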
Received: 19 June 2022
Published: 22 March 2023

Fund: Fundamental Research Funds for the Central Universities of the Ministry of Education of China (222202226002)
Corresponding Author: Wang Bo, ORCID: 0000-0001-5222-950X, E-mail: wang.bo.tianwen@qq.com