Data Analysis and Knowledge Discovery
Extraction of Value Elements of Calligraphy Works and Construction of Index System Based on Hyperplane-Bert-Louvain Optimized LDA Model
Pan Xiaoyu, Ni Yuan, Jin Chunhua, Zhang Jian
(School of Computer Science, Beijing Information Science and Technology University, Beijing 100192, China) (School of Economics and Management, Beijing Information Science and Technology University, Beijing 100192, China) (School of Economics and Management, Beijing Information Science and Technology University, Beijing 100192, China)
Abstract  

[Objective] To address the wide variation and lack of standards in valuing calligraphy works, this paper uses big data and artificial intelligence methods to identify the value elements of calligraphy works efficiently and accurately, providing technical support for calligraphy trading activities.

[Methods] A hyperplane algorithm and the Bert model are combined to preprocess calligraphy documents through stop-word screening and semantic expansion, forming an optimized, highly recognizable corpus. A complex semantic network of the calligraphy literature is then constructed, and the Louvain algorithm is introduced to determine the optimal number of topics by maximizing the modularity of the community network. On this basis, the paper proposes a new "Hyperplane-Bert-Louvain-LDA" (HBL-LDA) method for constructing the evaluation index system of calligraphy value efficiently and accurately.
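
The topic-count step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the semantic network is built, so the sketch assumes a simple word co-occurrence graph, uses a hypothetical toy corpus, and implements only the local-moving phase of Louvain (without the community-aggregation phase) in plain Python. The number of communities found is then taken as a candidate number of LDA topics.

```python
from collections import defaultdict
from itertools import combinations


def cooccurrence_graph(docs):
    """Weighted, undirected word co-occurrence graph (an assumed network model)."""
    graph = defaultdict(lambda: defaultdict(float))
    for tokens in docs:
        for a, b in combinations(sorted(set(tokens)), 2):
            graph[a][b] += 1.0
            graph[b][a] += 1.0
    return graph


def modularity(graph, community):
    """Newman modularity Q of a partition given as a node -> label mapping."""
    two_m = sum(w for nbrs in graph.values() for w in nbrs.values())
    if two_m == 0:
        return 0.0
    internal = defaultdict(float)  # edge weight inside each community (ordered pairs)
    total = defaultdict(float)     # summed weighted degree of each community
    for node, nbrs in graph.items():
        label = community[node]
        total[label] += sum(nbrs.values())
        for nb, w in nbrs.items():
            if community[nb] == label:
                internal[label] += w
    return sum(internal[c] / two_m - (total[c] / two_m) ** 2 for c in total)


def local_moving(graph):
    """Greedy local-moving phase of Louvain: move each node to the neighboring
    community that most increases modularity, until no move helps."""
    community = {node: node for node in graph}  # start with singleton communities
    improved = True
    while improved:
        improved = False
        for node in sorted(graph):
            original = community[node]
            best_label, best_q = original, modularity(graph, community)
            for label in sorted({community[nb] for nb in graph[node]}):
                community[node] = label
                q = modularity(graph, community)
                if q > best_q + 1e-12:
                    best_label, best_q = label, q
            community[node] = best_label
            if best_label != original:
                improved = True
    return community


# Hypothetical tokenized documents, for illustration only.
docs = [
    ["ink", "brush", "stroke"],
    ["ink", "brush", "stroke"],
    ["auction", "price", "collector"],
    ["auction", "price", "collector"],
    ["stroke", "price"],
]
graph = cooccurrence_graph(docs)
community = local_moving(graph)
num_topics = len(set(community.values()))  # candidate number of LDA topics
print(num_topics, round(modularity(graph, community), 4))
```

Recomputing the full modularity for every candidate move keeps the sketch short and easy to check; a production Louvain implementation would use the incremental modularity-gain formula and repeat the procedure on the aggregated community graph.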

[Results] Experiments on calligraphy value literature showed that, compared with the traditional LDA model, the HBL-LDA model increased the precision and F value of topic recognition by 45% and 29.46%, respectively. In addition, the average topic quality rate was less 0.96085, and more high-quality topics were identified. Finally, several regression models were used to verify the evaluation index system on representative calligraphy works; the regression decision tree achieved the highest accuracy, 84%.
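
For readers unfamiliar with the metrics reported above, the sketch below shows how precision and an F value are commonly computed. It assumes the F value is the standard F1 score (the harmonic mean of precision and recall), which the abstract does not confirm, and the counts are hypothetical rather than taken from the paper.

```python
def precision_recall_f1(true_positive, false_positive, false_negative):
    """Return (precision, recall, F1) from raw counts; zero denominators give 0.0."""
    predicted = true_positive + false_positive
    actual = true_positive + false_negative
    precision = true_positive / predicted if predicted else 0.0
    recall = true_positive / actual if actual else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical counts: 8 topics correctly identified, 2 spurious, 4 missed.
precision, recall, f1 = precision_recall_f1(8, 2, 4)
print(precision, recall, f1)
```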

[Limitations] The new model constructs an evaluation index system only for calligraphy works. Future work will incorporate multi-source data on other works of art into the construction of cultural product value indicators. Moreover, because the Bert model does not consider topic semantic information, the expansion of similar feature words has certain limitations.

[Conclusions] This paper proposes a new model for building a calligraphy value evaluation index system, based on an LDA model optimized by the Hyperplane-Bert-Louvain combination, which offers a new direction for constructing index systems in other fields. The resulting index system is easy to operate and adaptable, and a new index system can be generated quickly as demand changes.

Key words: Evaluation index system of calligraphy value; LDA; Field stop words; Louvain; Bert
Published: 17 March 2023

Cite this article:

Pan Xiaoyu, Ni Yuan, Jin Chunhua, Zhang Jian. Extraction of Value Elements of Calligraphy Works and Construction of Index System Based on Hyperplane-Bert-Louvain Optimized LDA Model. Data Analysis and Knowledge Discovery, 0, (): 1-.

URL:

https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/10.11925/infotech.2096-3467.2022-0915     OR     https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/Y0/V/I/1
