Data Analysis and Knowledge Discovery  2020, Vol. 4 Issue (2/3): 239-248    DOI: 10.11925/infotech.2096-3467.2019.0550
Real-time Analysis Model for Short Texts with Relationship Graph of Domain Semantics
Tian Zhonglin1,2, Wu Xu1,2,3, Xie Xiaqing1,2, Xu Jin1,2, Lu Yueming1,2
1School of Cyberspace Security, Beijing University of Posts and Telecommunications, Beijing 100876, China
2Key Laboratory of Trustworthy Distributed Computing and Service (BUPT), Ministry of Education, Beijing 100876, China
3Beijing University of Posts and Telecommunications Library, Beijing 100876, China
[Objective] This paper studies domain discrimination for public opinion in online communities, aiming to improve the knowledge base and the effectiveness of machine learning models.
[Methods] We retrieved 478,303 texts from multiple online communities for college students. We then built a semantic relationship graph with 5,248 nodes and 16,488 edges, which can also be extended automatically. Finally, we proposed a short-text analysis model that performs domain analysis on the texts.
[Results] The F value of the proposed model reached 83.74%, which is 8.56, 5.97 and 4.27 percentage points higher than those of the SVM, NB and CNN methods, respectively.
[Limitations] The sample size needs to be expanded, and the parameter feedback mechanism needs to be modified.
[Conclusions] Compared with methods based on machine learning, the proposed model is more accurate. It can also perform real-time analysis.

Key words: Semantic Relation Graph; Text Analysis; Semantic Computation
Received: 24 May 2019; Published: 26 April 2020
CLC number (ZTFLH): TP391
Corresponding Author: Xu Wu

Cite this article:

Tian Zhonglin, Wu Xu, Xie Xiaqing, Xu Jin, Lu Yueming. Real-time Analysis Model for Short Texts with Relationship Graph of Domain Semantics. Data Analysis and Knowledge Discovery, 2020, 4(2/3): 239-248.


System Architecture
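Word | Natural attribute: POS | Natural attribute: English | Domain-specific attribute: Attention index | Domain-specific attribute: Type
偷窃 | v | steal | 8 | PS (property safety)
小偷 | n | thief | 6 | PS (property safety)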
Semantic Node Property
Examples of Semantic
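Relation start item | Relation end item | Semantic relation
食堂 (canteen) | [missing in source] | location
今天中午 (today at noon) | [missing in source] | time
苍蝇 (fly) | [missing in source] | active
苍蝇 (fly) | 恶心 (disgusting) | causal
恶心 (disgusting) | [missing in source] | coordinate, near-synonym
恶心 (disgusting) | [missing in source] | coordinate, near-synonym
恶心 (disgusting) | 领导 (leadership) | purpose
领导 (leadership) | [missing in source] | purpose
领导 (leadership) | 管理 (manage) | active
Note: in rows marked [missing in source], only one of the two items survives, and whether it is the start or the end item cannot be recovered; it is shown in the first column by default.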
Semantic Relation
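As a rough illustration of how the node properties and typed relations tabulated above could be held in memory, here is a minimal Python sketch. It is not the authors' implementation: the class names (`SemanticNode`, `SemanticGraph`), the field names, and the adjacency-list layout are all assumptions made for this example. Only the table rows whose start and end items are both present are used.

```python
from dataclasses import dataclass, field


@dataclass
class SemanticNode:
    """Hypothetical node record mirroring the node property table."""
    word: str
    pos: str = ""          # natural attribute: part of speech
    english: str = ""      # natural attribute: English gloss
    attention: int = 0     # domain-specific attribute: attention index
    domain_type: str = ""  # domain-specific attribute: type code, e.g. "PS"


@dataclass
class SemanticGraph:
    """Hypothetical directed graph with labeled (typed) edges."""
    nodes: dict = field(default_factory=dict)
    edges: dict = field(default_factory=dict)  # start word -> [(end word, relation)]

    def add_node(self, node):
        self.nodes[node.word] = node

    def add_relation(self, start, end, relation):
        # Create placeholder nodes for words not registered yet.
        for word in (start, end):
            self.nodes.setdefault(word, SemanticNode(word))
        self.edges.setdefault(start, []).append((end, relation))

    def relations_from(self, start):
        return self.edges.get(start, [])


graph = SemanticGraph()

# Nodes taken from the node property table.
graph.add_node(SemanticNode("偷窃", "v", "steal", 8, "PS"))
graph.add_node(SemanticNode("小偷", "n", "thief", 6, "PS"))

# Complete rows from the semantic relation table.
graph.add_relation("苍蝇", "恶心", "causal")
graph.add_relation("恶心", "领导", "purpose")
graph.add_relation("领导", "管理", "active")

for end, relation in graph.relations_from("苍蝇"):
    print(f"苍蝇 -> {end} ({relation})")
```

A dictionary-based adjacency list keeps lookups by word cheap, which is one plausible choice for real-time use; the source does not say which storage layout the model actually uses.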
The Process of Building Semantic Diagrams
Part of the Semantic Diagram of University Public Opinion
Dataset | Domain-related texts | Total texts
Training set | 7,350 | 20,000
Test set | 6,840 | 20,000
The Statistics of Experimental Data Set
The Accuracy at Different Thresholds
Method | P (%) | R (%) | F (%)
SVM | 78.23 | 72.35 | 75.18
NB | 76.36 | 79.24 | 77.77
CNN | 80.64 | 78.35 | 79.47
Proposed real-time short-text analysis method | 84.32 | 83.12 | 83.74
The Results of the Accuracy Test
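The relationship between the P, R and F columns can be checked with a short Python sketch. It assumes that F is the balanced F-measure (the harmonic mean of precision and recall); the source does not state the formula, so this is an assumption, and the function name `f_measure` is hypothetical. Values recomputed from the rounded P and R figures may differ slightly from the reported F column.

```python
def f_measure(precision, recall):
    """Balanced F-measure: harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (P, R, reported F) in percent, copied from the accuracy table.
reported = {
    "SVM": (78.23, 72.35, 75.18),
    "NB": (76.36, 79.24, 77.77),
    "CNN": (80.64, 78.35, 79.47),
    "Proposed": (84.32, 83.12, 83.74),
}

for name, (p, r, f_reported) in reported.items():
    f_recomputed = f_measure(p, r)
    print(f"{name}: recomputed F = {f_recomputed:.2f}, reported F = {f_reported:.2f}")
```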
Method | Data throughput (texts/s) | Latency (s)
TopicSketch (real-time topic detection) | 50 | 0.71±0.5
Proposed real-time short-text analysis method | 22 | 1.36±0.4
The Results of the Timeliness Test
