Data Analysis and Knowledge Discovery  2023, Vol. 7 Issue (11): 101-113    DOI: 10.11925/infotech.2096-3467.2022.0993
Early Recognition of User-Generated Content Value with Text Semantics and Associative Network Dual-Link Fusion
Wang Song1, Luo Ying1, Liu Xinmin2
1College of Economics & Management, Shandong University of Science and Technology, Qingdao 266590, China
2College of Economics & Management, Qingdao Agricultural University, Qingdao 266109, China
Abstract  

[Objective] This paper proposes a feature system and a new model to improve the efficiency of early recognition, addressing the time delay and overload involved in recognizing valuable content in virtual communities. [Methods] We constructed a dual-link fusion algorithm that combines the text semantics of user-generated content with the network structure of explicit and implicit interactions between users and texts. In the text semantic link, we used BERT+BiLSTM+Linear to obtain deep semantic features. In the association network link, we adopted GAT to process the shallow numerical features and the association features of the nodes. Finally, we used a convolution layer to optimize the fused information from the two links and achieve early value recognition. [Results] On data from the Meizu Flyme community, the dual-link fusion model achieved an accuracy of 89.80%, which was 3.45 and 3.20 percentage points higher than that of the single text semantic link and the single association network link, respectively. Compared with other baseline models, the accuracy and F1 values were also improved. [Limitations] The generalization ability of the model needs further improvement, and rich text content (i.e., pictures and external links) has yet to be analyzed. [Conclusions] The deep learning fusion model improves the accuracy of early recognition of valuable texts by processing both sequential text semantics and topological network structure.

Key words: Early Recognition; Fusion Model; BiLSTM; Graph Attention Network
Received: 21 September 2022      Published: 22 March 2023
ZTFLH:  G206  
Fund: National Natural Science Foundation of China (71471105); Social Science Planning Project of Shandong Province (18CGLJ38)
Corresponding Author: Wang Song, ORCID: 0000-0001-9101-7702, E-mail: tiatusw@126.com

Cite this article:

Wang Song, Luo Ying, Liu Xinmin. Early Recognition of User-Generated Content Value with Text Semantics and Associative Network Dual-Link Fusion. Data Analysis and Knowledge Discovery, 2023, 7(11): 101-113.

URL:

https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/10.11925/infotech.2096-3467.2022.0993     OR     https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/Y2023/V7/I11/101

The Explicit and Implicit Interaction Network of User-Text Entity
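Network | Feature | Symbol | Meaning
User association network | User authority | authority | Prestige of the user node
User association network | User activity | activity | Number of posts made by the user node
User association network | Degree centrality | degree | Number of connections to other nodes
User association network | Betweenness centrality | betweenness | Importance of the user node
User association network | Closeness centrality | closeness | Distance from the user node to other nodes
User association network | User leadership | pagerank | Influence of the user node
Text association network | Text length | length | Length of the text
Text association network | Richness | richness | Whether the text contains pictures or links
Text association network | Sentiment polarity | emotion | Sentiment expressed by the text
Text association network | Interactivity | interactivity | Personal-pronoun expressions used in the text
Text association network | Accuracy | accuracy | Topic probability of the text content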
Node Feature of Dual Network
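The structural features in the table above (degree, closeness, pagerank) can be illustrated on a toy graph. The following is a minimal sketch, not the authors' implementation: the four-node graph, the undirected edges, the damping factor of 0.85, and the function names are all assumptions made for illustration. Betweenness centrality is omitted to keep the sketch short.

```python
from collections import deque

# Hypothetical undirected user network (adjacency lists); not data from the paper.
adj = {
    "A": ["B", "C", "D"],
    "B": ["A"],
    "C": ["A", "D"],
    "D": ["A", "C"],
}

def degree_centrality(adj, v):
    """Share of the other nodes that v is directly connected to."""
    return len(adj[v]) / (len(adj) - 1)

def closeness_centrality(adj, source):
    """(reachable nodes) / (sum of shortest-path distances), via BFS."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for u in adj[v]:
            if u not in dist:
                dist[u] = dist[v] + 1
                queue.append(u)
    total = sum(dist.values())
    return (len(dist) - 1) / total if total else 0.0

def pagerank(adj, damping=0.85, iters=100):
    """Power iteration; assumes every node has at least one neighbor."""
    n = len(adj)
    rank = {v: 1.0 / n for v in adj}
    for _ in range(iters):
        new = {v: (1 - damping) / n for v in adj}
        for v, nbrs in adj.items():
            share = damping * rank[v] / len(nbrs)
            for u in nbrs:
                new[u] += share
        rank = new
    return rank

degree = {v: degree_centrality(adj, v) for v in adj}
closeness = {v: closeness_centrality(adj, v) for v in adj}
ranks = pagerank(adj)
```

In this toy graph, node A is connected to every other node, so it scores highest on all three measures. A real interaction network would likely be directed (for example, who replies to whom), which this sketch does not model.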
Early Recognition Model of Content Value Based on Dual Link Fusion
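The association network link described in the abstract relies on GAT, whose core step is an attention-weighted aggregation over a node's neighbors. The sketch below shows a single attention head in plain Python under stated assumptions: the node features, the weight matrix `W`, the attention vector `a`, and the node names are hypothetical, the LeakyReLU slope of 0.2 follows the common GAT formulation, and the final nonlinearity is omitted. It is not the configuration used in the paper.

```python
import math

def leaky_relu(x, slope=0.2):
    return x if x > 0 else slope * x

def linear(W, h):
    """Multiply weight matrix W (list of rows) by feature vector h."""
    return [sum(w * x for w, x in zip(row, h)) for row in W]

def attention_weights(features, neighbors, W, a, i):
    """Softmax-normalized attention of node i over its neighborhood."""
    z_i = linear(W, features[i])
    scores = {}
    for j in neighbors[i]:
        z_j = linear(W, features[j])
        # e_ij = LeakyReLU(a . [W h_i || W h_j])
        scores[j] = leaky_relu(sum(ak * zk for ak, zk in zip(a, z_i + z_j)))
    top = max(scores.values())  # subtract the max for numerical stability
    exps = {j: math.exp(s - top) for j, s in scores.items()}
    total = sum(exps.values())
    return {j: e / total for j, e in exps.items()}

def aggregate(features, neighbors, W, a, i):
    """New representation of node i: attention-weighted sum of W h_j."""
    alpha = attention_weights(features, neighbors, W, a, i)
    out = [0.0] * len(W)
    for j, weight in alpha.items():
        z_j = linear(W, features[j])
        out = [o + weight * z for o, z in zip(out, z_j)]
    return out

# Hypothetical toy inputs: three nodes with 2-dimensional features.
features = {"u1": [1.0, 0.0], "u2": [0.0, 1.0], "u3": [0.0, 1.0]}
neighbors = {"u1": ["u1", "u2", "u3"]}  # neighborhood includes a self-loop
W = [[1.0, 0.0], [0.0, 1.0]]            # identity projection for readability
a = [0.5, -0.5, 1.0, 0.25]              # attention vector over [z_i || z_j]

alpha = attention_weights(features, neighbors, W, a, "u1")
h_new = aggregate(features, neighbors, W, a, "u1")
```

Because `u2` and `u3` have identical features, they receive identical attention weights. A multi-head layer would repeat this computation with separate `W` and `a` per head and combine the results.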
Structure Diagram of BERT
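Symbol | Meaning | Symbol | Meaning
post_id | Post ID | post_them | Post topic
post_content | Post content | post_picture | Whether the post contains pictures
author_id | Poster ID | post_num | Number of posts
author_vip | Poster title | author_reputation | Poster reputation
listen_num | Number of accounts followed | fans_num | Number of fans
review_author_id | Commenter ID | review_content | Comment content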
Symbols and Meanings of Data
Changes of Topic Confusion, Consistency
Word Cloud
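Content | Comment from authoritative staff | Initial label | Optimized label | Final label
1. After the 17 Pro is updated to the F9 system, the screen flickers while the display is off; 2. Occasionally the fingerprint sensor area stays lit and fingerprints cannot be recognized, so a restart is needed…… | Reported; please wait for optimization. | 1 | 1 | 1
Could vibration be added to the side-swipe back gesture? I used a Xiaomi 11 for a while and its side-swipe vibration felt quite comfortable; after switching back to Meizu it took some getting used to. I hope this can be added later. | We will take this into consideration. | 1 | 1 | 1
Unknown-app installation check: as the title says, every installation checks whether the app has passed Meizu store verification. Can this be turned off? Personally I find it useless. | This is a security prompt; some users may need it. | 1 | 1 | 1
Anyone want a base-model Meizu 18? Bought in June, white, low price, any takers! | Friendly reminder: please be aware of the risks of online transactions. | 1 | 0 | 0
I think I need to replace my screen. Meizu 17. Can fellow Meizu fans recommend a shop? I need a new screen; after a drop the touch response cuts out a little. Happy New Year, Meizu fans. | We recommend choosing official parts. | 1 | 0 | 0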
Example of Labeling
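Parameter | Value | Parameter | Value
Learning rate (lr) | 1×10^-5 | Association network link learning rate (g_lr) | 0.0001
Training epochs (epochs) | 30 | Association network link epochs (g_epochs) | 1000
Training batch size (batch_size) | 64 | Number of heads in the graph attention layer (layer) | 8
Text sequence length (max_length) | 100 | GAT hidden dimension (hidden_dim) | 8
Character embedding dimension (in_dim) | 768 | Convolution kernel size (kernel_size) | 1
BiLSTM hidden dimension (hidden_dim) | 100 | Conv output dimension (out_channels) | 2
Linear output dimension (output_size) | 2 | Fusion model dropout rate (dropout) | 0.5
Optimizer (Optimizer) | Adam | L2 regularization parameter (Weight_decay) | 5×10^-4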
Parameter Settings
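For readers who want to reuse these settings, the values in the table above could be collected in typed configuration objects. This is only a sketch: the class names and the grouping of parameters by link are assumptions (the table does not state which component the optimizer or weight decay applies to), and the field names follow the symbols in the table.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class TextLinkConfig:
    """Text semantic link (BERT+BiLSTM+Linear); grouping is assumed."""
    lr: float = 1e-5
    epochs: int = 30
    batch_size: int = 64
    max_length: int = 100   # text sequence length
    in_dim: int = 768       # character embedding dimension
    hidden_dim: int = 100   # BiLSTM hidden dimension
    output_size: int = 2    # Linear output dimension
    optimizer: str = "Adam"

@dataclass(frozen=True)
class GraphLinkConfig:
    """Association network link (GAT); grouping is assumed."""
    g_lr: float = 1e-4
    g_epochs: int = 1000
    layer: int = 8          # number of attention heads, per the table
    hidden_dim: int = 8     # GAT hidden dimension

@dataclass(frozen=True)
class FusionConfig:
    """Convolutional fusion of the two links; grouping is assumed."""
    kernel_size: int = 1
    out_channels: int = 2
    dropout: float = 0.5
    weight_decay: float = 5e-4  # L2 regularization parameter

text_cfg = TextLinkConfig()
graph_cfg = GraphLinkConfig()
fusion_cfg = FusionConfig()
```

Frozen dataclasses keep the reported settings immutable, so an experiment cannot silently drift from the values in the table.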
Model | Acc/% | F1/% | P/% | R/%
Dual-link fusion model | 89.80 | 77.21 | 84.34 | 71.19
Text semantic link (BERT+BiLSTM+Linear) | 86.35 | 73.65 | 68.64 | 79.45
Association network link (GAT) | 86.60 | 72.50 | 78.35 | 67.46
Result of Dual-Link Fusion
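The reported figures can be cross-checked with a few lines of arithmetic. The sketch below recomputes F1 as the harmonic mean of precision and recall and derives the accuracy gains quoted in the abstract. The dictionary keys and the function name are illustrative, and the numbers are copied from the table above.

```python
def f1_from_pr(precision, recall):
    """F1 is the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# Values in percent, copied from the result table.
results = {
    "dual_link_fusion": {"acc": 89.80, "f1": 77.21, "p": 84.34, "r": 71.19},
    "text_semantic_link": {"acc": 86.35, "f1": 73.65, "p": 68.64, "r": 79.45},
    "association_network_link": {"acc": 86.60, "f1": 72.50, "p": 78.35, "r": 67.46},
}

# Recomputed F1 for each model, rounded to two decimals like the table.
recomputed_f1 = {
    name: round(f1_from_pr(row["p"], row["r"]), 2)
    for name, row in results.items()
}

# Accuracy gain of the fusion model over each single link, in percentage points.
fusion_acc = results["dual_link_fusion"]["acc"]
gain_over_text = round(fusion_acc - results["text_semantic_link"]["acc"], 2)
gain_over_graph = round(fusion_acc - results["association_network_link"]["acc"], 2)
```

The recomputed F1 values agree with the table to two decimals, and the gains come out to 3.45 and 3.20 percentage points, matching the abstract.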
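Single link | Model | Acc/% | F1/% | P/% | R/%
Text semantic link | BERT+BiLSTM+Linear | 86.35 | 73.65 | 68.64 | 79.45
Text semantic link | Embedding+BiLSTM | 83.96 | 53.01 | 89.43 | 37.67
Text semantic link | BERT+BiLSTM | 85.44 | 60.05 | 89.86 | 45.08
Text semantic link | BERT+CNN+Linear | 84.70 | 70.94 | 65.23 | 77.74
Association network link | Graph attention network | 86.60 | 72.50 | 78.35 | 67.46
Association network link | Convolutional neural network | 82.89 | 46.90 | 94.57 | 31.18
Association network link | Fully connected network | 32.99 | 33.45 | 21.97 | 70.04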
Result of Single Link