Current Issue, Volume 7 Issue 11
    Financial Public Opinion Risk Prediction Model Integrating Knowledge Association and Temporal Transmission
    Chen Haoran, Hong Liang
    2023, 7 (11): 1-13.  DOI: 10.11925/infotech.2096-3467.2022.0928

    [Objective] This paper studies financial news representations and the supply chain characteristics of particular companies. It then uses these representations and inter-company associations to improve the prediction of public opinion risks for a target company. [Methods] First, we embedded company association knowledge into financial news texts using an attention mechanism and Bi-LSTM to learn company-specific news representations. Second, we organized the financial news sequence into a news risk transmission network based on inter-company knowledge associations. Finally, we used a TGAT layer to model the temporal transmission patterns of risk information through inter-company associations and aggregated the risk information to predict the financial public opinion risk of the target company. [Results] The proposed method achieved an accuracy of 0.6246 and an AUC of 0.7021 in the financial public opinion risk prediction task, outperforming the baseline methods. [Limitations] The model only uses statistical knowledge associations between the stocks of listed companies and does not incorporate other types of inter-company knowledge associations. [Conclusions] The proposed model can effectively learn risk information relevant to the target company from financial news, as well as the temporal transmission characteristics of public opinion risk across inter-company associations, and it demonstrates good financial risk prediction performance.

    A Feature-Enhanced Multi-modal Emotion Recognition Model Integrating Knowledge and Res-ViT
    Yang Ruyun, Ma Jing
    2023, 7 (11): 14-25.  DOI: 10.11925/infotech.2096-3467.2022.1020

    [Objective] This paper aims to enhance the quality of multi-modal feature extraction and improve the accuracy of netizen sentiment recognition for multi-modal public opinion. [Methods] First, we extracted features of the text modality using RoBERTa and enhanced them with a knowledge phrase representation dictionary. Then, we proposed a Res-ViT model for the image modality, combining ResNet and Vision Transformer. Finally, we fused the multi-modal features with Transformer encoders and fed the representations to a fully connected layer for sentiment recognition. [Results] We evaluated our model on the MVSA-Multiple dataset and achieved an accuracy of 71.66% and an F1 score of 69.42% for sentiment recognition, which were 2.22% and 0.59% higher than the best scores of the baseline methods. [Limitations] More research is needed to examine the model with other datasets to verify its generalizability and robustness. [Conclusions] The proposed model could more effectively extract and fuse multi-modal features and improve the accuracy of sentiment recognition.

    Online Sensitive Text Classification Model Based on Heterogeneous Graph Convolutional Network
    Gao Haoxin, Sun Lijuan, Wu Jingchen, Gao Yutong, Wu Xu
    2023, 7 (11): 26-36.  DOI: 10.11925/infotech.2096-3467.2022.1250

    [Objective] This paper proposes a classification model for sensitive texts in online communities based on a graph neural network, in support of public opinion governance and information security. [Methods] First, we constructed a heterogeneous graph from the sensitive entities and words of the texts, which incorporated existing knowledge about sensitive information in online public opinion. Second, we adopted BERT and GCN to capture the high-level semantic information of the text and its global co-occurrence features. Third, we combined the complementary advantages of pre-trained and graph models to address the heterogeneity caused by structural differences between long and short texts. Finally, we classified sensitive texts based on the features of online public opinion. [Results] We examined the proposed model on a self-built sensitive text dataset of online public opinion. The accuracy of our method reached 70.80%, which was 3.52% higher than that of other models. [Limitations] Large heterogeneous graphs built on long texts reduce the computing speed. [Conclusions] The proposed model could effectively identify and classify sensitive content from different online texts.

    Analyzing Text Sentiments Based on Patch Attention and Involution
    Lin Zhe, Chen Pinghua
    2023, 7 (11): 37-45.  DOI: 10.11925/infotech.2096-3467.2022.0949

    [Objective] When the width of the convolution kernel equals the dimension of the word vector, the convolution layer has too many parameters. In addition, the sparse connectivity, spatial invariance, and channel specificity of convolution are ill-suited to text tasks. This paper addresses these issues. [Methods] We proposed a sentiment analysis model for texts based on a patch attention mechanism and involution. First, the model transformed each one-dimensional word vector obtained after word segmentation into an n×n word matrix block. Second, we spliced the word matrix blocks of the words in a sentence into a sentence matrix. Third, the patch attention layer enhanced the sentence matrix’s contextual relevance and the positional order information of the text features. Fourth, we used involution, which has spatial specificity and channel invariance, to extract features from the sentence matrix. Finally, we used a fully connected layer for text sentiment classification. [Results] We examined the proposed model with three public datasets: waimai_10k, IMDB, and Tweet. Its classification precision reached 88.47%, 86.22%, and 94.42%, respectively, which were 6.47%, 7.72%, and 9.35% higher than the word-vector convolutional network and 1.07%, 1.01%, and 0.59% higher than the Bi-LSTM recurrent neural network model. [Limitations] The classification accuracy of this model on large datasets is not as high as on small and medium-sized datasets. [Conclusions] The proposed model addresses the problems of excessive parameters, sparse connectivity, spatial invariance, and channel specificity in convolution, and it yields better performance than traditional convolution models.

    Sentiment Analysis of Micro-blog on Public Health Emergency with Prompt Embedding
    Lai Yubin, Chen Yan, Hu Xiaochun, Huang Xin
    2023, 7 (11): 46-55.  DOI: 10.11925/infotech.2096-3467.2022.0751

    [Objective] At the early stage of public health emergencies, the limited number of Weibo posts and their informal expressions make sentiment analysis ineffective. To address this issue, we propose a sentiment analysis model for Weibo posts based on prompt embedding and emotion feature fusion. [Methods] First, we extracted the sentiment information from Weibo posts based on an emotional dictionary. Then, we used the pre-trained RoBERTa model to establish semantic and sentiment vectors. We also embedded prompts as prefixes for the semantic vectors. Third, we utilized the Transformer encoder and attention mechanism to extract semantic and emotional features. We also computed the sample feature weights using the focal loss function. Finally, we combined the semantic and emotional features to conduct sentiment analysis. [Results] We examined the new model with Weibo comments on the outbreak of COVID-19 in Shenzhen. The accuracy and F1 score of the model reached 93.46% and 93.49%, which were 6.78% and 6.97% higher than those of the baseline BERT model. [Limitations] Weibo data contains a large number of images and videos. However, our model did not include multi-modal fusion for sentiment analysis. [Conclusions] The proposed model could improve the effectiveness of sentiment classification with a small sample size.
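The focal loss mentioned in the methods can be illustrated with a minimal sketch. The abstract does not state the parameter values or how the loss was wired into the model, so the function name `focal_loss` and the defaults `gamma=2.0` and `alpha=1.0` below are assumptions chosen for illustration only.

```python
import math

def focal_loss(probs, target, gamma=2.0, alpha=1.0):
    """Focal loss for a single sample (illustrative sketch).

    probs  -- predicted class probabilities
    target -- index of the true class
    gamma  -- focusing parameter (assumed value); gamma = 0 gives cross-entropy
    alpha  -- class-balancing weight (assumed value)
    """
    # Clamp the true-class probability to avoid log(0).
    p_t = min(max(probs[target], 1e-12), 1.0)
    # Well-classified samples (p_t near 1) are down-weighted by (1 - p_t)^gamma.
    return -alpha * (1.0 - p_t) ** gamma * math.log(p_t)

# A confidently correct sample contributes far less than a misclassified one.
easy = focal_loss([0.05, 0.95], target=1)
hard = focal_loss([0.60, 0.40], target=1)
```

With `gamma = 0` the expression reduces to ordinary cross-entropy, which makes the down-weighting of easy samples easy to compare against.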

    Rumor Detection of Public Health Emergencies Based on Data Augmentation and Multi-Task Learning
    Zeng Ziming, Zhang Yu
    2023, 7 (11): 56-67.  DOI: 10.11925/infotech.2096-3467.2022.1012

    [Objective] This paper proposes a new model with data augmentation and multi-task learning, aiming to address the issues of unbalanced data and insufficient labeled data in rumor detection during public health emergencies. [Methods] First, we extracted the text features of public health emergency rumors to construct a replacement word list. Then, we developed the CEDA method based on the extended synonym table to augment the unbalanced rumor dataset. Third, we built a multi-task learning model to integrate the domain information of public health emergency sentiment classification and rumor detection. Fourth, we obtained the shared features with Transformer and retrieved the unique features of the rumor detection task using the BiLSTM model. Together, these steps improved the accuracy of rumor detection. [Results] The F1 value of the proposed model was 0.972, which was 0.006 and 0.007 higher than those of the model based on the unbalanced dataset and the single-task learning model, respectively. Compared with the DC-CNN model, the F1 value increased by 0.024. [Limitations] The multi-task learning model only includes binary classification of sentiments, and more fine-grained negative sentiment classification is required. [Conclusions] The proposed method can effectively classify public health emergency rumors.
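Synonym-table augmentation of the kind described above can be sketched as follows. This is not the CEDA method itself, whose details the abstract does not give; it is a generic random synonym replacement, and the function name `synonym_replace` and the toy synonym table are hypothetical.

```python
import random

def synonym_replace(tokens, synonyms, n_replace=1, seed=None):
    """Return a copy of tokens with up to n_replace words swapped for synonyms.

    tokens   -- list of words in one sample
    synonyms -- dict mapping a word to a list of replacement words (hypothetical table)
    """
    rng = random.Random(seed)
    # Only positions whose word appears in the synonym table can be replaced.
    candidates = [i for i, tok in enumerate(tokens) if tok in synonyms]
    rng.shuffle(candidates)
    augmented = list(tokens)
    for i in candidates[:n_replace]:
        augmented[i] = rng.choice(synonyms[tokens[i]])
    return augmented

# Toy table and sentence, invented for illustration.
table = {"virus": ["pathogen"], "spreads": ["transmits"]}
sample = ["the", "virus", "spreads", "fast"]
augmented = synonym_replace(sample, table, n_replace=2, seed=0)
```

Generating several augmented copies of each minority-class sample is one simple way to rebalance a dataset before training.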

    Detecting Social Media Rumors Based on Multimodal Heterogeneous Graph
    Qiang Zishan, Gu Yijun
    2023, 7 (11): 68-78.  DOI: 10.11925/infotech.2096-3467.2022.0905

    [Objective] This paper proposes a social media rumor detection model based on a multimodal heterogeneous graph, aiming to verify the correlation between different rumor modalities and improve the accuracy of rumor detection. [Methods] First, we retrieved multimodal posts from social platforms. Then, we extracted feature representations of texts, pictures, and user attributes through preprocessing. Third, we constructed a heterogeneous graph based on the correlation between texts, pictures, and users. Fourth, we extracted the embeddings of text-type nodes according to their specified meta paths. Finally, we fed the embeddings into a classifier to determine whether each post was a rumor. [Results] We examined the proposed model with two open datasets. The accuracy of our model reached 91.3% and 93.8%, both higher than those of the baseline models. [Limitations] The three types of nodes shared among multimodal rumors can make the heterogeneous graph sparse, so the proposed model is more suitable for small topic communities. [Conclusions] There is a correlation between different modalities of rumors, which helps the proposed model effectively detect multimodal rumors.

    Identifying Fake News with External Knowledge and User Interaction Features
    Liu Shuai, Fu Lifang
    2023, 7 (11): 79-87.  DOI: 10.11925/infotech.2096-3467.2022.1144

    [Objective] This paper proposes a multidimensional-data classification model to improve the efficiency of fake news detection. The new model incorporates external knowledge features and user interaction features to reduce the spread of fake news on social media. [Methods] First, we extracted the background knowledge of fake news. Then, we introduced external knowledge through the Wikipedia knowledge graph to check the consistency between the news content and the existing knowledge system. Third, we analyzed user interaction along the communication chain according to the psychological “similarity effect”. Finally, we improved the edge weights of the graph convolutional network to reflect the interaction between users. [Results] We examined the new model’s performance with two public datasets, Twitter15 and Twitter16. In a comparison with five similar models, our model’s accuracy reached 0.901 and 0.927, respectively. [Limitations] We did not consider features like knowledge information and language expression hidden in the additional news content. The model’s interpretability needs to be further improved. [Conclusions] By integrating news content, external knowledge, and user interaction characteristics of the communication chain, the proposed model can effectively detect fake news.

    Identifying Technical Opportunities by Semantic Analysis of Expired Patents and Multi-Dimensional Tech-Innovation Map
    Wang Jinfeng, Wu Xuan, Zhang Dingtang, Feng Lijie, Zhang Ke
    2023, 7 (11): 88-100.  DOI: 10.11925/infotech.2096-3467.2023.0119

    [Objective] This paper aims to identify technical opportunities from expired patents. [Methods] First, we used SAO semantic analysis to determine the problems to be solved in technology fields. Then, we constructed an SAO knowledge base with the core expired patents. Finally, we identified technology opportunities based on a multi-dimensional tech-innovation map. [Results] We examined the proposed model with patents for coal mine dust removal. The three technology opportunities identified by our method could provide decision-making support for enterprises. [Limitations] More research is needed to study patents from other fields and compare our model’s performance with the existing ones. [Conclusions] The proposed model could improve the accuracy and application of technology identification from patents.

    Early Recognition of User-Generated Content Value with Text Semantics and Associative Network Dual-Link Fusion
    Wang Song, Luo Ying, Liu Xinmin
    2023, 7 (11): 101-113.  DOI: 10.11925/infotech.2096-3467.2022.0993

    [Objective] This paper proposes a feature system and a new model to improve the efficiency of early recognition, aiming to address the issues of time delay and overload in recognizing valuable content from virtual communities. [Methods] We constructed a dual-link fusion algorithm with the text semantics of user-generated content and the network structure of explicit and implicit interaction between users and texts. In the text semantic link, we used BERT+BiLSTM+Linear to obtain the deep semantic features. In the association network link, we adopted GAT to process the shallow numerical characteristics and association characteristics of the nodes. Finally, we utilized a convolution layer to optimize the fused information of the two links and achieved early value recognition. [Results] The dual-link fusion model had an accuracy of 89.80% on data from the Meizu Flyme community, which was 3.45% and 3.20% higher than that of the single text semantic link and the association network link, respectively. Compared with other baseline models, the accuracy and F1 values were also improved. [Limitations] The generalization ability of the model needs to be further improved, and rich text content (i.e., pictures and external links) remains to be analyzed. [Conclusions] The deep learning fusion model improves the accuracy of early recognition of valuable texts by processing sequential text semantics and topological network structure.

    Community Detection Algorithm Based on Node and Edge Analysis
    Gao Guangliang, Li Yazhou, Yuan Ming, Wang Qun
    2023, 7 (11): 114-124.  DOI: 10.11925/infotech.2096-3467.2022.1033

    [Objective] This paper analyzes the importance of network nodes and edges, aiming to improve the performance of community detection algorithms based on objective function optimization. [Methods] First, we measured the importance of nodes based on the triangular structure and constructed a core network by deleting some nodes. Second, we measured the importance of edges based on the triangular structure. Then, we optimized the algorithm with the weighted modularity metric from a local perspective to detect communities in the core network. Finally, we extended these communities to obtain the actual community structure of the original network. [Results] We examined the proposed algorithm on a series of synthetic networks and four real-world network datasets. Our new algorithm’s F1 value was 19.85% higher than that of the baseline models. It yielded better results on dense networks. [Limitations] The proposed algorithm needs a user-specified parameter. [Conclusions] The proposed algorithm could effectively identify both non-overlapping and overlapping network communities.
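One simple triangle-based importance measure is the number of triangles a node participates in. The sketch below uses that count to keep a "core" set of nodes; the abstract does not specify the authors' exact measure or deletion rule, so `triangle_counts`, `core_nodes`, and the `min_triangles` threshold are illustrative assumptions.

```python
from itertools import combinations

def triangle_counts(edges):
    """Count, for each node of an undirected graph, the triangles it belongs to."""
    adj = {}
    for u, v in edges:
        if u == v:
            continue  # ignore self-loops
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    counts = {}
    for node, nbrs in adj.items():
        # Each pair of neighbours that is itself connected closes one triangle.
        counts[node] = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
    return counts

def core_nodes(edges, min_triangles=1):
    """Keep nodes in at least min_triangles triangles (hypothetical threshold)."""
    return {n for n, t in triangle_counts(edges).items() if t >= min_triangles}

# Toy graph: a triangle a-b-c with a pendant node d attached to c.
toy_edges = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")]
counts = triangle_counts(toy_edges)
core = core_nodes(toy_edges)
```

In the toy graph, the pendant node `d` belongs to no triangle and is dropped from the core, while the three triangle nodes are kept.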

    Evolution of Users' Knowledge Sharing and Hiding Behaviors in Online Health Community
    Huang Zixuan, Xiong Huixiang
    2023, 7 (11): 125-139.  DOI: 10.11925/infotech.2096-3467.2022.1030

    [Objective] This paper studies the decision-making rules for knowledge sharing and hiding among users of online health communities, with the goal of improving users’ overall health knowledge levels. [Methods] First, we constructed an evolutionary game model for the decision-making mechanism behind lurkers’ and sharers’ knowledge-sharing and hiding behaviors. Then, we retrieved data on breast cancer topics from the Zhihu platform to assign model parameters. Finally, we conducted numerical experiments with Matlab to explore the impacts of parameter changes. [Results] The transformation of users from knowledge hiding to sharing was positively affected by the benefits of knowledge innovation, emotion, and community rewards. Privacy risks and coding costs had negative impacts on user behaviors. The two user groups had different sensitivity levels to the factors. [Limitations] We did not set a nonlinear utility function, and manually labeling data may introduce errors. [Conclusions] This paper could help online health communities transform users from knowledge hiding to sharing.
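The general shape of a two-group evolutionary game can be sketched with replicator dynamics. The payoff structure below is a deliberately simplified assumption and not the authors' model: `benefit` lumps together innovation, emotional, and reward gains, `cost` lumps together privacy risk and coding cost, `synergy` is a hypothetical extra gain from the other group's sharing, and all numeric values are invented.

```python
def simulate(x0, y0, benefit, synergy, cost, steps=2000, dt=0.01):
    """Euler integration of a toy two-population replicator system.

    x, y -- fractions of lurkers and sharers who choose to share knowledge.
    The payoff of hiding is normalised to zero (an assumption of this sketch).
    """
    x, y = x0, y0
    for _ in range(steps):
        gain_x = benefit + synergy * y - cost  # net payoff of sharing for group 1
        gain_y = benefit + synergy * x - cost  # net payoff of sharing for group 2
        x_new = x + dt * x * (1.0 - x) * gain_x
        y_new = y + dt * y * (1.0 - y) * gain_y
        # Keep the fractions inside [0, 1] despite discretisation error.
        x = min(max(x_new, 0.0), 1.0)
        y = min(max(y_new, 0.0), 1.0)
    return x, y

# High benefits relative to costs drive both groups toward sharing ...
share_x, share_y = simulate(0.2, 0.2, benefit=1.0, synergy=0.5, cost=0.5)
# ... while high costs drive both groups toward hiding.
hide_x, hide_y = simulate(0.8, 0.8, benefit=0.1, synergy=0.2, cost=1.0)
```

Varying one parameter at a time and re-running the simulation is the kind of numerical experiment the abstract describes, here in Python rather than Matlab.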

    Analyzing Researchers’ Interdisciplinarity and Academic Impacts
    Zhai Yujia, Zhou Rui, Li Yan, Mao Zhigang
    2023, 7 (11): 140-157.  DOI: 10.11925/infotech.2096-3467.2022.1167

    [Objective] This study explores the relationship between interdisciplinarity and individual academic impacts, aiming to promote the development of interdisciplinary research. [Methods] First, we retrieved 69,759 researchers from the Semantic Scholar database. Then, we used the Brillouin index to analyze their interdisciplinarity based on citation, publication, and collaboration. Finally, we applied the generalized propensity score matching method to examine the causal effect of interdisciplinarity on individual academic influence. [Results] For interdisciplinary citations, the publication number and h-index of researchers increased with the rise of multi-disciplinary citations and then declined after passing the critical points (1.5 and 0.05). However, interdisciplinary citation did not affect the average citation number per paper. For interdisciplinary publications, as researchers published across more disciplines, the publication number and h-index showed an upward trend, while the average citation per paper showed an oscillatory increase. For interdisciplinary collaboration, the publication number of researchers steadily increased with the growth of interdisciplinary collaboration, albeit at a gradually diminishing rate. However, interdisciplinary collaboration did not influence the h-index or average citation per paper. [Limitations] The study did not weight each dimension in the measurement indicators or establish an evaluative framework for a comprehensive interdisciplinary index encompassing all three dimensions. [Conclusions] Engaging in interdisciplinary research can conditionally enhance the academic impact of researchers, while different evaluation dimensions yield different results.
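The Brillouin index used above is a diversity measure defined as H = (ln N! − Σ ln n_i!) / N, where n_i is the number of items in category i and N is their total. A minimal sketch follows; the function name and the toy discipline counts are illustrative and are not data from the study.

```python
import math

def brillouin_index(counts):
    """Brillouin diversity index H = (ln N! - sum(ln n_i!)) / N.

    counts -- number of items (e.g. a researcher's papers) in each discipline
    """
    counts = [c for c in counts if c > 0]
    total = sum(counts)
    if total == 0:
        return 0.0
    # math.lgamma(n + 1) equals ln(n!) and avoids huge factorials.
    log_total_fact = math.lgamma(total + 1)
    log_parts = sum(math.lgamma(c + 1) for c in counts)
    return (log_total_fact - log_parts) / total

# Toy profiles: ten papers in one field, split evenly, or split unevenly.
single = brillouin_index([10])
balanced = brillouin_index([5, 5])
skewed = brillouin_index([9, 1])
```

A researcher whose output sits in a single discipline scores zero, and an even spread across disciplines scores higher than a skewed one.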

    Identifying and Extracting Figures and Tables from Academic Literature Based on YOLOv5-ECA-BiFPN
    Li Yingqun, Li Yafei, Pei Lei, Hu Zhiwei, Song Ningyuan
    2023, 7 (11): 158-171.  DOI: 10.11925/infotech.2096-3467.2022.1026

    [Objective] This paper aims to accurately identify and extract figures and tables from academic literature, which promotes the dissemination of academic achievements. [Methods] First, we introduced the ECA channel attention module into the YOLOv5 algorithm and replaced the PAN module with BiFPN. Then, we randomly chose 1,300 scholarly articles from thirteen subjects as experimental data and converted them to high-quality images using poppler-0.68.0. Finally, we examined the performance of the new algorithm on this dataset. [Results] Compared with the second-best algorithm, the F1 value of the new model improved by 1.99% to 99.88% on the dataset. [Limitations] The scope and quantity of data annotation need to be expanded to more scenarios. [Conclusions] YOLOv5-ECA-BiFPN can effectively improve the recognition of figures and tables from academic journals.

  Copyright © 2016 Data Analysis and Knowledge Discovery   Tel/Fax:(010)82626611-6626,82624938   E-mail:jishu@mail.las.ac.cn