
Online first

The manuscripts published below will continue to be available from this page until they are assigned to an issue.
  • Yao Yuanzhang, Xu Jian
    Data Analysis and Knowledge Discovery. https://doi.org/10.11925/infotech.2096-3467.2025.0011
    Online available: 2025-07-04

    [Objective] This study analyzes the phenomenon of semantic differences of interdisciplinary terms across fields and explores the underlying causes of these variations. [Methods] We utilize pre-trained deep learning models to automatically identify and quantify semantic differences in terms. A semantic difference degree indicator is designed to quantitatively measure the extent of these differences, and a co-occurrence analysis is conducted for the disciplines involved in the terms. [Results] The identification accuracy of semantic differences based on the pre-trained model reaches 0.8193, and the constructed indicator effectively quantifies the semantic differences. [Limitations] The study is limited to the semantic differences of Chinese terminology, and the interdisciplinary range of the selected terms is restricted. [Conclusions] The main causes of semantic differences in interdisciplinary terms are identified as: specialization and fragmentation of disciplines, linguistic and contextual differences, hierarchical and abstract conceptualization, differences in cognitive emphasis, and the influence of interdisciplinary intersection and integration. This provides new perspectives and methodologies for exploring the reasons behind terminological discrepancies and their relationships with disciplines.
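
    A minimal sketch of the quantification idea, assuming the semantic difference degree of a term can be approximated as one minus the cosine similarity between its mean context embeddings in two fields. The model name, toy sentences, and this exact definition are illustrative assumptions, not the paper's formulation.

        # Hypothetical sketch: quantify how differently a term is used in two disciplines
        # by embedding example sentences from each field and comparing the mean vectors.
        import numpy as np
        from sentence_transformers import SentenceTransformer

        model = SentenceTransformer("paraphrase-multilingual-MiniLM-L12-v2")

        def semantic_difference(sentences_a, sentences_b):
            """1 - cosine similarity between the term's mean context embeddings in two fields."""
            emb_a = model.encode(sentences_a).mean(axis=0)
            emb_b = model.encode(sentences_b).mean(axis=0)
            cos = float(np.dot(emb_a, emb_b) / (np.linalg.norm(emb_a) * np.linalg.norm(emb_b)))
            return 1.0 - cos

        # Example: the term "entropy" in physics vs. information science contexts (toy data).
        physics = ["The entropy of the gas increases as the system approaches equilibrium."]
        info_sci = ["The entropy of the message source bounds the achievable compression rate."]
        print(round(semantic_difference(physics, info_sci), 4))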

  • Deng Hangyu, Tang Chuan, Pu Yunqiang, Ao Lijuan, Wang Wanjing
    Data Analysis and Knowledge Discovery. https://doi.org/10.11925/infotech.2096-3467.2024.0993
    Online available: 2025-07-04

    [Objective] Given the large volume, broad scope, and frequent colloquial expressions of U.S. Congressional hearing transcripts, this paper proposes a framework for automatically identifying China's science and technology security risks. [Methods] Starting from the data features of the hearings and the practical needs of analysts, this study implements and integrates modules for text filtering, summary generation, and question answering by utilizing large language models. [Results] Using the 118th Congress hearings as experimental texts, the F1 score for text filtering, the ROUGE-Lsum for summary generation, and the risk-point recall rate of the QA system reached 0.7751, 0.6032, and 0.7636 respectively, significantly outperforming the baselines. [Limitations] The method is primarily designed for U.S. Congressional hearing transcripts and needs further validation on more data types before it can be considered a general approach. [Conclusions] The proposed method can help researchers extract technological security risks from U.S. Congressional sources and prepare corresponding strategies.
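
    A rough sketch of how a filter, summarize, and question-answering pipeline over hearing transcripts might be wired together. The ask_llm callable stands in for any chat-completion call; the prompts, labels, and module wiring are illustrative assumptions rather than the paper's exact design.

        from typing import Callable, Optional

        def build_pipeline(ask_llm: Callable[[str], str]):
            def filter_relevant(transcript: str) -> bool:
                answer = ask_llm(
                    "Does the following hearing excerpt discuss science and technology "
                    f"security risks related to China? Answer YES or NO.\n\n{transcript}"
                )
                return answer.strip().upper().startswith("YES")

            def summarize(transcript: str) -> str:
                return ask_llm(f"Summarize the key technology-security points:\n\n{transcript}")

            def answer(question: str, summary: str) -> str:
                return ask_llm(f"Context:\n{summary}\n\nQuestion: {question}\nAnswer concisely.")

            def run(transcript: str, question: str) -> Optional[str]:
                # skip transcripts the filter judges irrelevant, then QA over the summary
                if not filter_relevant(transcript):
                    return None
                return answer(question, summarize(transcript))

            return run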

  • Zhang Shuangbao, Cheng Quan, Zeng Yan
    Data Analysis and Knowledge Discovery. https://doi.org/10.11925/infotech.2096-3467.2024.1126
    Online available: 2025-07-04

    [Objective] Semantic association information between Chinese texts must be exploited to improve the extraction of unstructured events from text. [Methods] This study proposes a Chinese document-level event extraction model (CSDEE) that uses an attention mechanism to construct a cross-document interactive semantic network to enhance entity recognition, and then completes the event extraction task through document encoding and event-extraction information decoding. [Results] The CSDEE model attains 80.7% precision, 84.1% recall, and an 82.3% F1 score in event extraction, outperforming existing baseline models. Ablation experiments on the model and generalization experiments on the public datasets ChFinAnn and DuEE-fin further substantiate its efficacy in Chinese document-level event extraction. [Limitations] At present, the model only improves document-level event extraction and does not yet address multi-class classification of overlapping event types. [Conclusions] A thorough exploration of the parallel semantic information in document-level data can enhance the precision of document event extraction.

  • Xie Wei, Xia Hongbin, Liu Yuan
    Data Analysis and Knowledge Discovery. https://doi.org/10.11925/infotech.2096-3467.2024.1132
    Online available: 2025-07-04

    [Objective] This study aims to utilize deep learning methods to address the current issue of insufficient utilization of complete entity and relation interaction information in zero-shot relation extraction tasks. [Methods] We propose a Joint Contrastive Learning model (JCL) for zero-shot relation extraction, which integrates entity and relation information based on contrastive learning. Firstly, data augmentation techniques are applied to the original input text to enhance the model's effective information. Secondly, an enhanced cross-attention module is used to deeply integrate entity pairs and jointly process relations, extracting interaction information between entities as well as between entities and relational semantics, thereby amplifying the subtle differences of various relations in the embedding space. Finally, the model is optimized using a combination of cross-entropy loss and contrastive loss. [Results] Compared with the baseline model, the proposed approach achieves improvements on the FewRel dataset with unseen relations: an F1 score increase of 3.12% for m=5, 5.19% for m=10, and 1.99% for m=15. On the Wiki-ZSL dataset, improvements are 7.05% for m=5, 3.42% for m=10, and 8.08% for m=15. [Limitations] The study is limited by the relatively homogeneous and small number of datasets used in this research field. [Conclusions] The proposed Joint Contrastive Learning model for zero-shot relation extraction demonstrates advanced performance on three public datasets, showcasing its efficacy for this specific task.
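
    An illustrative joint objective in the spirit of the final optimization step above: cross-entropy on relation labels plus a supervised contrastive (InfoNCE-style) term that pulls same-relation embeddings together. The temperature, weighting, and encoder are placeholder assumptions, not JCL's actual configuration.

        import torch
        import torch.nn.functional as F

        def joint_loss(embeddings, logits, labels, temperature=0.1, alpha=0.5):
            ce = F.cross_entropy(logits, labels)

            z = F.normalize(embeddings, dim=1)                 # (batch, dim)
            sim = z @ z.T / temperature                        # pairwise similarities
            mask = labels.unsqueeze(0) == labels.unsqueeze(1)  # same-relation pairs
            mask.fill_diagonal_(False)
            logits_mask = torch.ones_like(sim, dtype=torch.bool)
            logits_mask.fill_diagonal_(False)

            # log-probability of each pair against all non-self pairs
            log_prob = sim - torch.logsumexp(sim.masked_fill(~logits_mask, -1e9), dim=1, keepdim=True)
            pos_counts = mask.sum(dim=1).clamp(min=1)
            contrastive = -(log_prob * mask).sum(dim=1) / pos_counts
            return ce + alpha * contrastive.mean()

        # Toy usage with random tensors
        emb, logit = torch.randn(8, 64), torch.randn(8, 5)
        y = torch.randint(0, 5, (8,))
        print(joint_loss(emb, logit, y).item())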

  • Shengli Zhou, Rui Xu, Tinggui Chen, Shaojie Wang
    Data Analysis and Knowledge Discovery. https://doi.org/10.11925/infotech.2096-3467.2024.1138
    Online available: 2025-07-04

    [Objective] To address the insufficient characterization of multimodal features in the AI face-swapping fraud process, this study establishes a face-swapping fraud risk identification model (FSFRI) that synergistically integrates multimodal features to optimize victimization risk assessment. [Methods] By comprehensively considering the generation and propagation processes of AI face-swapping fraud, FSFRI extracts four types of features: fake face video frames, traffic composition description features, traffic payload data features, and traffic temporal features. A feature fusion module achieves complementary integration of the cross-modal features, and a risk identification module then detects and identifies deception risks. [Results] On a dataset generated through simulation experiments, FSFRI achieved good identification performance, with an F1 score of 0.92. It also demonstrated strong robustness in low-noise environments (noise levels from 0 to 0.2), with the F1 score decreasing by only 0.019 at a noise ratio of 0.2. [Limitations] The use of multimodal features increases the complexity of FSFRI, so the model places higher demands on computational resources, and its risk identification effectiveness in high-noise environments remains to be improved. [Conclusions] FSFRI can effectively extract and integrate the multimodal features generated in the AI face-swapping fraud process and precisely identify face-swapping fraud victimization risks.

  • Ma Yingxue, Gan Mingxin, Hu Lei
    Data Analysis and Knowledge Discovery. https://doi.org/10.11925/infotech.2096-3467.2024.1235
    Online available: 2025-07-04

    [Objective] To address the issue that deep learning recommendation methods lack modeling of user interest distribution characteristics and cannot fully capture user preferences, a sequential recommendation method based on modeling the aggregative and hierarchical distribution characteristics of user interests is proposed. [Methods] Using an attention network and LSTM, representation vectors of users and items are obtained from behavioral sequences, and the positional centers and boundary radii of user interest distributions are learned. The hierarchy and aggregation of the interest distribution are characterized by two radii. User preferences are predicted by fitting the distance between candidate item features and the center of the user interest distribution to an interaction probability. Recommendations are generated by fusing behavior predictions from neural networks with preference estimates from the interest model. [Results] Experimental results on Amazon datasets demonstrate that, compared to the best-performing baseline, the proposed method achieves optimal performance in terms of precision, recall, F-score, coverage, and other evaluation metrics, with improvements exceeding 10 percentage points. [Limitations] User-generated content beyond the behavior sequence is not considered; future work can improve interest modeling by integrating user comments and other information. [Conclusions] The method accurately describes the distribution characteristics of user interests, improves recommendation accuracy, and optimizes the overall quality of the recommendation results.
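
    A toy illustration of the preference-estimation idea: a candidate item's interaction probability is derived from its distance to the learned center of the user's interest distribution, with two radii marking core and peripheral interests. The names and the exact distance-to-probability mapping are assumptions for illustration only.

        import numpy as np

        def interaction_probability(item_vec, centre, core_radius, boundary_radius):
            """Map distance to the interest centre into a probability in [0, 1]."""
            d = np.linalg.norm(item_vec - centre)
            if d <= core_radius:          # inside the aggregated core interest
                return 1.0
            if d >= boundary_radius:      # beyond the user's interest boundary
                return 0.0
            # linear decay between the two radii (the paper may learn a different mapping)
            return 1.0 - (d - core_radius) / (boundary_radius - core_radius)

        centre = np.zeros(8)
        item = np.full(8, 0.3)
        print(round(interaction_probability(item, centre, core_radius=0.5, boundary_radius=1.5), 3))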

  • Sun Mengge, Wang Yanpeng, Fu Yun, Liu Xiwen
    Data Analysis and Knowledge Discovery. https://doi.org/10.11925/infotech.2096-3467.2024.1192
    Online available: 2025-07-04

    [Objective] This study explores prompt engineering methods for large language models in multi-domain scientific knowledge entity extraction, using scientific short texts as experimental data, to address the challenges posed by insufficient semantic context and domain diversity in short-text entity extraction. [Methods] To tackle wide domain coverage, condensed semantics with insufficient contextual information, and ambiguous entity boundaries in short texts, this study proposes a Scientific Prompt entity extraction strategy grounded in knowledge-prompt learning. By integrating the BERTopic method, the strategy dynamically incorporates domain knowledge into prompt design to enhance the semantic understanding and recognition capabilities of large language models, thereby improving extraction accuracy and generalization. [Results] Under the Scientific Prompt strategy, the F1 scores of the QWEN2.5-7B, QWEN2.5-7B (fine-tuned), and GPT-4o models are 0.6526, 0.7407, and 0.7878, respectively, compared with zero-shot F1 scores of 0.5534, 0.6165, and 0.6822 for the same models. The results indicate that the Scientific Prompt strategy outperforms fine-tuning alone for the open-source model (0.6526 vs. 0.6165), and that the fine-tuned QWEN2.5-7B model under the prompt strategy slightly surpasses GPT-4o's zero-shot performance (0.7407 vs. 0.6822). [Limitations] The proposed strategy is evaluated only on Chinese scientific intelligence short texts; its applicability to English texts remains untested. [Conclusions] The experiments demonstrate that the Scientific Prompt strategy can significantly enhance the performance of large language models in short-text, multi-domain entity extraction tasks without requiring parameter updates. Its effectiveness on unsupervised scientific short texts is also validated, enabling accurate extraction of scientific entities to monitor technological trends. This research provides a useful reference for knowledge entity extraction in general scientific short-text tasks.
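
    A sketch of the knowledge-prompt idea: topic keywords produced beforehand (e.g. by BERTopic on a scientific corpus) are injected into the extraction prompt as domain cues. The keyword source, prompt wording, and output format below are assumptions, not the paper's Scientific Prompt template.

        DOMAIN_TOPICS = {
            "materials": ["graphene", "electrode", "supercapacitor", "energy density"],
            "bioinformatics": ["protein folding", "transformer", "amino acid", "sequence"],
        }

        def build_scientific_prompt(text: str, domain: str) -> str:
            # prepend topic keywords so the LLM sees domain context for the short text
            cues = ", ".join(DOMAIN_TOPICS.get(domain, []))
            return (
                f"Domain cues: {cues}\n"
                "Task: extract all scientific knowledge entities (methods, materials, "
                "indicators, objects) from the text below and return a JSON list.\n"
                f"Text: {text}"
            )

        print(build_scientific_prompt(
            "Graphene-based electrodes improve supercapacitor energy density.", "materials"))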

  • Zhang Xiaojuan, Ji Ruyi
    Data Analysis and Knowledge Discovery. https://doi.org/10.11925/infotech.2096-3467.2025.0018
    Online available: 2025-07-04

    [Objective] This paper proposes a global citation recommendation framework based on both static and dynamic heterogeneous graphs, aiming to enhance the accuracy of citation recommendations. [Methods] The paper first constructs a static weighted heterogeneous network and a temporal heterogeneous network separately. For the static heterogeneous network, mixed random walks and the skip-gram model are used to generate node embeddings that capture local and global network information. For the temporal network, meta-path instances are first generated via meta-path-based random walks, and the temporal evolution process in the heterogeneous graph is then modelled to generate node embeddings. The final embeddings of paper nodes are produced using joint and separate training methods, and candidate citation lists are generated for input papers by computing the similarity between these final embeddings. [Results] Experimental results show that the proposed methods outperform those that consider only the dynamic or the static information of the network; the independent training method performs best on almost all recall metrics (except Recall@40); and the uncertainty-based multi-task weighting method achieves the best MRR and MAP, with values of 0.308 and 0.297. [Limitations] The performance of the proposed model has not been verified across multiple datasets, and the running efficiency of the model still needs to be optimized. [Conclusions] Considering both the static and dynamic aspects of the network can effectively enhance the performance of global citation recommendation.
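
    A toy sketch of the final recommendation step described above: rank candidate papers by cosine similarity between the query paper's embedding and the candidates' embeddings. Random vectors stand in for the heterogeneous-graph node embeddings; identifiers are hypothetical.

        import numpy as np

        rng = np.random.default_rng(0)
        paper_ids = [f"P{i}" for i in range(100)]
        embeddings = {pid: rng.normal(size=64) for pid in paper_ids}   # learned node vectors

        def recommend_citations(query_id: str, k: int = 10) -> list[str]:
            q = embeddings[query_id]
            q = q / np.linalg.norm(q)
            scores = {
                pid: float(np.dot(q, v / np.linalg.norm(v)))
                for pid, v in embeddings.items() if pid != query_id
            }
            return sorted(scores, key=scores.get, reverse=True)[:k]

        print(recommend_citations("P0", k=5))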

  • Xu Jianmin, Wang Li, Zhang Xiongtao
    Data Analysis and Knowledge Discovery. https://doi.org/10.11925/infotech.2096-3467.2024.0730
    Online available: 2025-07-04

    [Objective] Existing social sequential recommendation research tends to introduce friend information that is dissimilar to the user's interests and fails to consider differences in the degree of social influence across users, resulting in limited recommendation performance. To address these shortcomings, an adaptive social sequential recommendation method based on a graph attention network is proposed. [Methods] First, a self-attention mechanism is used to model the user behavior sequence and obtain the user's dynamic interest representation. Second, a regularization strategy is designed to constrain how the graph attention network aggregates friend features, so that the user's social interest representation is modeled accurately. Finally, an attention-based adaptive fusion method is proposed to integrate dynamic interests and social interests and generate recommendation results. [Results] Compared with mainstream baseline models, the proposed method achieves improvements of up to 10.8% on HR@10 and 5.3% on NDCG@10. [Limitations] The proposed method depends heavily on the structure of the social network, and its performance gain is not significant when social relationship data is sparse. [Conclusions] The proposed method utilizes social information more comprehensively, predicts user behavior effectively, and improves recommendation performance.
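
    A minimal sketch of the attention-based adaptive fusion step: the dynamic-interest and social-interest vectors are scored and softly weighted before being combined into one user representation. The gating network and dimensions are illustrative assumptions.

        import torch
        import torch.nn as nn

        class AdaptiveFusion(nn.Module):
            def __init__(self, dim: int):
                super().__init__()
                self.score = nn.Linear(dim, 1)   # scores each interest view

            def forward(self, dynamic: torch.Tensor, social: torch.Tensor) -> torch.Tensor:
                # stack the two views: (batch, 2, dim)
                views = torch.stack([dynamic, social], dim=1)
                weights = torch.softmax(self.score(views), dim=1)   # (batch, 2, 1)
                return (weights * views).sum(dim=1)                  # fused user representation

        fusion = AdaptiveFusion(dim=32)
        fused = fusion(torch.randn(4, 32), torch.randn(4, 32))
        print(fused.shape)   # torch.Size([4, 32])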

  • Xing Bowen, Chai Mengdan, Xiang Zhuoyuan
    Data Analysis and Knowledge Discovery. https://doi.org/10.11925/infotech.2096-3467.2024.0792
    Online available: 2025-07-04

    [Objective] Because the abstract of a judicial decision must remain consistent with the original text in terms of the facts of the case, the application of law, and other elements, we propose a method for generating abstracts of Chinese judicial decisions that embeds a factual-consistency assessment of judicial elements. [Methods] First, we define the principles and methods for determining the factual consistency of judicial decision summaries. Second, we design preprocessing steps such as data augmentation and factual-consistency error correction and assessment. Then we construct a segmented extraction model and a generative summarization model that introduces a knowledge graph of judicial elements, and carry out experiments on the CAIL2020 dataset. [Results] The summaries generated by the FC-JDSM model achieved 67.98%, 55.40%, 64.14%, 78.5%, and 90.01% on the ROUGE-1, ROUGE-2, ROUGE-L, SRO, and EM-FCJS metrics, respectively, outperforming the comparison models. Ablation experiments confirm the effectiveness of chunk extraction and of introducing factual information. [Limitations] The data obtained from the data augmentation procedure in EM-FCJS deviates somewhat from real data. [Conclusions] Incorporating judicial elements into the consistency assessment and abstract generation process improves the consistency of abstracts of Chinese judicial judgment instruments, which is conducive to the impartiality of judicial work.

  • Congjing Ran, Qunzhe Ding, Yonghui Song, Fuxin Wang
    Data Analysis and Knowledge Discovery. https://doi.org/10.11925/infotech.2096-3467.2024.1171
    Online available: 2025-07-04

    [Objective] To address the challenge of distinguishing substantive patent transactions in patent transfer data, this study proposes a systematic approach that integrates multiple methods based on the Levenshtein distance algorithm, effectively identifying substantive patent transactions and exploring their differences in technical characteristics. [Methods] A screening process is proposed for different patent transfer scenarios. One key step uses multiple Levenshtein-distance-based text similarity algorithms to calculate similarity scores for the names and addresses of the parties involved in a transaction; these scores are then combined with a threshold to exclude non-market transaction records related to internal resource reallocation. The accuracy of the method is validated through empirical research, and statistical analysis is used to compare differences in technical indicators across transaction types. [Results] The experimental results show that the method achieves an accuracy of 81.27% and is effective in identifying patent behaviours that involve substantive transactions. Patents that undergo substantive transactions have significantly higher technical indicators, such as the number of independent claims, the number of family patents, and the number of times cited, than patents that do not (p < 0.05). [Limitations] The dataset's temporal scope is restricted, and the model's ability to handle complex address structures requires further refinement to improve generalizability. [Conclusions] This study establishes an effective and scalable methodology for classifying substantive patent transaction behaviours, offering valuable data support for research on technology transfer and patent commercialization.
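
    A toy sketch of the screening idea: compute a Levenshtein-based similarity between assignor and assignee names (the same logic would apply to addresses), and treat highly similar pairs as internal reallocation rather than substantive market transactions. The threshold and normalization are illustrative assumptions.

        def levenshtein(a: str, b: str) -> int:
            prev = list(range(len(b) + 1))
            for i, ca in enumerate(a, 1):
                curr = [i]
                for j, cb in enumerate(b, 1):
                    curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + (ca != cb)))
                prev = curr
            return prev[-1]

        def similarity(a: str, b: str) -> float:
            a, b = a.lower().strip(), b.lower().strip()
            if not a and not b:
                return 1.0
            return 1.0 - levenshtein(a, b) / max(len(a), len(b))

        def is_substantive(assignor: str, assignee: str, threshold: float = 0.8) -> bool:
            # very similar party names suggest internal reallocation, not a market transaction
            return similarity(assignor, assignee) < threshold

        print(is_substantive("Acme Robotics Co., Ltd.", "Acme Robotics Co. Ltd"))   # False: likely same entity
        print(is_substantive("Acme Robotics Co., Ltd.", "Beta Materials Inc."))     # True: different parties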

  • Li Yihong, Yu Yanfang, Yu Qiwei, Li Sujuan, Zhang Shaolong, Ye Junjun
    Data Analysis and Knowledge Discovery. https://doi.org/10.11925/infotech.2096-3467.2024.0239
    Online available: 2025-07-04

    [Objective] Large language generation models have brought new ideas to the Chinese open relation extraction task, but how to optimize the quality of the relation extraction results generated by such models has become an important issue. [Methods] This paper proposes a low-cost large-model fine-tuning method based on multi-dimensional self-reflective learning enhancement (SRLearn). It automatically guides the model to engage in multi-dimensional self-reflective learning, thereby improving the quality of the model's generated Chinese relation extractions. [Results] Compared to the LoRA+DPO fine-tuning method, the SRLearn method improves performance by 15 percentage points on the WikiRE1.0 dataset and 6.5 percentage points on the DuIE2.0 dataset, validating the effectiveness of the approach. [Limitations] The SRLearn method needs to cover more generation quality issues in the future. [Conclusions] The large-model fine-tuning method based on multi-dimensional self-reflective learning can greatly improve the generation quality of Chinese relation extraction.

  • Su Yanyuan, Dong Xiaoyu, Han Cuijuan, Zhang Yaming
    Data Analysis and Knowledge Discovery. https://doi.org/10.11925/infotech.2096-3467.2024.1157
    Online available: 2025-07-04

    [Objective] A federated learning framework embedded with dual-channel attention convolution is designed to address the difficulty of cross-social-network feature extraction caused by privacy-protection restrictions and to identify social bot accounts accurately. [Methods] First, the federated learning framework is adopted to integrate data across social networks. Second, a dual-channel attention convolution mechanism is introduced into the local model module to comprehensively mine data features. Third, with the help of a basic convolutional neural network and blockchain, the local model parameters are integrated in the federated aggregation module to obtain and securely store the optimal model parameters. [Results] Experimental results on the TwiBot-20&Weibo-bot dataset show that the accuracy, precision, recall, and F1 score of the FL-DCACNN model reach 91.63%, 97.10%, 97.14%, and 96.88%, respectively, and the model shows strong generalization ability. [Limitations] The multi-modal feature extraction considers only structured data, text data, and image data, and does not cover video or audio data. [Conclusions] The FL-DCACNN model can effectively address the poor recognition of social bots caused by insufficient feature extraction and single data sources under data privacy constraints, thereby improving recognition performance and enabling accurate identification of social bots.
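
    A minimal federated-averaging sketch of the aggregation step: each platform trains a local copy and only model parameters, never raw data, are sent for aggregation. Weighting by local sample counts is a standard FedAvg choice used here for illustration; the paper's aggregation and blockchain storage details are not reproduced.

        import numpy as np

        def federated_average(local_params: list[dict], sample_counts: list[int]) -> dict:
            total = sum(sample_counts)
            keys = local_params[0].keys()
            return {
                k: sum(p[k] * (n / total) for p, n in zip(local_params, sample_counts))
                for k in keys
            }

        # Two clients (e.g. two social networks) with toy parameter tensors
        client_a = {"w": np.ones((2, 2)), "b": np.zeros(2)}
        client_b = {"w": np.full((2, 2), 3.0), "b": np.ones(2)}
        global_params = federated_average([client_a, client_b], sample_counts=[100, 300])
        print(global_params["w"])   # weighted toward client_b: values of 2.5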

  • Zhong Ming, Qian Qing, Zhou Wei, Wu Sizhu
    Data Analysis and Knowledge Discovery. https://doi.org/10.11925/infotech.2096-3467.2023.0461
    Online available: 2025-07-04

    [Objective] In view of the centralized storage, data security risks, limited computing resources, and pressing user analysis needs of the National Population Health Data Center (NPHDC), this study explores construction approaches suitable for an NPHDC data enclave, so as to provide users with a more efficient, secure, and flexible data processing and analysis environment. [Methods] The types, characteristics, implementation mechanisms, and applicable scenarios of different data enclaves were summarized. Combined with the data application characteristics of NPHDC, a big data analysis platform for NPHDC was built based on a virtual enclave approach integrating security enhancement, micro-isolation, and artificial intelligence technologies. [Results] The big data analysis platform supports services such as data review, data processing, data analysis and mining, and peer review of the data associated with users' published papers in NPHDC. It has completed review tasks for 32,000 datasets from more than 2,800 projects, more than 10,000 data analysis tasks, and more than 5,000 data processing tasks, with a data leakage rate of 0% and a resource utilization rate of 80%. [Limitations] Cross-institutional data sharing with decentralized storage is not yet possible; data enclave research combining privacy-preserving technologies such as secure multi-party computation and federated learning should be explored in line with the development of NPHDC. [Conclusions] Effectively meeting the needs for secure sharing and collaborative analysis of centrally stored population health data is of great significance for the security, sharing, and utilization of national population health scientific data.

  • Yi Haohan, Wang Hao, Zhou Shu, Zheng Xuhui, Zhou Zhengda
    Data Analysis and Knowledge Discovery. https://doi.org/10.11925/infotech.2096-3467.2025.0001
    Online available: 2025-07-04

    [Objective] To address the challenges of named entity recognition (NER) in ancient texts caused by linguistic complexity, diversity, and the scarcity of annotated data. [Methods] We propose a RAG-LATS framework that integrates a knowledge base of ancient texts with AI-Search-driven retrieval-augmented generation (RAG). By incorporating the generation, retrieval, reflection, and revision mechanisms of the LATS framework, we enhance the zero-shot NER performance of large language models in the domain of ancient texts. [Results] Experimental results on the CHisIEC public dataset demonstrate that our method outperforms domain-specific fine-tuned models. Specifically, it achieves a 14.44 percentage point improvement in Micro F1 score over the Xunzi-Qwen1.5-7B_chat model and a 16.99 percentage point improvement over the general-purpose Qwen1.5-7B_chat model. [Limitations] The prompt construction method needs further optimization, and the computational complexity of the LATS framework may affect efficiency in large-scale data scenarios. [Conclusions] Retrieval-augmented generation effectively enhances the domain knowledge of large language models, while the LATS framework improves the accuracy and coherence of model outputs. Together, these advances significantly improve the performance of large language models in zero-shot NER tasks for ancient texts.
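
    A toy retrieval-augmented prompt for zero-shot NER on classical Chinese text, in the spirit of the retrieval step above: the most similar knowledge-base entries are retrieved and prepended to the prompt. The scoring function, knowledge base, and prompt format are illustrative assumptions, not the RAG-LATS implementation.

        from difflib import SequenceMatcher

        KNOWLEDGE_BASE = [
            "曹操, personage, warlord of the late Eastern Han dynasty",
            "赤壁, place, site of the Battle of Red Cliffs (208 CE)",
            "司马迁, personage, Han dynasty historian, author of the Shiji",
        ]

        def retrieve(query: str, k: int = 2) -> list[str]:
            scored = sorted(
                KNOWLEDGE_BASE,
                key=lambda entry: SequenceMatcher(None, query, entry).ratio(),
                reverse=True,
            )
            return scored[:k]

        def build_ner_prompt(sentence: str) -> str:
            context = "\n".join(retrieve(sentence))
            return (
                f"Reference entries:\n{context}\n\n"
                "Identify all person, place and office entities in the sentence below and "
                f"return them as a JSON list.\nSentence: {sentence}"
            )

        print(build_ner_prompt("操引军至赤壁，与备战，不利。"))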

  • Li Hongmin, Yang Wenhao, Ma Hongyang, Wang Jianzhou
    Data Analysis and Knowledge Discovery. https://doi.org/10.11925/infotech.2096-3467.2024.1180
    Online available: 2025-07-04

    [Objective] Given the incompleteness of urban carbon emission information, the multiplicity of its characteristics, and the complexity of emission patterns, a comprehensive portrayal of the complex dynamic process of carbon emissions is crucial for improving forecasting accuracy. [Methods] A multi-source heterogeneous temporal convolutional carbon emission forecasting model, HOSVD-TCN, which fuses key information granularity, is proposed. First, the original granular information is captured using automatic extraction techniques; second, social media text is processed with natural language processing to form sentiment values for the key information granules. High-quality tensor representations are generated by higher-order singular value decomposition and reconstruction of the heterogeneous information, and the reconstructed carbon emissions are used as inputs to the forecasting model. Finally, the temporal convolutional network (TCN) is used to forecast carbon emissions. [Results] The experimental results show that the average MAPE of the proposed model across the three cities is only 6.96%, and its forecasting performance is better than that of other mainstream comparison models. [Limitations] Multimodal data processing is complex, and forecasting effectiveness is limited by the size of the available dataset. [Conclusions] HOSVD-TCN combines the feature extraction capability of HOSVD with the spatio-temporal capture capability of TCN, achieving accurate forecasting of urban carbon emissions and providing technical support and a scientific basis for urban planning and management.
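
    A generic worked example of truncated higher-order SVD (HOSVD) on a small 3-way tensor, reconstructing it from factor matrices and a core tensor. The toy tensor shape (cities, days, information sources) and ranks are assumptions; this is the standard decomposition, not the paper's full pipeline.

        import numpy as np

        def unfold(tensor, mode):
            """Mode-n unfolding of a 3-way tensor."""
            return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)

        def hosvd(tensor, ranks):
            """Truncated higher-order SVD: factor matrices plus core tensor."""
            factors = []
            for mode, r in enumerate(ranks):
                u, _, _ = np.linalg.svd(unfold(tensor, mode), full_matrices=False)
                factors.append(u[:, :r])
            core = tensor
            for mode, u in enumerate(factors):
                core = np.moveaxis(np.tensordot(u.T, np.moveaxis(core, mode, 0), axes=1), 0, mode)
            return core, factors

        def reconstruct(core, factors):
            x = core
            for mode, u in enumerate(factors):
                x = np.moveaxis(np.tensordot(u, np.moveaxis(x, mode, 0), axes=1), 0, mode)
            return x

        rng = np.random.default_rng(1)
        X = rng.normal(size=(3, 30, 4))            # (cities, days, sources)
        core, factors = hosvd(X, ranks=(3, 10, 4))
        X_hat = reconstruct(core, factors)
        print(round(float(np.linalg.norm(X - X_hat) / np.linalg.norm(X)), 4))   # relative error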

  • Zhang Zhengang, Yu Chuanming
    Data Analysis and Knowledge Discovery. https://doi.org/10.11925/infotech.2096-3467.2025.0049
    Online available: 2025-07-04

    [Objective] Based on modeling papers and their attributes (e.g., authors, publication venues) as a knowledge graph, this study aims to enhance the performance of citation count prediction for newly published papers by fine-grained aggregation of the temporal evolution of paper attribute data. [Methods] We propose a citation count prediction model that incorporates temporal evolution and fine-grained information aggregation. The model consists of four key modules: (1) a graph neighborhood feature aggregation module, which extracts feature representations of academic entities in the knowledge graph; (2) a temporal evolution representation module, which captures the temporal dynamics of paper attribute data; (3) a fine-grained information aggregation module, which leverages a multi-head attention mechanism to aggregate the influence of different attributes on papers; and (4) a prediction module, which outputs citation count predictions. The proposed model is evaluated on the DBLP dataset through empirical studies. [Results] On the DBLP dataset, the proposed model achieves MALE, RMSLE, and R² scores of 0.5141, 0.7098, and 0.3470, respectively, significantly outperforming existing state-of-the-art methods. [Limitations] Due to space constraints, this study only evaluates the model on the DBLP dataset. Future work will focus on validating the model's generalizability across additional datasets. [Conclusions] The proposed model demonstrates superior performance compared to state-of-the-art methods on the DBLP dataset. This study highlights the effectiveness of leveraging the temporal evolution of paper attributes and fine-grained information aggregation to improve citation count prediction for newly published papers.

  • Zhao Guangyu, Duan Yongkang, Geng Qian, Yan Yan, Jin Jian
    Data Analysis and Knowledge Discovery. https://doi.org/10.11925/infotech.2096-3467.2025.0214
    Online available: 2025-07-04

    [Objective] Existing pre-trained models exhibit embedding anisotropy and limited domain generalization in government question retrieval, resulting in issues such as low recall, incomplete coverage, reduced accuracy, and suboptimal user experience. To enhance the effectiveness and efficiency of government question retrieval, this paper presents GovSQR, a fine-grained government similar question retrieval model. [Methods] GovSQR leverages structured prompt engineering and few-shot examples to guide a large language model in generating task-specific positive and negative samples. The RoBERTa model is subsequently fine-tuned using supervised SimCSE on the generated triplet data. A dynamic weighted masking mechanism and debiased contrastive loss function are introduced to reduce false negative interference in semantic representations. [Results] Evaluation on a Shenzhen government question dataset shows GovSQR achieves P@1, R@3, and MRR scores of 0.9660, 0.9811, and 0.9729, respectively, outperforming leading contrastive learning models such as InfoCSE and DiffCSE. [Limitations] The data generation process is prone to hallucination, necessitating costly manual verification. Additionally, the model's efficacy on semantically complex or ambiguous queries remains to be further validated. [Conclusions] By combining data augmentation with false negative debiasing, GovSQR learns more discriminative and uniformly distributed embeddings, significantly improving government similar question retrieval accuracy and effectively supporting intelligent government services.

  • Wang Xing, Yuan Weihua, Meng Guangting, Chen Yu, Zong Chen
    Data Analysis and Knowledge Discovery. https://doi.org/10.11925/infotech.2096-3467.2024.1221
    Online available: 2025-07-04

    [Objective] To disentangle users' diverse intents and capture rich node information in bundle recommendation, this paper proposes a bundle recommendation model based on disentanglement-aware dual-channel contrastive learning. [Methods] The local multi-view intention disentanglement module maps node representations to the latent space to obtain disentangled representations. The global hypergraph unified learning module integrates multi-type data and captures high-order correlations. The dual-channel collaborative learning module uses contrastive learning to achieve collaborative learning between the two. [Results] On public datasets, D2CBR demonstrates significant performance advantages: compared with the state-of-the-art baselines, the average performance improvement reaches 2.87%, with a maximum of 6.43%. [Limitations] Hypergraph operations such as constructing the incidence matrix scale with the number of nodes in the graph; on extremely large datasets they may incur substantial memory and computational overheads, which can limit their application in scenarios with constrained computing resources. [Conclusions] The graph variational autoencoder is effectively used to distinguish diverse user intentions, and the hypergraph is used to integrate multi-type data, which significantly improves recommendation performance. The model surpasses the state-of-the-art baselines on public datasets, demonstrating its effectiveness and robustness.

  • Tong Xin, Lin Zhi, Yuan Lining, Wang Jingya, Jin Bo
    Data Analysis and Knowledge Discovery. https://doi.org/10.11925/infotech.2096-3467.2025.0251
    Online available: 2025-07-04

    [Objective] This paper proposes an agent framework to enhance the accuracy and interpretability of risky instruction mining for large language models. [Methods] The framework integrates a language alignment module for unified mapping of multilingual inputs, a hierarchical detection module for multi-stage risk analysis, a dual-channel explanation module to support decision-making, and a consistency verification module to improve reliability when handling complex samples. [Results] Experiments on three risky instruction datasets demonstrate that the proposed method can improve the analysis accuracy of existing tools from 54.75% to as high as 93.75%. Even when using only lightweight open-source models as the core, the accuracy gain exceeds 20%. [Limitations] The inference efficiency of the framework needs improvement, and the structured output generated by some lightweight models lacks stability. [Conclusions] The proposed method provides an effective, interpretable, and cross-lingual enhancement solution for risky instruction mining in large language models.

  • Sun Ran, An Lu, Xie Zilin
    Data Analysis and Knowledge Discovery. https://doi.org/10.11925/infotech.2096-3467.2024.1218
    Online available: 2025-07-04

    [Objective] Fine-grained mining of social media users' opinion shifts in uncertain environments helps provide a comprehensive understanding of how public opinion develops. [Methods] This study focuses on Twitter users who actively participated in the vaccine topic. We build a stance detection model based on pre-trained language models and neural networks, and categorize user opinion shift paths into six types. Based on uncertainty reduction theory, a feature system for predicting opinion shifts is constructed. An opinion shift prediction model is built using the XGBoost method, and feature importance is analyzed with the SHAP interpretation method. [Results] The results show that 46.76% of users did not change their vaccine stance during the observation period, and the proportion of users experiencing opinion reversal is relatively low. The opinion shift prediction model built on XGBoost achieved an F1 score of 0.8209, with the stance similarity of interacting users being the most important feature. Moreover, the importance ranking of features differs across opinion shift paths. [Limitations] User stance shifts can be influenced by multiple factors, including significant exogenous events; future work could further explore the impact of such factors. [Conclusions] Combining pre-trained language models and neural network models better detects user stances. This paper reveals the factors influencing user opinion shifts in uncertain environments, supporting further work on online monitoring of social media user opinions.
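
    An illustrative sketch of the prediction-plus-interpretation step: an XGBoost classifier predicts whether a user's stance shifts, and mean absolute SHAP values rank feature importance. The feature names and data below are synthetic stand-ins, not the paper's feature system.

        import numpy as np
        import shap
        import xgboost as xgb

        rng = np.random.default_rng(42)
        feature_names = ["interaction_stance_similarity", "reply_count", "followers", "tweet_sentiment"]
        X = rng.normal(size=(500, len(feature_names)))
        # synthetic label loosely driven by the first feature
        y = (X[:, 0] + 0.3 * rng.normal(size=500) > 0).astype(int)

        model = xgb.XGBClassifier(n_estimators=100, max_depth=3)
        model.fit(X, y)

        explainer = shap.TreeExplainer(model)
        shap_values = explainer.shap_values(X)
        importance = np.abs(shap_values).mean(axis=0)
        for name, score in sorted(zip(feature_names, importance), key=lambda t: -t[1]):
            print(f"{name}: {score:.4f}")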

  • Xiang Shuxuan, Mao Jin, Li Gang
    Data Analysis and Knowledge Discovery. https://doi.org/10.11925/infotech.2096-3467.2025.0096
    Online available: 2025-07-03

    [Objective] Existing methods overlook the relation between commercialization potential and patenting strategy in proxy selection, feature construction, and model structure design. This article proposes a new method for predicting patent commercialization potential. [Methods] Patent maintenance is used as the proxy for commercialization potential, and an LSTM+MTNN model is proposed. The model comprises a feature processing module and a multi-task prediction module. The feature processing module uses BERT+SimCSE and LSTM to obtain a refined continuous feature of the patent claims and concatenates it with numerical features as the input to the multi-task prediction module. The multi-task prediction module is constructed based on the connections between legal events and commercialization potential: it merges a shared bottom, a commercialization potential prediction tower, and a legal event prediction tower, and its final output includes both legal event and commercialization potential predictions. [Results] The experimental results show that the selected numerical features are effective for commercialization potential prediction. In addition, LSTM+MTNN achieves better accuracy, precision, and F1 score than the baseline models on three datasets. [Limitations] The utilization of patent text still needs further research, and methods for representing and predicting patents' commercialization potential in a changing technological environment remain to be explored. [Conclusions] Besides numerical features, LSTM+MTNN adds a continuous feature of patent claims to the input, which enriches the input information, and its multi-task structure exploits the inner connections between legal events and patent commercialization potential, enabling the model to learn the relationship between the two tasks. Both techniques prove helpful for model optimization and make the proposed method effective for patent commercialization potential prediction.
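
    A minimal shared-bottom multi-task sketch of the prediction module: one shared encoder feeds two towers, one for commercialization potential and one for legal events. Layer sizes, label spaces, and loss weighting are illustrative assumptions, not the paper's exact architecture.

        import torch
        import torch.nn as nn

        class SharedBottomMTNN(nn.Module):
            def __init__(self, in_dim: int, hidden: int = 64):
                super().__init__()
                self.shared = nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU())
                self.potential_tower = nn.Linear(hidden, 2)   # commercialized vs. not
                self.legal_tower = nn.Linear(hidden, 3)       # e.g. maintained / lapsed / transferred

            def forward(self, x):
                h = self.shared(x)
                return self.potential_tower(h), self.legal_tower(h)

        model = SharedBottomMTNN(in_dim=128)
        x = torch.randn(16, 128)                      # concatenated text + numerical features
        pot_logits, legal_logits = model(x)
        loss = nn.functional.cross_entropy(pot_logits, torch.randint(0, 2, (16,))) \
             + 0.5 * nn.functional.cross_entropy(legal_logits, torch.randint(0, 3, (16,)))
        print(pot_logits.shape, legal_logits.shape, loss.item())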

  • Ma Jie, Sun Wenjing, Hao Zhiyuan
    Data Analysis and Knowledge Discovery. https://doi.org/10.11925/infotech.2096-3467.2024.0938
    Online available: 2025-07-03

    [Objective] This study builds a high-quality disease prediction model and explores its interpretability, identifying the key factors affecting disease formation and analyzing how they act on the disease, with the aim of supporting auxiliary diagnosis and precision medicine. [Methods] Taking obesity as the research object, a random forest model is first used to select the most representative features of the disease data; second, an enhanced sparrow search algorithm is proposed to adaptively obtain the kernel parameters and penalty coefficient of the SVM; the optimized SVM model is then used to predict and analyze the data samples and is compared with 8 baseline methods; finally, the SHAP interpretation framework is used to quantitatively analyze the relationship between the disease factors and the disease. [Results] The prediction accuracy of the proposed model reaches 85.5%, and its accuracy, specificity, and Matthews correlation coefficient are all higher than those of the other methods, demonstrating its effectiveness. In addition, family history, vegetable intake frequency, daily number of meals, height, gender, transportation usage, and high-calorie food intake are the key factors affecting the formation of obesity. [Limitations] An empirical study using obesity alone cannot fully verify the generalizability of the proposed model, and interactions between the feature variables are not analyzed. [Conclusions] The proposed model not only achieves superior prediction accuracy but also analyzes the magnitude and direction of the effects of disease factors, providing decision support for medical institutions.

  • Zhang Borui, Yang Ning, Zhang Xin, Wen Yi
    Data Analysis and Knowledge Discovery. https://doi.org/10.11925/infotech.2096-3467.2024.0549
    Online available: 2025-07-03

    [Objective] This study provides a comprehensive overview of scientific dataset recommendation, aiming to establish a theoretical foundation for research on scientific dataset sharing. [Coverage] A search was conducted in CNKI, WOS, and Google Scholar using keywords such as "scientific data recommendation" and "scientific dataset recommendation". Through thematic and snowball searches, 71 key articles were identified. [Methods] A systematic literature review and synthesis approach was used to assess the existing research, providing an overview and critical analysis from three perspectives: recommendation models, evaluation metrics, and future prospects. [Results] Scientific dataset recommendation is found to play a critical role in scientific dataset sharing, with prevalent methods including content-based filtering, collaborative filtering, graph models, and hybrid filtering. Identified research gaps include the synthesis of multi-source heterogeneous data, user privacy protection, the development of explainable systems, and the evaluation of recommendations. [Limitations] This paper surveys mainstream research and focuses on key studies in the field; given the inherent diversity of scientific data types, it is not feasible to enumerate every individual study. [Conclusions] Future research directions include the integration of multi-source heterogeneous information, improving recommendation explainability, ensuring privacy protection, and refining evaluation methods.

  • Ni Yuan, Li Xiangyu, Zhang Jian, Dong Feixing
    Data Analysis and Knowledge Discovery. https://doi.org/10.11925/infotech.2096-3467.2025.0074
    Online available: 2025-07-03

    [Purpose] This study constructs interpretable ensemble learning models to provide a new decision-making approach for predicting the development effectiveness of movie IP derivatives. [Methods] Based on value chain theory, the development process of movie IP derivatives is analyzed and a predictive indicator system is constructed. Influencing factors are extracted and screened based on the KLLB model, and predictive labels are constructed. A development performance prediction model based on AWStacking is then proposed. [Results] The AWStacking algorithm with XGBoost, CatBoost, and RF as base learners and LR as the meta-learner achieves the best prediction performance, with a macro-average accuracy of 0.8699, macro-average recall of 0.7889, and macro-average F1 of 0.8216. [Limitations] Due to current data availability, the indicators for measuring the development effectiveness of movie IP derivatives can be further optimized to improve measurement granularity. [Conclusions] The constructed model provides a basis for judging and predicting the development effectiveness of film IP derivatives, contributing to the healthy development of the film IP derivative market.

  • Li Guang, Wu Xinnian, Ning Baoying
    Data Analysis and Knowledge Discovery. https://doi.org/10.11925/infotech.2096-3467.2024.1067
    Online available: 2025-07-03

    [Objective] A topic temporal diffusion network model across multi-source data is designed to detect research frontiers while dynamically measuring the weights of data sources. [Methods] By analyzing the temporal, diffusion, and network characteristics of frontier topics, a research frontier detection framework, method, index system, and three-dimensional discriminant coordinate map based on the topic temporal diffusion network are proposed, and an empirical analysis is carried out in the field of artificial intelligence. [Results] The weights of the multi-source data were quantitatively calculated: strategic planning 0.301, scientific and technological reports 0.234, fund projects 0.124, patent literature 0.122, conference papers 0.113, and journal papers 0.105; 16 emerging and 4 growing research frontier topics were identified. [Limitations] The model is empirically tested only in the field of artificial intelligence in the United States and needs to be verified in more countries and fields. [Conclusions] The proposed method based on the topic temporal diffusion network between data sources can effectively identify the research frontiers in a field.

  • Duan Yongkang, Zhao Guangyu, Geng Qian, Cao Hanwei, Jin Jian
    Data Analysis and Knowledge Discovery. https://doi.org/10.11925/infotech.2096-3467.2024.1040
    Online available: 2025-07-03

    [Objective] Existing policy analysis methods rely on extensive manual annotation and alignment comparison, which is inefficient and error-prone. This study aims to improve the efficiency of policy information retrieval by constructing a structured policy knowledge base, to enable intelligent analysis and comparison of policies, and to provide accurate decision support for policy formulation. [Methods] Taking enterprise-friendly policies as an example, this study proposes a framework based on large language models for efficiently comparing related policies. The framework comprises three steps: 1) knowledge base construction; 2) retrieval and storage; and 3) answer generation. [Results] Validated on a dataset of enterprise-friendly policies from China (national level), Beijing, Shanghai, and Shenzhen, the proposed framework automatically integrates multiple policies and analyzes their semantics to construct the knowledge base and support policy matching and analysis. The Chroma-RAG model of this study shows clear advantages, reaching 60% on Hit@1, 76% on Hit@3, and 71.13% on MRR, outperforming traditional models such as TF-IDF, Word2Vec, USE, BERT, SBERT, DPR, and SimCSE in the comparison of retrieval methods. [Limitations] The study is mainly based on cross-sectional data, which cannot comprehensively reflect dynamic changes during policy implementation and limits in-depth analysis of policy effects. [Conclusions] Knowledge base construction and policy comparison based on large language models can effectively improve the intelligent analysis and comparison of policy texts, and in particular provide policy makers with substantial decision support in building policy knowledge bases and comparing policies.

  • Li Jie, Zhang Zhixiong
    Data Analysis and Knowledge Discovery. https://doi.org/10.11925/infotech.2096-3467.2024.0529
    Online available: 2025-07-03

    [Objective] The image clustering method DCCM, based on deep comprehensive correlation mining, clusters only on sample semantic features and cannot fully exploit the highly discriminative inter-class structural relationships contained in cluster structure features, which restricts further improvement of its clustering performance. [Methods] This paper proposes an improved model, IDCCM, that integrates cluster structure features. First, using DCCM as the basic clustering model, a text data augmentation strategy based on the Gaussian distribution is introduced to inherit DCCM's ability to mine sample semantic features. On this basis, the weighted sum of the mutual information loss between sample variables and cluster variables and the original DCCM loss is used to jointly learn sample semantic features and cluster structure features. [Results] The superiority of the improved IDCCM model was validated through experiments on publicly available standard datasets and a scientific paper abstract dataset. The clustering accuracy of the improved model on the public datasets 20NewsGroups and Reuters-60k increased by 9.8% and 7.32%, respectively, compared to the benchmark model. [Limitations] The number of clusters must be specified in advance, but in practice the optimal number of clusters for the original data is often hard to determine and should be adjusted according to the specific data. [Conclusions] The IDCCM model can exploit cluster structure features, improving the clustering performance of the DCCM model.
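
    A worked sketch of a mutual-information term between sample variables and cluster variables, computed from a batch of soft cluster assignments. Adding such a term to a base clustering loss is the general idea; the exact weighting and variable definitions in the paper may differ.

        import torch

        def mutual_information(assignments: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
            """assignments: (batch, k) softmax probabilities; returns I(sample; cluster)."""
            p_cluster = assignments.mean(dim=0)                        # marginal over clusters
            entropy_cluster = -(p_cluster * (p_cluster + eps).log()).sum()
            cond_entropy = -(assignments * (assignments + eps).log()).sum(dim=1).mean()
            return entropy_cluster - cond_entropy                      # high when confident and balanced

        soft = torch.softmax(torch.randn(32, 10), dim=1)
        print(mutual_information(soft).item())   # maximizing this encourages confident, balanced clusters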

  • Zhou Jie, Wang Dongyi, Dai Qinquan, Xia Sudi
    Data Analysis and Knowledge Discovery. https://doi.org/10.11925/infotech.2096-3467.2024.0939
    Online available: 2025-07-03

    [Objective] This study explores universally effective prompt strategies for generative AI to enhance user interaction skills and optimize user experience. [Methods] The Q-methodology was used to invite participants to rank the effectiveness of various prompt strategies based on their experiences with generative AI in general scenarios, across tasks, and across models, in order to identify universally effective prompt strategy types. [Results] The most effective prompt strategies include clearly defining the question, specifying the goal, and providing background information. Universally effective prompt strategies fall into three types: explicit needs and precise guidance; clear explanation and logical ordering; and task decomposition and diversified expression. [Limitations] The data were sourced from Chinese users; future research could include users from different cultural backgrounds to verify the broader applicability of the findings, and could examine differences in prompt strategies under specific scenarios, task types, and model conditions beyond the overall context analyzed here. [Conclusions] This study provides valuable insights for optimizing generative AI and enhancing user interaction skills.

  • Xu Mengyao, Sun Bin, Jiang Tao, Cui Jiahao
    Data Analysis and Knowledge Discovery. https://doi.org/10.11925/infotech.2096-3467.2024.1060
    Online available: 2025-07-03

    [Objective] To address the insufficient consideration of node locations and community-overlap characteristics in rumor suppression, a rumor suppression framework, RSM-OC, is proposed. [Methods] The framework uses a trust centrality value to identify key nodes accurately, combines overlapping nodes to form a candidate seed set, optimizes the set of positive seed nodes with a genetic algorithm, and simulates the competition between rumor and truth using a linear threshold model with one-way state transitions. [Results] Experiments on four real datasets show that the RSM-OC method improves the rumor suppression rate by 23.3% on average over the baseline algorithms and, on average, doubles the range of truth propagation, performing especially well in dense and medium-sized networks. [Limitations] The RSM-OC method has a high computational cost in large-scale networks and may face performance bottlenecks. [Conclusions] The RSM-OC method is effective both in suppressing rumors and in expanding the range of truth propagation.
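
    A toy linear-threshold propagation with one-way state transitions: once a node accepts the truth (or rumor) it never reverts. The thresholds, edge weights, and tiny graph are illustrative; the paper's full game between rumor and truth seeds is richer than this single-cascade sketch.

        import random

        def linear_threshold_spread(neighbors, weights, seeds, thresholds):
            """Return the final set of activated nodes starting from `seeds`."""
            active = set(seeds)
            changed = True
            while changed:
                changed = False
                for node in neighbors:
                    if node in active:
                        continue
                    influence = sum(weights[(u, node)] for u in neighbors[node] if u in active)
                    if influence >= thresholds[node]:     # one-way transition: stays active
                        active.add(node)
                        changed = True
            return active

        random.seed(0)
        nodes = list(range(6))
        neighbors = {0: [1, 2], 1: [0, 3], 2: [0, 3], 3: [1, 2, 4], 4: [3, 5], 5: [4]}
        weights = {(u, v): 0.6 for v in neighbors for u in neighbors[v]}
        thresholds = {n: random.uniform(0.3, 0.9) for n in nodes}
        print(sorted(linear_threshold_spread(neighbors, weights, seeds={0}, thresholds=thresholds)))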

  • Bu Wenru, Wang Hao, Zhou Shu, Shi Bin, Zhao Meng
    Data Analysis and Knowledge Discovery. https://doi.org/10.11925/infotech.2096-3467.2024.1123
    Online available: 2025-07-03

    [Objective] To explore and validate the potential of digital technology in enhancing the depth and breadth of character analysis in literary works by proposing a multidimensional personality calculation and analysis framework based on narrative text reconstruction. [Methods] The research process includes text reconstruction, personality quantification, model construction, and personality analysis. First, text information is extracted through machine translation, coreference resolution, and other techniques; second, a large language model is used to obtain character personality descriptions and construct a personality dataset; then the deep learning framework LBA is used to construct the personality detection model; finally, the numerical computation and analysis of multi-dimensional personality is completed. [Results] In text reconstruction, the automated extraction scheme proposed in this paper achieves an accuracy rate exceeding 89% for main character extraction and an F1 score exceeding 74%. The effectiveness of text content decomposition is evidenced by an average Rouge-L score of 73.01% across text types. The MSE of the constructed personality detection model, MPNDM, is 29.08% and 8.72% lower than that of two benchmark models, respectively. The personality analysis of all characters and representative figures in Romance of the Three Kingdoms reveals the distinctions and variations in personality among character groups and individuals. [Limitations] Given the diversity of theories and models for measuring character personality, introducing different theoretical models may yield different results. [Conclusions] This study not only validates the effectiveness of the proposed framework but also opens a new avenue for applying digital humanities methods to literature.

  • Zhao Yiming, Liu Shunsheng, Lv Lucheng
    Data Analysis and Knowledge Discovery. https://doi.org/10.11925/infotech.2096-3467.2024.1047
    Online available: 2025-07-02

    [Objective] By analyzing patent data in key technological fields, technology patents with high disruptive potential can be identified at an early stage. [Methods] An early identification index system for disruptive technologies is built based on technology lifecycle theory. Using patent data in the field of quantum computing from the Smart Sprout patent database as the research object, an ensemble learning model is constructed to identify technology patents with high disruptive potential in this field at an early stage. [Results] Using the BERTopic topic modeling framework, five disruptive research directions were identified: quantum encryption technology, quantum processors, superconducting qubits, semiconductor technology, and quantum neural networks. The effectiveness and feasibility of the proposed method were verified. [Limitations] The empirical analysis focuses only on quantum computing and does not comprehensively cover other key technological areas; the framework construction and indicator extraction rely solely on patent data, and the types of supporting data sources could be expanded. [Conclusions] This study helps identify technology patents with high disruptive potential early on and analyze the main disruptive research directions, providing a basis for formulating and implementing major national science and technology strategies.
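
    A typical BERTopic call pattern for surfacing research directions from a document collection, of the kind the [Results] step above relies on. The corpus below is a public toy dataset standing in for quantum-computing patent abstracts, and the model settings are library defaults, not the paper's configuration.

        from bertopic import BERTopic
        from sklearn.datasets import fetch_20newsgroups

        raw = fetch_20newsgroups(subset="train", remove=("headers", "footers", "quotes")).data
        docs = [d for d in raw if len(d.split()) > 20][:1500]   # drop near-empty documents

        topic_model = BERTopic(verbose=False)
        topics, probs = topic_model.fit_transform(docs)

        # Inspect the largest topics and their keyword signatures
        print(topic_model.get_topic_info().head())
        for topic_id in topic_model.get_topic_info().Topic.head(4):
            if topic_id != -1:                       # -1 collects outlier documents
                print(topic_id, [w for w, _ in topic_model.get_topic(topic_id)][:5])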

  • Feng Ling, Pan Yuntao
    Data Analysis and Knowledge Discovery. https://doi.org/10.11925/infotech.2096-3467.2024.1039
    Online available: 2025-07-02

    [Objective] This paper aims to identify interdisciplinary literature within a dataset in order to accurately grasp the cutting-edge trends in interdisciplinary research. [Methods] We propose an interdisciplinary literature identification method based on Graph Neural Networks (GNN). By selecting representative literature, we train a GNN-based multi-label classification model for identifying interdisciplinary literature. [Results] With only 5% of the dataset labeled as representative literature, the proposed method achieves an AUC of up to 0.843 for interdisciplinary literature identification across the entire dataset. [Limitations] The method still needs improvement for identifying multidisciplinary literature, and its effectiveness in identifying interdisciplinary literature in large domains and on large-scale datasets needs further validation. [Conclusions] The proposed method not only performs well in identifying interdisciplinary literature but also effectively addresses the scarcity of labeled training data.

  • Cao Yinni, Han Hu, Huang Mingwei, Liu Jinde
    Data Analysis and Knowledge Discovery. https://doi.org/10.11925/infotech.2096-3467.2024.1114
    Online available: 2025-07-02

    [Objective] To reduce the cross-modal semantic gap and enhance aspect-related image feature extraction, this paper obtains fine-grained cross-modal sentiment representations from global and local perspectives and proposes a graph convolutional network model with multi-perspective fusion representation. [Methods] First, the text and image descriptions are jointly encoded from a global perspective and combined with a multi-head self-attention mechanism to capture cross-modal global semantic features. Second, two graph structures are constructed to mine fine-grained sentiment information of text and images from the local perspective: a syntactic dependency graph is introduced through the text graph structure to enhance syntactic feature extraction, while in the fusion graph structure dilated convolution is used to expand the receptive field, extract key information from image patches, and strengthen feature associations across patches, and multiple cross-attention is used to guide the model to focus on image features related to aspect words. Finally, aspect-level sentiment analysis is performed by combining global and local fine-grained sentiment information. [Results] The accuracy and F1 scores of the proposed model are higher than those of the baseline models on both the Twitter-2015 and Twitter-2017 datasets. Compared with the second-best model, Acc and F1 improved by 0.44% and 1.51% on Twitter-2015 and by 0.54% and 0.72% on Twitter-2017, respectively. [Limitations] The generalization of the model has not been validated on larger datasets. [Conclusions] The proposed model can effectively reduce the semantic gap between modalities and fully extract the image features related to aspect words, improving sentiment classification.

  • Zhang Dongyu, Zhuang Mulin, Jin Senyuan, Liu Xinyue
    Data Analysis and Knowledge Discovery. https://doi.org/10.11925/infotech.2096-3467.2024.0450
    Online available: 2025-07-02

    [Objective] Given that many current mental illness detection studies fail to fully consider the key role of metaphorical information in the disease identification process, a mental illness detection method based on metaphorical information and instruction tuning is proposed. [Methods] The core of this method is to introduce metaphor information through metaphor identification, including analysis of the frequency of metaphor use and of the correlations between entities in the metaphors. In addition, a large language model is used to capture symptom and emotion information, and these features are integrated to build an instruction set for effectively training the model. [Results] The model achieved F1 scores of 85.82% and 75.47% on the Twitter-Depression and MVSA datasets, surpassing the baselines by 2.01% and 1.49%, respectively. [Limitations] The information extracted by large language models may be affected by model hallucination and contain inaccuracies, requiring more accurate extraction methods. [Conclusions] The importance of metaphor information in mental illness detection is confirmed; it can provide rich information for mental illness detection models.
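
    A minimal sketch of how one instruction-tuning example of the kind described might be assembled once the metaphor, symptom, and emotion features have been extracted; the field names and prompt template are hypothetical, not the paper's exact instruction set:

    # Sketch: fold metaphor frequency, metaphor entity relations, symptoms and
    # emotions into a single instruction-tuning record (illustrative template).
    def build_instruction_example(post, metaphor_freq, metaphor_relations,
                                  symptoms, emotions, label):
        instruction = (
            "Decide whether the author of the post shows signs of depression. "
            "Answer 'yes' or 'no'."
        )
        context = (
            f"Post: {post}\n"
            f"Metaphor frequency: {metaphor_freq}\n"
            f"Metaphor entity relations: {'; '.join(metaphor_relations)}\n"
            f"Symptoms: {', '.join(symptoms)}\n"
            f"Emotions: {', '.join(emotions)}"
        )
        return {"instruction": instruction, "input": context, "output": label}

    example = build_instruction_example(
        post="Some days I feel like I am drowning in fog.",
        metaphor_freq=2,
        metaphor_relations=["self -> drowning", "mood -> fog"],
        symptoms=["fatigue", "hopelessness"],
        emotions=["sadness"],
        label="yes",
    )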

  • Wen Xiaobo, Hua Bolin
    Data Analysis and Knowledge Discovery. https://doi.org/10.11925/infotech.2096-3467.2024.0803
    Online available: 2025-07-02

    [Objective]To identify the application domains involved in artificial intelligence patents.[Methods]Under a metric learning framework, a BERT-based dual encoder is used to encode patent texts and application-domain annotation texts separately, yielding representations that characterize the application domains of artificial intelligence patents and complete the recognition task.[Results]In the multi-class classification test of artificial intelligence patent application domains, an accuracy of 0.947 was achieved, and a multi-level clustering system with a silhouette coefficient of 0.36 was obtained in artificial intelligence patent application recognition.[Limitations]Although annotated data of moderate quality can be obtained through large language models, higher-quality annotated data is not easy to obtain, and there is considerable room for optimizing the metric learning framework and encoder used.[Conclusions] Metric learning can be used to identify the application domains of artificial intelligence patents in a targeted manner and can inspire optimization of unsupervised topic recognition.
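
    A minimal sketch of the encoder-and-similarity scoring idea, assuming the Hugging Face transformers package and cosine similarity as the metric; for brevity a single shared BERT encoder and mean pooling are used, which is a simplification of the dual encoder described:

    # Sketch: encode a patent text and candidate application-domain texts, then
    # rank domains by cosine similarity (model name and pooling are illustrative).
    import torch
    from transformers import AutoTokenizer, AutoModel

    tokenizer = AutoTokenizer.from_pretrained("bert-base-chinese")
    encoder = AutoModel.from_pretrained("bert-base-chinese")

    def encode(texts):
        batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
        with torch.no_grad():
            hidden = encoder(**batch).last_hidden_state       # (B, L, H)
        mask = batch["attention_mask"].unsqueeze(-1)
        return (hidden * mask).sum(1) / mask.sum(1)           # mean-pooled vectors

    patent_vec = encode(["A convolutional network for defect detection on production lines."])
    domain_vecs = encode(["Intelligent manufacturing", "Medical imaging", "Autonomous driving"])

    scores = torch.nn.functional.cosine_similarity(patent_vec, domain_vecs)
    print(scores.argmax().item())  # index of the best-matching application domain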

  • Zhang Zhipeng, Zhang Liyi
    Data Analysis and Knowledge Discovery. https://doi.org/10.11925/infotech.2096-3467.2024.0591
    Online available: 2025-07-02

    [Objective] From the perspective of the audience attributes of short video advertisements, the psychological and demographic attributes of the audience are extracted to explore their impact on audience engagement. [Methods] Based on self-efficacy theory, data mining and deep learning techniques are employed to construct variables representing the audience's psychological and demographic attributes. A multiple regression model is then utilized to analyze the influence of these variables on audience engagement as well as the moderating effect of product types. [Results] The study reveals that audience perception of advertisement disclosure, the proportion of negative enthusiastic comments, the proportion of female audiences, and the representation of Generation Z, middle-aged, and older adults all exert varying degrees of influence on audience engagement. Furthermore, the product type moderates the main effects observed. [Limitations] The measurement of audience engagement is relatively simplistic, and it could be further enriched by incorporating audience viewing and purchasing data. [Conclusions] Both psychological and demographic attributes of audiences significantly impact their engagement, and these effects are moderated by product type.
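
    A minimal sketch of a moderated multiple regression of the kind described, assuming pandas and statsmodels; the file name and column names are hypothetical placeholders for the constructed audience-attribute variables:

    # Sketch: regression of engagement on audience attributes, with product type
    # as a moderator expressed through interaction terms (hypothetical columns).
    import pandas as pd
    import statsmodels.formula.api as smf

    df = pd.read_csv("short_video_ads.csv")  # hypothetical prepared dataset

    model = smf.ols(
        "engagement ~ ad_disclosure + neg_comment_ratio + female_ratio + gen_z_ratio"
        " + product_type"
        " + ad_disclosure:product_type + neg_comment_ratio:product_type",
        data=df,
    ).fit()
    print(model.summary())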

  • Bai Yu, Wang Lianji, Liu Xiang, Yuan Jinfu, Zhang Guiping
    Data Analysis and Knowledge Discovery. https://doi.org/10.11925/infotech.2096-3467.2024.0642
    Online available: 2025-07-02

    [Objective] To enhance the performance of multimodal named entity recognition, this paper proposes a method that filters out irrelevant visual regions by calculating the semantic relevance between the anchor text of entities and the image regions, thereby achieving the goal of eliminating visual noise. [Methods] Using prompt words instead of category terms as entity anchor texts for the semantic relevance assessment of visual regions, the impact of irrelevant visual areas on entity recognition is mitigated by reducing their weights. This approach employs a multi-layer interactive Transformer for the fusion of textual and visual modalities, with entity recognition being realized through a CRF layer. [Results] Experimental results on public benchmark datasets demonstrate that the proposed method achieves F1 scores of 76.97% and 88.88% on Twitter15 and Twitter17, respectively, representing improvements of 0.48% and 1.17% over state-of-the-art approaches. [Limitations] The proposed method is based on a supervised learning paradigm, and its performance is influenced by the quality and quantity of annotated data. This study focuses on named entity recognition tasks using publicly available benchmark datasets. Future work will investigate the transferability of the model. [Conclusions] Eliminating visual noise can effectively enhance the performance of multimodal named entity recognition; filtering out irrelevant visual regions can be achieved by calculating the semantic relevance between the entity anchor text and the image regions.
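
    A minimal sketch of the relevance-based down-weighting step, assuming region and anchor-text embeddings already share a common space; the shapes and the softmax weighting rule are illustrative rather than the paper's exact formulation:

    # Sketch: weight image regions by their similarity to the entity anchor text
    # so that irrelevant regions contribute less before cross-modal fusion.
    import torch
    import torch.nn.functional as F

    anchor_text_emb = torch.randn(1, 768)   # prompt-style anchor text embedding
    region_embs = torch.randn(49, 768)      # embeddings of 49 image regions

    relevance = F.cosine_similarity(anchor_text_emb, region_embs, dim=-1)  # (49,)
    weights = torch.softmax(relevance, dim=0).unsqueeze(-1)                # (49, 1)

    # Regions with low relevance to the anchor text receive small weights.
    weighted_regions = weights * region_embs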

  • Wu Yifan, Ma Songjie, Li Shuqing
    Data Analysis and Knowledge Discovery. https://doi.org/10.11925/infotech.2096-3467.2024.0916
    Online available: 2025-07-02

    [Objective] To perceive the preferences of users and their friends for products at a given stage of popularity, so as to achieve more accurate recommendations. [Methods] First, item popularity integrating contribution and influence is calculated; attention mechanisms and recurrent neural networks are then used to capture personal popularity-preference representations, and convolutional networks and graph attention mechanisms are used to obtain friends' long- and short-term popularity preferences. [Results] Comparative experiments are conducted on the Douban, Delicious, and Yelp datasets. The proposed model outperforms the second-best model, DGRec, on all evaluation metrics, with a highest recall@20 improvement of 13.03% and a highest NDCG improvement of 11.58%. The proposed popularity calculation method also achieves significant improvements over traditional methods, with a highest recall@20 improvement of 11.53% and a highest NDCG improvement of 10.29%. [Limitations] The model's performance is relatively weak when handling short sequences. [Conclusions] The model enhances the representation of user popularity preferences and social popularity preferences, improves how the weight of each interaction is expressed in the prediction, and provides exposure opportunities for long-tail items. The code is available at https://github.com/msj1010/SPPSRec_Pytorch.
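
    A minimal sketch of the sequence-side component, assuming a GRU with attention pooling over a user's interaction history; the popularity formula below is a hypothetical stand-in for the paper's contribution/influence definition:

    # Sketch: blend two popularity signals and encode a user's popularity
    # preference from an interaction sequence (illustrative only).
    import torch
    import torch.nn as nn

    def item_popularity(num_interactions, num_influential_users, alpha=0.5):
        # Hypothetical mix of a "contribution" and an "influence" signal.
        return alpha * num_interactions + (1 - alpha) * num_influential_users

    class PopularityPreferenceEncoder(nn.Module):
        def __init__(self, emb_dim=64, hidden_dim=64):
            super().__init__()
            self.gru = nn.GRU(emb_dim, hidden_dim, batch_first=True)
            self.attn = nn.Linear(hidden_dim, 1)

        def forward(self, item_seq_emb):          # (batch, seq_len, emb_dim)
            states, _ = self.gru(item_seq_emb)    # (batch, seq_len, hidden_dim)
            weights = torch.softmax(self.attn(states), dim=1)
            return (weights * states).sum(dim=1)  # popularity-preference vector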

  • Yu Yuhai, Xing Zhiqi, Meng Jiana, Gao Linlin, Wang Bolin
    Data Analysis and Knowledge Discovery. https://doi.org/10.11925/infotech.2096-3467.2024.0891
    Online available: 2025-07-02

    [Objective]In the era of rapid internet expansion, people express their emotions in various forms on digital platforms, making multi-modal sentiment analysis a research hotspot whose results provide strong support for sentiment analysis tasks.[Methods]First, the specific features of each uni-modal data type and the shared features across modalities are extracted. Then, multi-modal fusion is achieved using cross-modal bridges. Finally, a multi-head self-attention mechanism is introduced for multi-label prediction, effectively capturing the co-occurrence relationships between different emotion labels.[Results]Experimental results on the CMU-MOSEI dataset show that the proposed model improves accuracy over baseline models under different parameter settings and in comparative experiments. The ablation study validates the effectiveness of each module. Furthermore, compared with methods based on single text, image, and audio modalities, the model's accuracy increased by 11.4%, 19.9%, and 26.8%, respectively, demonstrating that the proposed method effectively integrates multi-modal information.[Limitations]The current method cannot accurately capture subtle emotional nuances, and the current dataset does not cover all possible emotional expressions and cultural contexts; more diverse data should be considered.[Conclusions]Experimental results show that the proposed model achieves effective modality fusion and performs well in sentiment analysis.
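
    A minimal sketch of the label-side idea, assuming learned emotion-label embeddings passed through multi-head self-attention so that co-occurring labels can inform one another; the dimensions and prediction head are illustrative, not the authors' architecture:

    # Sketch: self-attention over emotion-label embeddings conditioned on the
    # fused multi-modal features, followed by per-label sigmoid prediction.
    import torch
    import torch.nn as nn

    num_labels, d_model = 6, 256
    label_emb = nn.Parameter(torch.randn(num_labels, d_model))
    self_attn = nn.MultiheadAttention(embed_dim=d_model, num_heads=4, batch_first=True)
    classifier = nn.Linear(d_model, 1)

    fused_features = torch.randn(8, d_model)  # fused multi-modal vector per sample

    # Condition each label embedding on the sample, then let labels attend to
    # one another to capture co-occurrence between emotions.
    queries = label_emb.unsqueeze(0) + fused_features.unsqueeze(1)  # (batch, labels, dim)
    attended, _ = self_attn(queries, queries, queries)
    logits = classifier(attended).squeeze(-1)                       # (batch, num_labels)
    probs = torch.sigmoid(logits)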