Data Analysis and Knowledge Discovery

An Analysis on the Basic Technologies of ChatGPT

Qian Li, Liu Yi, Zhang Zhixiong, Li Xuesi, Xie Jing, Xu Qinya, Li Yang, Guan Zhengyi, Li Xiyu, Wen Sen

Data Analysis and Knowledge Discovery. 2023, 7(3): 6-15. https://doi.org/10.11925/infotech.2096-3467.2023.0229

[Objective] This paper reviews and analyzes the corpora, algorithms, and models related to ChatGPT, and provides a systematic reference for peer research. [Methods] We systematically reviewed the relevant literature and materials published since the release of GPT-3, depicted the overall architecture of ChatGPT technology, and explained and analyzed the models, algorithms, and principles behind it. [Results] Through literature research, and working from limited information, this paper reconstructs the technical details that support ChatGPT functionality. It lays out the overall technical architecture of ChatGPT and explains each of its technical components. The algorithmic principles and model composition of each component are analyzed at three levels: the corpus system, the pre-training algorithm and model, and the fine-tuning algorithm and model. [Limitations] The survey of the literature related to ChatGPT inevitably has omissions, and the interpretation of some technical content is not deep enough. Some content inferred by the authors may be incorrect. [Conclusions] The breakthrough in the application of ChatGPT technology is the result of continuous accumulation through iterative training of corpora, models, and algorithms, as well as the effective combination and integration of various algorithmic models.

ChatGPT Performance Evaluation on Chinese Language and Risk Measures

Zhang Huaping, Li Linhan, Li Chunjin

Data Analysis and Knowledge Discovery. 2023, 7(3): 16-25. https://doi.org/10.11925/infotech.2096-3467.2023.0214

[Objective] This paper briefly introduces the main technical innovations of ChatGPT, evaluates its performance in Chinese on four tasks using nine datasets, analyzes the risks of ChatGPT, and proposes solutions. [Methods] ChatGPT and WeLM were tested on the ChnSentiCorp dataset, and ChatGPT and ERNIE 3.0 Titan were tested on the EPRSTMT dataset; ChatGPT did not differ much from the large domestic models on sentiment analysis tasks. The LCSTS and TTNews datasets were used to test ChatGPT and WeLM, and ChatGPT outperformed WeLM on both. CMRC2018 and DRCD were used for extractive machine reading comprehension (MRC), and the C³ dataset was used for common-sense MRC; ERNIE 3.0 Titan outperformed ChatGPT on this task. WebQA and CKBQA were used for Chinese closed-book question answering; ChatGPT was prone to factual errors on this task, and the domestic model outperformed it. [Results] ChatGPT performed well on classic natural language processing tasks, with an accuracy of more than 85% on sentiment analysis, but showed a higher probability of factual errors on closed-book questions. [Limitations] Converting discriminative tasks into generative ones may introduce errors into the evaluation scores. This paper evaluated ChatGPT only in the zero-shot setting, so its performance in other settings is unclear. ChatGPT may be updated in subsequent releases, so the evaluation results may be time-sensitive. [Conclusions] ChatGPT is powerful but still has drawbacks. The development of large Chinese language models should be oriented toward national strategy and should take the limitations of language models into account.
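
The limitation about converting discriminative tasks into generative ones can be illustrated with a minimal sketch. It assumes a zero-shot sentiment prompt whose free-text reply has to be mapped back to a label before accuracy is computed. The cue lists, the `parse_label` helper, and the sample replies are hypothetical and are not taken from the paper.

```python
from typing import Optional

# Hypothetical cue words; a real evaluation would need a more careful mapping.
POSITIVE_CUES = ("positive", "正面", "积极")
NEGATIVE_CUES = ("negative", "负面", "消极")


def parse_label(reply: str) -> Optional[int]:
    """Map a free-text reply to 1 (positive), 0 (negative), or None (ambiguous)."""
    text = reply.lower()
    has_pos = any(cue in text for cue in POSITIVE_CUES)
    has_neg = any(cue in text for cue in NEGATIVE_CUES)
    if has_pos == has_neg:  # no cue at all, or both kinds of cue
        return None
    return 1 if has_pos else 0


def accuracy(replies, gold):
    """Accuracy in which replies that cannot be parsed count as errors."""
    correct = sum(1 for r, g in zip(replies, gold) if parse_label(r) == g)
    return correct / len(gold)


# Made-up replies: the third one is correct in spirit but cannot be parsed.
replies = ["Positive.", "这条评论是负面的", "It is hard to say."]
gold = [1, 0, 1]
print(accuracy(replies, gold))
```

A reply that the mapping cannot parse is scored as wrong here, which is one way such a conversion can shift the measured score independently of the model's actual judgment.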

The Inspiration Brought by ChatGPT to LLM and the New Development Ideas of Multi-modal Large Model

Zhao Chaoyang, Zhu Guibo, Wang Jinqiao

Data Analysis and Knowledge Discovery. 2023, 7(3): 26-35. https://doi.org/10.11925/infotech.2096-3467.2023.0216

[Objective] This paper analyzes the basic technical principles of ChatGPT and discusses its influence on the development of large language models and multi-modal pretrained models. [Methods] By analyzing the development process and technical principles of ChatGPT, this paper discusses how model-building methods such as instruction fine-tuning, data acquisition and annotation, and reinforcement learning based on human feedback influence large language models. It also analyzes several key scientific problems encountered in building multi-modal models, and discusses the future development of multi-modal pretrained models with reference to ChatGPT’s technical scheme. [Conclusions] The success of ChatGPT provides a useful technical path from pretrained foundation models to downstream tasks. In the future construction of multi-modal large models and the realization of downstream tasks, high-quality instruction fine-tuning and other technologies can be fully used to significantly improve downstream performance.

The Influence of ChatGPT on Library & Information Services

Zhang Zhixiong, Yu Gaihong, Liu Yi, Lin Xin, Zhang Menting, Qian Li

Data Analysis and Knowledge Discovery. 2023, 7(3): 36-42. https://doi.org/10.11925/infotech.2096-3467.2023.0230

[Objective] This paper discusses the inspiration and influence of artificial intelligence (AI) technologies represented by ChatGPT on Literature & Information Service, and puts forward suggestions for the field. [Methods] This paper explores the essence of the rapid breakthrough of AI technologies based on the evolution of AI, analyzes the impact on Literature & Information Service based on the technical capabilities of ChatGPT, and proposes suggestions for developing the field so that it makes full use of its advantages and value. [Results] Five insights from the rapid development of AI technology for Literature & Information Service are summarized. The impact of ChatGPT is elaborated in six aspects: data organization, knowledge service, information analysis, literature utilization, team construction, and service priorities. Based on the characteristics of Literature & Information Service, nine suggestions are put forward. [Conclusions] The essence of the rapid breakthrough of AI technologies lies in the improvement of knowledge acquisition capability. Moreover, the success of ChatGPT shows that a high-value corpus is the basis of all AI technologies. The Literature & Information Service field holds high-value data resources containing abundant human knowledge, which is of great significance for AI technologies. ChatGPT focuses on content generation, while Literature & Information Service focuses on evidence-based work. Literature & Information Service should actively respond to and expand AI technologies to keep pace with the AI era and contribute its wisdom and solutions.

A Deep Recommendation Model with Multi-Layer Interaction and Sentiment Analysis

Li Haojun, Lv Yun, Wang Xuhui, Huang Jieya

Data Analysis and Knowledge Discovery. 2023, 7(3): 43-57. https://doi.org/10.11925/infotech.2096-3467.2022.0228

[Objective] This paper proposes a deep recommendation model with multi-layer interaction and sentiment analysis. It aims to improve on traditional recommendation algorithms, which rely on user ratings alone to infer user preferences and ignore the impact of sentiment. [Methods] First, we used BERT word vectors to represent the reviews and a bidirectional recurrent neural network to quantify their sentiment. Second, we updated the rating matrix using the sentiment values and mapped the shallow features of users and resources. Third, we captured the deep features of users and resources from reviews with a convolutional neural network and the self-attention mechanism. Finally, we merged the shallow and deep features, and used a multi-layer perceptron to model the complex nonlinear interaction between users and resources to predict the rating of recommended resources. [Results] We examined the model on an Amazon dataset and found that its MAE and RMSE were up to 7.93% and 9.37% lower than those of the baseline models. [Limitations] Our model did not include the temporal dynamics of user sentiment and ignored the domain adaptiveness of sentiment analysis methods. [Conclusions] A recommendation model incorporating sentiment analysis can more accurately reflect users’ real interests and preferences, and thus effectively improve recommendation accuracy.
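
For readers unfamiliar with the two error metrics reported above, the following is a minimal sketch of how MAE and RMSE are computed for predicted ratings. The rating values are made up, and `relative_reduction` assumes that "lower than the baseline" is meant as a relative percentage, which the abstract does not state explicitly.

```python
import math


def mae(predicted, actual):
    """Mean absolute error between predicted and actual ratings."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


def rmse(predicted, actual):
    """Root mean squared error; penalizes large errors more than MAE does."""
    squared = sum((p - a) ** 2 for p, a in zip(predicted, actual))
    return math.sqrt(squared / len(actual))


def relative_reduction(baseline, model):
    """Percentage by which `model` is lower than `baseline` (assumed reading)."""
    return (baseline - model) / baseline * 100.0


# Illustrative ratings only, not data from the paper.
actual = [5.0, 3.0, 4.0, 1.0]
predicted = [4.5, 3.5, 4.0, 2.0]
print(mae(predicted, actual), rmse(predicted, actual))
```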

Chinese Text Sentiment Analysis Based on Dual Channel Attention Network with Hybrid Word Embedding

Zhou Ning, Zhong Na, Jin Gaoya, Liu Bin

Data Analysis and Knowledge Discovery. 2023, 7(3): 58-68. https://doi.org/10.11925/infotech.2096-3467.2022.0332

[Objective] This paper addresses the challenges facing traditional static word embedding methods, aiming to handle polysemy in Chinese texts effectively. It also mines contextual emotional features and the internal semantic association structure. [Methods] In one channel, we integrated the sentiment elements related to the text into Word2Vec and FastText word vectors through rough data reasoning, and used a CNN to extract the local features of the text. In the other channel, we employed BERT to supplement the word embeddings and used a BiLSTM to obtain the global features of the text. Finally, we added an attention calculation module for deep interaction between the features of the two channels. [Results] In experiments on three Chinese datasets, the model achieved a highest accuracy of 92.43%, an improvement of 0.81% over the best value of the benchmark model. [Limitations] The selected datasets only support coarse-grained sentiment classification. We did not conduct experiments in the fine-grained domain. [Conclusions] The proposed model could effectively improve the performance of Chinese text sentiment classification.

Impact of Hybrid Online Customer Service on Consumer Purchase Conversion

Li Shun, Li Li, Chen Baixue

Data Analysis and Knowledge Discovery. 2023, 7(3): 69-79. https://doi.org/10.11925/infotech.2096-3467.2022.0354

[Objective] This paper investigates the impacts of hybrid online customer service on consumer purchase conversion in e-commerce. [Methods] Using enterprise data as the sample, we employed the Logit model to explore the effect of human customer service, intelligent customer service, and their interaction on consumer purchase conversion. This paper also discussed the heterogeneous effects of hybrid online customer service on consumer purchase conversion with grouping regression. [Results] Under the hybrid online customer service mode, human customer service and intelligent customer service had significant positive impacts on consumer purchase conversion. There was a substitution relationship between human customer service and intelligent customer service. [Limitations] More research is needed to add consumer income and consumption habits to the proposed model and analyze conversation content. [Conclusions] This study focuses on actual consumer purchase conversion. It provides suggestions for e-commerce companies to develop effective customer service operation strategies.
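
As a rough illustration of a Logit model with an interaction term, the sketch below computes a purchase probability from two binary indicators. All coefficient values are hypothetical, and reading a negative interaction coefficient as a substitution relationship is one common interpretation, not necessarily the specification used in the paper.

```python
import math


def purchase_probability(human, bot, b0, b_human, b_bot, b_inter):
    """Logit model: P(purchase) = sigmoid(b0 + b_h*human + b_b*bot + b_i*human*bot)."""
    z = b0 + b_human * human + b_bot * bot + b_inter * human * bot
    return 1.0 / (1.0 + math.exp(-z))


def log_odds(p):
    """Inverse of the sigmoid, used to read effects on the linear scale."""
    return math.log(p / (1.0 - p))


# Hypothetical coefficients: both services help, the interaction is negative.
coef = dict(b0=-2.0, b_human=1.0, b_bot=0.8, b_inter=-0.6)

p00 = purchase_probability(0, 0, **coef)
p10 = purchase_probability(1, 0, **coef)
p01 = purchase_probability(0, 1, **coef)
p11 = purchase_probability(1, 1, **coef)

# Gain in log-odds from human service, without and with intelligent service.
print(log_odds(p10) - log_odds(p00), log_odds(p11) - log_odds(p01))
```

Under these assumed coefficients, each service raises the purchase probability on its own, while the gain from human service shrinks when intelligent service is also present.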

Simulation Research of Targeted Collaborative Decision-Making Model for Large Groups in Major Emergencies

Hao Zhiyuan, Ma Jie, Sun Wenjing

Data Analysis and Knowledge Discovery. 2023, 7(3): 80-96. https://doi.org/10.11925/infotech.2096-3467.2022.0578

[Objective] This paper aims to improve the decision-making efficiency of governmental agencies in major emergencies. [Methods] From the perspective of information dissemination and based on motivation-cognition theory, we focused on the two internal driving dimensions of self-orientation and task-orientation, as well as the three external motivational dimensions induced by stimulus, group structure, and professionalism. By introducing the relevant principles of the SIR transmission model, we constructed a targeted collaborative decision-making model for large groups (the TC-LGDM model). [Results] Three realistic factors, i.e., the probability of secondary events, the influence of individual decision-making, and the professional competence of members, are closely related to the final formation and efficacy of large-group targeted collaborative decision-making. [Limitations] The decision-making states within the group are limited, and the additional information value formed by the decision-makers in the process of receiving information needs to be explored more deeply. [Conclusions] This study provides valuable references for decision-makers to obtain situational manifestation and trend prediction capabilities. It also helps enhance the government’s sustainable development and decision-making in responding to public crisis risk events.
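
The SIR transmission model mentioned above can be sketched as follows. This is the classic three-compartment model integrated with forward Euler steps, not the TC-LGDM model itself. Mapping the compartments to group members who have not yet received information (S), are spreading it (I), or have stopped spreading it (R) is an assumption made here for illustration, as are all parameter values.

```python
def simulate_sir(beta, gamma, s0, i0, r0, steps, dt=0.1):
    """Forward-Euler integration of dS/dt = -beta*S*I, dI/dt = beta*S*I - gamma*I,
    dR/dt = gamma*I, with S, I, R expressed as fractions of the group."""
    s, i, r = s0, i0, r0
    history = [(s, i, r)]
    for _ in range(steps):
        new_spreaders = beta * s * i * dt   # S -> I transitions in this step
        new_stopped = gamma * i * dt        # I -> R transitions in this step
        s -= new_spreaders
        i += new_spreaders - new_stopped
        r += new_stopped
        history.append((s, i, r))
    return history


# Hypothetical parameters: 1% of the group starts as spreaders.
history = simulate_sir(beta=0.5, gamma=0.1, s0=0.99, i0=0.01, r0=0.0, steps=1000)
peak_i = max(state[1] for state in history)
print(round(peak_i, 3), tuple(round(x, 3) for x in history[-1]))
```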

Automatic Question-Answering in Chinese Medical Q & A Community with Knowledge Graph

Wang Yinqiu, Yu Wei, Chen Junpeng

Data Analysis and Knowledge Discovery. 2023, 7(3): 97-109. https://doi.org/10.11925/infotech.2096-3467.2022.0333

[Objective] This paper proposes a new method to determine the reliability of answers from the online Chinese medical question and answer (Q&A) community, aiming to enhance the accuracy of answer selection models for medical Q&A recognition with the help of professional medical knowledge graphs. [Methods] Based on an answer selection model using a hybrid neural network (fusing RNN and multi-scale CNN to capture context and local information), we constructed a professional medical knowledge graph that integrated entity and relationship embeddings to enrich the semantic information of the Q&A text. Combined with the Q&A pair attention mechanism, we obtained the final similarity of the pairs and selected the candidate answers with the highest scores. [Results] We examined the proposed model on the cMedQA2.0 dataset. Compared to the hybrid neural network model without knowledge graph entity relationships, the Top-1 accuracy of answer selection in our new model increased by 2.3% (to 62.2%), demonstrating its effectiveness for improving answer selection. [Limitations] The medical knowledge graph used is small and only includes the common entities in medical community Q&A. The incomplete relationships between medical entities may affect answer selection effectiveness for niche questions. [Conclusions] Combining professional Chinese medical knowledge graphs and deep learning models could improve answer selection technology. It helps people with medical consultation needs obtain reliable medical advice in the Q&A community. Our model also monitors the online medical community’s information quality and reduces the burden on hospital outpatient services.
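
Top-1 accuracy in answer selection can be sketched as below: for each question, the highest-scoring candidate is selected, and the question counts as a hit if that candidate is a correct answer. The scores and the data layout are hypothetical and only illustrate the metric, not the paper's model.

```python
def top1_accuracy(questions):
    """`questions` is a list of candidate lists; each candidate is a
    (similarity_score, is_correct) pair for one question-answer pair."""
    hits = 0
    for candidates in questions:
        best_score, best_is_correct = max(candidates, key=lambda c: c[0])
        if best_is_correct:
            hits += 1
    return hits / len(questions)


# Made-up similarity scores for three questions.
questions = [
    [(0.9, True), (0.4, False)],
    [(0.2, True), (0.7, False)],
    [(0.1, False), (0.8, True), (0.3, False)],
]
print(top1_accuracy(questions))
```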

Analyzing Asymmetric Relationship Between Documents Based on Topic Word Co-occurrence

Zhang Guofang, Wang Xin, Xu Jianmin

Data Analysis and Knowledge Discovery. 2023, 7(3): 110-120. https://doi.org/10.11925/infotech.2096-3467.2022.0342

[Objective] This paper proposes a quantitative model, aiming to explore the asymmetric relationship between documents. [Methods] Firstly, we examined the asymmetric association between topics with the help of co-occurrence. Secondly, we introduced the concept of the document coverage degree to quantify the asymmetric relationship between documents. Finally, we used document clustering to evaluate the proposed model’s performance. [Results] Compared with two existing measurement models, the average value of clustering was reduced by up to 22.6% and 23.3% with the proposed model. [Limitations] The proposed model only analyzed textual contents, which did not include pictures and formulas. [Conclusions] The proposed model could effectively improve the accuracy of document clustering.
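
One simple way to see how co-occurrence can yield an asymmetric association is to estimate conditional co-occurrence: the share of documents containing one topic word that also contain another. The sketch below is only an illustration of that general idea; the abstract does not give the paper's actual formulas for topic association or document coverage degree, and the toy documents are made up.

```python
def cooccurrence_strength(docs, a, b):
    """Share of documents containing `a` that also contain `b`,
    i.e. an estimate of P(b | a). Not symmetric in `a` and `b`."""
    with_a = [d for d in docs if a in d]
    if not with_a:
        return 0.0
    return sum(1 for d in with_a if b in d) / len(with_a)


# Toy documents represented as sets of topic words.
docs = [
    {"nlp", "bert"},
    {"nlp", "bert"},
    {"nlp"},
    {"nlp", "lstm"},
]
print(cooccurrence_strength(docs, "nlp", "bert"))  # association from "nlp" to "bert"
print(cooccurrence_strength(docs, "bert", "nlp"))  # association from "bert" to "nlp"
```

In this toy corpus every document about "bert" also mentions "nlp", but only half of the "nlp" documents mention "bert", so the two directions differ.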

Identifying Named Entities of Adverse Drug Reaction with Adversarial Transfer Learning

Li Daifeng, Lin Kaixin, Li Xuting

Data Analysis and Knowledge Discovery. 2023, 7(3): 121-130. https://doi.org/10.11925/infotech.2096-3467.2022.0350

[Objective] This paper aims to quickly generate real-time promotional book summaries and reduce the consumption of workforce and resources. [Methods] First, we constructed a dataset with the crawled book information based on prompt learning. Then, we used data enhancement and keyword extraction to increase information and generated the primary promotional text with T5 PEGASUS. When the number of book reviews reaches the threshold, a summary of the book reviews is also added. [Results] Compared with the optimal baseline model, the ROUGE-1, ROUGE-2, and ROUGE-L scores of the proposed model were improved by 29.0%, 37.6%, and 31.9%, respectively. Adding the summary of book reviews can reflect the interests of users. [Conclusions] The proposed model could generate summaries based on the characteristics of the book corpus and has practical value.
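
The n-gram overlap behind ROUGE-1 and ROUGE-2 can be sketched as follows. This is a simplified illustration with whitespace tokenization and a single reference; it does not cover ROUGE-L, which is based on the longest common subsequence, and the example sentences are made up.

```python
from collections import Counter


def ngrams(tokens, n):
    """Multiset of n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n(candidate, reference, n=1):
    """Return (precision, recall, F1) from clipped n-gram overlap."""
    cand, ref = ngrams(candidate, n), ngrams(reference, n)
    overlap = sum((cand & ref).values())  # Counter & Counter takes the minimum counts
    if overlap == 0:
        return 0.0, 0.0, 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


reference = "a gripping tale of two cities".split()
candidate = "a tale of two rival cities".split()
print(rouge_n(candidate, reference, n=1))
print(rouge_n(candidate, reference, n=2))
```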

Identifying Named Entities of Adverse Drug Reaction with Adversarial Transfer Learning

Han Pu, Zhong Yule, Lu Haojie, Ma Shiwen

Data Analysis and Knowledge Discovery. 2023, 7(3): 131-141. https://doi.org/10.11925/infotech.2096-3467.2022.0392

[Objective] This paper proposes an entity recognition model for adverse drug reactions based on adversarial transfer learning, ATL-BCA, aiming to address the problem of non-standard entity representations and insignificant boundaries in online health communities. [Methods] Firstly, we generated external semantic feature vectors fused with online medical domain knowledge with Word2Vec. Secondly, based on transfer learning, we utilized shared and private BiLSTMs to extract the shared boundary information and private features for the entity recognition and word segmentation tasks. Next, we used the multi-head attention mechanism to capture the overall sentence dependency and used adversarial training to filter the private information of the word segmentation task. This helped us eliminate the influence of redundant features on the entity recognition task. Finally, we predicted the label sequence results with the help of CRF constraints. [Results] We used a self-constructed social media adverse drug reaction dataset to examine the proposed model. The F1 value of the new model reached 91.35%, which is 5.28% and 2.98% higher than those of Word2Vec-BiLSTM-CRF and BERT-BiLSTM-CRF, respectively. [Limitations] We only retrieved the experimental data from Sanjiu Health & Medicine Site, and the scale of the constructed dataset is relatively small. [Conclusions] The ATL-BCA model fully utilizes the shared boundary information between the entity recognition and word segmentation tasks. It also filters the private features of the word segmentation task, effectively improving the entity recognition performance for adverse drug reactions in online health communities.

Medical Named Entity Recognition with Domain Knowledge

Pei Wei, Sun Shuifa, Li Xiaolong, Lu Ji, Yang Liu, Wu Yirong

Data Analysis and Knowledge Discovery. 2023, 7(3): 142-154. https://doi.org/10.11925/infotech.2096-3467.2022.0348

[Objective] This paper builds a graph neural network model integrating medical domain knowledge (GraphModel-Dict) to identify named entities from medical texts. [Methods] First, we used the graph neural network structure to integrate domain knowledge, mapping the raw text data and domain dictionaries as nodes of different categories. We also updated the nodes of raw text data with a Gated Recurrent Unit (GRU) to obtain their semantic representation with domain knowledge. Then, we used the representation of the text data nodes as input to a Bidirectional Long Short-Term Memory network (BiLSTM). We predicted the labels and generated recognition results with a Conditional Random Field (CRF) model. Finally, we evaluated GraphModel-Dict’s performance on two datasets. [Results] We examined GraphModel-Dict on a manually annotated dataset of 3,100 Chinese ultrasound examination reports on breast cancer. The model’s precision, recall, and F1-score for entity recognition reached 96.91%, 97.52%, and 97.22%, respectively. Furthermore, GraphModel-Dict showed better recognition performance for entity types with fewer samples or diverse expressions. On the CCKS2020 medical dataset, the F1-score of GraphModel-Dict increased by at least 1.39% compared to the baseline model. [Limitations] More research is needed to examine the effectiveness of the proposed model in other fields. [Conclusions] Integrating domain knowledge can improve the effectiveness of named entity recognition, which benefits medical information mining and clinical research.
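
Precision, recall, and F1-score for named entity recognition are commonly computed at the entity level, as in the minimal sketch below. It assumes exact matching on (start, end, type) spans, which may differ from the matching criterion used in the paper; the spans and type names are hypothetical.

```python
def entity_prf1(predicted, gold):
    """Entity-level precision, recall, and F1 with exact span-and-type matching."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if true_positives == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
    return precision, recall, f1


# Hypothetical (start, end, type) spans for one report.
gold = [(0, 2, "LESION"), (5, 6, "SIZE"), (8, 10, "LOCATION")]
predicted = [(0, 2, "LESION"), (5, 6, "SIZE"), (8, 9, "LOCATION"), (12, 13, "SIZE")]
print(entity_prf1(predicted, gold))
```

In this example the third prediction has the right type but the wrong boundary, so under exact matching it counts as both a false positive and a missed entity.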

25 March 2023, Volume 7 Issue 3
