Current Issue, Volume 7 Issue 3
    An Analysis on the Basic Technologies of ChatGPT
    Qian Li, Liu Yi, Zhang Zhixiong, Li Xuesi, Xie Jing, Xu Qinya, Li Yang, Guan Zhengyi, Li Xiyu, Wen Sen
    2023, 7 (3): 6-15.  DOI: 10.11925/infotech.2096-3467.2023.0229

    [Objective] This paper reviews and analyzes the corpora, algorithms, and models behind ChatGPT, and provides a systematic reference for peer research. [Methods] We systematically reviewed the relevant literature and materials published since the release of GPT-3, depicted the overall technical architecture of ChatGPT, and explained and analyzed the models, algorithms, and principles behind it. [Results] Based on the limited information available, this paper reconstructs through literature research the technical details that support ChatGPT functionality. It lays out the overall technical architecture of ChatGPT and explains each of its technical components. The algorithmic principles and model composition of each component are analyzed at three levels: the corpus system, the pre-training algorithms and models, and the fine-tuning algorithms and models. [Limitations] The survey of the literature related to ChatGPT inevitably has omissions, and the interpretation of some technical content is not deep enough. Some content inferred by the authors may be incorrect. [Conclusions] The breakthrough in the application of ChatGPT technology results from the continuous accumulation of corpora, models, and algorithms through iterative training, as well as the effective combination and integration of various algorithms and models.

    ChatGPT Performance Evaluation on Chinese Language and Risk Measures
    Zhang Huaping, Li Linhan, Li Chunjin
    2023, 7 (3): 16-25.  DOI: 10.11925/infotech.2096-3467.2023.0214

    [Objective] This paper briefly introduces the main technical innovations of ChatGPT, evaluates its performance in Chinese on four tasks using nine datasets, analyzes the risks associated with ChatGPT, and proposes solutions. [Methods] For sentiment analysis, ChatGPT and WeLM were tested on the ChnSentiCorp dataset, and ChatGPT and ERNIE 3.0 Titan were tested on the EPRSTMT dataset; ChatGPT did not differ much from the large domestic models on this task. For text summarization, ChatGPT and WeLM were tested on the LCSTS and TTNews datasets, and ChatGPT outperformed WeLM on both. CMRC2018 and DRCD were used for extractive machine reading comprehension (MRC), and the C3 dataset was used for common-sense MRC; ERNIE 3.0 Titan outperformed ChatGPT on this task. WebQA and CKBQA were used for Chinese closed-book question answering; ChatGPT was prone to factual errors on this task, and the domestic model outperformed it. [Results] ChatGPT performed well on classic natural language processing tasks such as sentiment analysis, where its accuracy exceeded 85%, but it showed a higher probability of factual errors on closed-book questions. [Limitations] Converting discriminative tasks into generative ones may introduce errors into the evaluation scores. This paper only evaluated ChatGPT in the zero-shot setting, so it is not clear how it performs in other settings. ChatGPT may be updated iteratively in subsequent releases, so the evaluation results may be time-sensitive. [Conclusions] ChatGPT is powerful but still has some drawbacks. The development of large Chinese models should be oriented toward national strategy and should pay attention to the limitations of language models.

    The Inspiration Brought by ChatGPT to LLM and the New Development Ideas of Multi-modal Large Model
    Zhao Chaoyang, Zhu Guibo, Wang Jinqiao
    2023, 7 (3): 26-35.  DOI: 10.11925/infotech.2096-3467.2023.0216

    [Objective] This paper analyzes the basic technical principles of ChatGPT and discusses its influence on the development of large language models and multi-modal pretrained models. [Methods] By analyzing the development process and technical principles of ChatGPT, this paper discusses how model building methods such as instruction fine-tuning, data acquisition and annotation, and reinforcement learning based on human feedback influence large language models. This paper also analyzes several key scientific problems encountered in building multi-modal models, and discusses the future development of multi-modal pretrained models with reference to ChatGPT’s technical scheme. [Conclusions] The success of ChatGPT provides a good reference path for taking pretrained foundation models to downstream tasks. In building future multi-modal large models and applying them to downstream tasks, high-quality instruction fine-tuning and related techniques can be used to significantly improve downstream task performance.

    The Influence of ChatGPT on Library & Information Services
    Zhang Zhixiong, Yu Gaihong, Liu Yi, Lin Xin, Zhang Menting, Qian Li
    2023, 7 (3): 36-42.  DOI: 10.11925/infotech.2096-3467.2023.0230

    [Objective] This paper discusses the inspiration and influence of artificial intelligence (AI) technologies represented by ChatGPT on Literature & Information Service, and puts forward suggestions for the field. [Methods] This paper explores the essence of the rapid breakthrough of AI technologies based on the evolution of AI, analyzes the impact on Literature & Information Service based on the technical capabilities of ChatGPT, and proposes suggestions for developing the field so that it can make full use of its advantages and value. [Results] Five insights from the rapid development of AI technology for Literature & Information Service are summarized. The impact of ChatGPT is elaborated in six aspects: data organization, knowledge service, information analysis, literature utilization, team construction, and service priorities. Based on the characteristics of Literature & Information Service, nine suggestions are put forward. [Conclusions] The essence of the rapid breakthrough of AI technologies lies in the improvement of knowledge acquisition capability. Moreover, the success of ChatGPT shows that a high-value corpus is the basis of all AI technologies. The Literature & Information Service field holds high-value data resources containing abundant human knowledge, which are of great significance for AI technologies. ChatGPT focuses on content generation, while Literature & Information Service focuses on evidence-based work. Literature & Information Service should actively respond to and extend AI technologies to keep pace with the AI era and contribute its wisdom and solutions.

    A Deep Recommendation Model with Multi-Layer Interaction and Sentiment Analysis
    Li Haojun, Lv Yun, Wang Xuhui, Huang Jieya
    2023, 7 (3): 43-57.  DOI: 10.11925/infotech.2096-3467.2022.0228

    [Objective] This paper proposes a deep recommendation model with multi-layer interaction and sentiment analysis. It tries to improve on traditional recommendation algorithms, which rely solely on user ratings to infer user preferences and ignore the impact of sentiment. [Methods] First, we used BERT word vectors to represent the reviews, and utilized a bidirectional recurrent neural network to quantify their sentiments. Then, we updated the rating matrix using the sentiment values, and mapped the shallow features of users and resources. Next, we captured the deep features of users and resources from reviews with a convolutional neural network and the self-attention mechanism. Finally, we merged the shallow and deep features, and used a multi-layer perceptron to model the complex nonlinear interaction between users and resources to predict the ratings of recommended resources. [Results] We examined the model with the Amazon dataset and found that the MAE and RMSE metrics were up to 7.93% and 9.37% lower than those of the baseline models. [Limitations] Our model did not include the temporal dynamics of user sentiment and ignored the domain adaptiveness of sentiment analysis methods. [Conclusions] A recommendation model incorporating sentiment analysis can more accurately reflect users’ real interests and preferences, and thus effectively improve recommendation accuracy.
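
    As a reading aid for the MAE and RMSE figures quoted above, the following minimal sketch shows how the two error metrics and a relative reduction against a baseline are commonly computed. The ratings and predictions are hypothetical and are not taken from the paper; the function names are illustrative only.

```python
import math

def mae(actual, predicted):
    """Mean absolute error between true and predicted ratings."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Root mean squared error between true and predicted ratings."""
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))

def relative_reduction(baseline_error, model_error):
    """Percentage by which the model error is lower than the baseline error."""
    return (baseline_error - model_error) / baseline_error * 100.0

# Hypothetical ratings and predictions (not data from the paper).
actual = [5, 3, 4, 1]
baseline_pred = [4, 4, 3, 3]
model_pred = [5, 3, 3, 2]

baseline_mae = mae(actual, baseline_pred)
model_mae = mae(actual, model_pred)
mae_gain = relative_reduction(baseline_mae, model_mae)
```

    A statement such as "MAE was up to 7.93% lower" corresponds to the largest value of `relative_reduction` across the compared baselines, assuming the paper reports relative rather than absolute differences.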

    Chinese Text Sentiment Analysis Based on Dual Channel Attention Network with Hybrid Word Embedding
    Zhou Ning, Zhong Na, Jin Gaoya, Liu Bin
    2023, 7 (3): 58-68.  DOI: 10.11925/infotech.2096-3467.2022.0332

    [Objective] This paper addresses the challenges facing the traditional static word vector embedding method, aiming to handle polysemy in Chinese texts effectively. It also mines contextual emotional features and the internal semantic association structure. [Methods] In one channel, we integrated the sentiment elements related to the text into Word2Vec and FastText word vectors through rough data reasoning. We also used CNN to extract the local features of the text. In the other channel, we employed BERT to supplement the word embeddings and used BiLSTM to obtain the global features of the texts. Finally, we added an attention calculation module for the deep interaction of the dual-channel features. [Results] In experiments on three Chinese datasets, the model achieved a highest accuracy of 92.43%, an improvement of 0.81% over the best value of the benchmark model. [Limitations] The selected datasets only support coarse-grained sentiment classification. We did not conduct experiments in the fine-grained domain. [Conclusions] The proposed model could effectively improve the performance of Chinese text sentiment classification.

    Impact of Hybrid Online Customer Service on Consumer Purchase Conversion
    Li Shun, Li Li, Chen Baixue
    2023, 7 (3): 69-79.  DOI: 10.11925/infotech.2096-3467.2022.0354

    [Objective] This paper investigates the impacts of hybrid online customer service on consumer purchase conversion in e-commerce. [Methods] Using enterprise data as the sample, we employed the Logit model to explore the effect of human customer service, intelligent customer service, and their interaction on consumer purchase conversion. This paper also discussed the heterogeneous effects of hybrid online customer service on consumer purchase conversion with grouping regression. [Results] Under the hybrid online customer service mode, human customer service and intelligent customer service had significant positive impacts on consumer purchase conversion. There was a substitution relationship between human customer service and intelligent customer service. [Limitations] More research is needed to add consumer income and consumption habits to the proposed model and analyze conversation content. [Conclusions] This study focuses on actual consumer purchase conversion. It provides suggestions for e-commerce companies to develop effective customer service operation strategies.
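
    To illustrate how a Logit model with an interaction term can express a substitution relationship, here is a minimal sketch. The coefficients are hypothetical and are not the estimates reported in the paper; the function name and the 0/1 coding of the two service types are assumptions made for illustration.

```python
import math

def purchase_probability(human, ai, b0=-2.0, b_human=1.0, b_ai=0.8, b_inter=-0.5):
    """Logit model: P(purchase) = sigmoid(b0 + b_human*human + b_ai*ai + b_inter*human*ai).

    human, ai: 1 if the consumer used that service type, else 0.
    All coefficients are hypothetical placeholders.
    """
    z = b0 + b_human * human + b_ai * ai + b_inter * human * ai
    return 1.0 / (1.0 + math.exp(-z))

def log_odds(p):
    """Convert a probability back to log-odds."""
    return math.log(p / (1.0 - p))

# Log-odds lift from human service, without and with intelligent service present.
lift_without_ai = log_odds(purchase_probability(1, 0)) - log_odds(purchase_probability(0, 0))
lift_with_ai = log_odds(purchase_probability(1, 1)) - log_odds(purchase_probability(0, 1))
```

    With a negative interaction coefficient, each service still raises the purchase probability on its own, but the lift from human service shrinks when intelligent service is also used, which is one common way to read a substitution effect.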

    Simulation Research of Targeted Collaborative Decision-Making Model for Large Groups in Major Emergencies
    Hao Zhiyuan, Ma Jie, Sun Wenjing
    2023, 7 (3): 80-96.  DOI: 10.11925/infotech.2096-3467.2022.0578

    [Objective] This paper aims to improve the decision-making efficiency of governmental agencies in major emergencies. [Methods] From the perspective of information dissemination and based on motivation-cognition theory, we focused on the two internal driving dimensions of self-orientation and task-orientation, as well as the three external motivational dimensions induced by stimulus, group structure, and professionalism. By introducing the relevant principles of the SIR transmission model, we constructed a targeted collaborative decision-making model for large groups (the TC-LGDM model). [Results] Three realistic factors, i.e., the probability of secondary events, the influence of individual decisions, and the professional competence of members, are closely related to the final formation and efficacy of large-group targeted collaborative decision-making. [Limitations] The decision-making states within the group are limited, and the additional information value formed by decision-makers in the process of receiving information needs to be explored more deeply. [Conclusions] This study provides valuable references for decision-makers to obtain situational manifestation and trend prediction capabilities. It is also an inevitable choice for enhancing the sustainable development and decision-making of government in responding to public crisis risk events.
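
    For readers unfamiliar with the SIR transmission model mentioned above, the following sketch simulates the classic three-compartment SIR dynamics with a simple Euler step. It is not the TC-LGDM model itself; the parameter values are hypothetical, and the reading of the compartments as group members who have not yet received, are spreading, or have stopped spreading a piece of decision information is an assumption for illustration.

```python
def simulate_sir(beta, gamma, s0, i0, r0, steps, dt=0.1):
    """Euler simulation of the classic SIR model on population fractions.

    ds/dt = -beta * s * i
    di/dt =  beta * s * i - gamma * i
    dr/dt =  gamma * i
    """
    s, i, r = s0, i0, r0
    history = [(s, i, r)]
    for _ in range(steps):
        new_spreaders = beta * s * i * dt   # susceptible -> infected (spreading)
        new_removed = gamma * i * dt        # infected -> removed (stopped spreading)
        s -= new_spreaders
        i += new_spreaders - new_removed
        r += new_removed
        history.append((s, i, r))
    return history

# Hypothetical parameters: transmission rate 0.5, removal rate 0.1.
history = simulate_sir(beta=0.5, gamma=0.1, s0=0.99, i0=0.01, r0=0.0, steps=1000)
final_s, final_i, final_r = history[-1]
```

    The three fractions always sum to one, so the trajectory in `history` can be read as how information spreads through a fixed group over time.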

    Automatic Question-Answering in Chinese Medical Q & A Community with Knowledge Graph
    Wang Yinqiu, Yu Wei, Chen Junpeng
    2023, 7 (3): 97-109.  DOI: 10.11925/infotech.2096-3467.2022.0333

    [Objective] This paper proposes a new method to determine the reliability of answers from the online Chinese medical question and answer (Q&A) community, aiming to enhance the accuracy of answer selection models for medical Q&A recognition with the help of professional medical knowledge graphs. [Methods] Based on the answer selection model using a hybrid neural network (fusing RNN and multi-scale CNN to capture context and local information), we constructed a professional medical knowledge graph that integrated entity and relationship embeddings to enrich the semantic information of the Q&A text. Combined with the Q&A pair attention mechanism, we obtained the final similarity of the pairs and selected candidate answers with the highest scores. [Results] We examined the proposed model on the cMedQA2.0 dataset. Compared to the hybrid neural network model without incorporating knowledge graph entity relationship, the Top-1 accuracy of the answer selection in our new model increased by 2.3% (to 62.2%), demonstrating its effectiveness for improving answer selection. [Limitations] The medical knowledge graph used is of small size, only including the common entities in the medical community Q&A. The incomplete relationship between medical entities may affect the answer selection effectiveness when facing niche questions. [Conclusions] Combining professional Chinese medical knowledge graphs and deep learning models could improve the answer selection technology. It helps people with medical consultation needs obtain reliable medical advice in the Q & A community. Our model also monitors the online medical community’s information quality and reduces the burden of hospital outpatient service.

    Analyzing Asymmetric Relationship Between Documents Based on Topic Word Co-occurrence
    Zhang Guofang, Wang Xin, Xu Jianmin
    2023, 7 (3): 110-120.  DOI: 10.11925/infotech.2096-3467.2022.0342

    [Objective] This paper proposes a quantitative model, aiming to explore the asymmetric relationship between documents. [Methods] Firstly, we examined the asymmetric association between topics with the help of co-occurrence. Secondly, we introduced the concept of the document coverage degree to quantify the asymmetric relationship between documents. Finally, we used document clustering to evaluate the proposed model’s performance. [Results] Compared with two existing measurement models, the average value of clustering was reduced by up to 22.6% and 23.3% with the proposed model. [Limitations] The proposed model only analyzed textual contents, which did not include pictures and formulas. [Conclusions] The proposed model could effectively improve the accuracy of document clustering.
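
    One simple way to obtain an asymmetric association from co-occurrence is a conditional frequency: the share of documents containing word a that also contain word b. The sketch below uses that definition as an assumption; the abstract does not give the paper's actual formula, and the sample documents and function name are hypothetical.

```python
from collections import Counter
from itertools import permutations

def asymmetric_association(documents):
    """Return {(a, b): share of documents containing a that also contain b}.

    The score for (a, b) generally differs from the score for (b, a),
    which is what makes the association asymmetric.
    """
    doc_freq = Counter()   # number of documents containing each word
    pair_freq = Counter()  # number of documents containing both words of an ordered pair
    for words in documents:
        unique = set(words)
        doc_freq.update(unique)
        pair_freq.update(permutations(unique, 2))
    return {(a, b): n / doc_freq[a] for (a, b), n in pair_freq.items()}

# Hypothetical topic-word lists for three documents.
docs = [
    ["neural", "network", "attention"],
    ["neural", "network"],
    ["neural", "clustering"],
]
assoc = asymmetric_association(docs)
```

    In this toy example every document containing "network" also contains "neural", but only two of the three documents containing "neural" contain "network", so the two directions receive different scores.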

    Identifying Named Entities of Adverse Drug Reaction with Adversarial Transfer Learning
    Li Daifeng, Lin Kaixin, Li Xuting
    2023, 7 (3): 121-130.  DOI: 10.11925/infotech.2096-3467.2022.0350

    [Objective] This paper aims to quickly generate real-time promotional book summaries and reduce the consumption of workforce and resources. [Methods] First, we constructed a dataset with the crawled book information based on prompt learning. Then, we used data enhancement and keyword extraction to enrich the information and generated the primary promotional text with T5 PEGASUS. When the number of book reviews reaches a threshold, a summary of the book reviews is also added. [Results] Compared with the optimal baseline model, the ROUGE-1, ROUGE-2, and ROUGE-L scores of the proposed model improved by 29.0%, 37.6%, and 31.9%, respectively. Adding the summary of book reviews can reflect the interests of users. [Conclusions] The proposed model could generate summaries based on the characteristics of the book corpus and has practical value.
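
    The ROUGE-1 and ROUGE-2 scores cited above measure n-gram overlap between a generated summary and a reference. The sketch below is a minimal version that takes pre-tokenized input and returns precision, recall, and F1; it is not the evaluation script used in the paper, it does not cover ROUGE-L, and the example sentences are hypothetical.

```python
from collections import Counter

def rouge_n(candidate, reference, n=1):
    """Return (precision, recall, f1) of n-gram overlap between two token lists."""
    def ngrams(tokens):
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

    cand, ref = ngrams(candidate), ngrams(reference)
    overlap = sum((cand & ref).values())  # clipped count of shared n-grams
    if overlap == 0:
        return 0.0, 0.0, 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Hypothetical reference and generated summaries, tokenized on whitespace.
reference = "a gripping mystery set in old Shanghai".split()
candidate = "a gripping mystery novel".split()

r1_precision, r1_recall, r1_f1 = rouge_n(candidate, reference, n=1)
r2_precision, r2_recall, r2_f1 = rouge_n(candidate, reference, n=2)
```

    For Chinese text the token lists would typically come from character-level or word-level segmentation, which is a choice the abstract does not specify.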

    Identifying Named Entities of Adverse Drug Reaction with Adversarial Transfer Learning
    Han Pu, Zhong Yule, Lu Haojie, Ma Shiwen
    2023, 7 (3): 131-141.  DOI: 10.11925/infotech.2096-3467.2022.0392

    [Objective] This paper proposes an entity recognition model for adverse drug reactions based on adversarial transfer learning, ATL-BCA, aiming to address the problem of non-standard entity representations and indistinct entity boundaries in online health communities. [Methods] Firstly, we generated external semantic feature vectors fused with online medical domain knowledge using Word2Vec. Secondly, based on transfer learning, we utilized shared and private BiLSTMs to extract the shared boundary information and private features for the entity recognition and word segmentation tasks. Next, we used the multi-head attention mechanism to capture the overall sentence dependency and used adversarial training to filter the private information of the word segmentation task. This helped us eliminate the influence of redundant features on the entity recognition task. Finally, we predicted the label sequence with the help of CRF constraints. [Results] We examined the proposed model on a self-constructed social media adverse drug reaction dataset. The F1 value of the new model reached 91.35%, which is 5.28% and 2.98% higher than those of Word2Vec-BiLSTM-CRF and BERT-BiLSTM-CRF, respectively. [Limitations] We only retrieved the experimental data from Sanjiu Health & Medicine Site, and the scale of the constructed dataset is relatively small. [Conclusions] The ATL-BCA model fully utilizes the shared boundary information between the entity recognition and word segmentation tasks. It also filters the private features of the word segmentation task, effectively improving the entity recognition performance for adverse drug reactions in online health communities.

    Medical Named Entity Recognition with Domain Knowledge
    Pei Wei, Sun Shuifa, Li Xiaolong, Lu Ji, Yang Liu, Wu Yirong
    2023, 7 (3): 142-154.  DOI: 10.11925/infotech.2096-3467.2022.0348

    [Objective] This paper builds a graph neural network model integrating medical domain knowledge (GraphModel-Dict) to identify named entities in medical texts. [Methods] First, we used the graph neural network structure to integrate domain knowledge, mapping the raw text data and domain dictionaries to nodes of different categories. We also updated the nodes of raw text data with a Gated Recurrent Unit (GRU) to obtain their semantic representation with domain knowledge. Then, we used the representation of the text data nodes as input to a Bidirectional Long Short-Term Memory network (BiLSTM). We predicted the labels and generated recognition results with a Conditional Random Field (CRF) model. Finally, we evaluated GraphModel-Dict’s performance on two datasets. [Results] We examined GraphModel-Dict on a manually annotated dataset of 3,100 Chinese ultrasound examination reports on breast cancer. The model’s precision, recall, and F1-score for entity recognition reached 96.91%, 97.52%, and 97.22%, respectively. Furthermore, GraphModel-Dict showed better recognition performance for entity types with fewer samples or diverse expressions. On the CCKS2020 medical dataset, the F1-score of GraphModel-Dict increased by at least 1.39% compared to the baseline model. [Limitations] More research is needed to examine the effectiveness of the proposed model in other fields. [Conclusions] Integrating domain knowledge can improve the effectiveness of named entity recognition, which benefits medical information mining and clinical research.
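
    As a quick check on how the three reported metrics relate, the F1-score is the harmonic mean of precision and recall. The sketch below applies that standard formula to the rounded precision and recall quoted above; the function name is illustrative, and whether the paper micro- or macro-averages over entity types is not stated in the abstract.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (both on the same scale)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Rounded precision and recall (in percent) as quoted in the abstract.
f1 = f1_score(96.91, 97.52)  # about 97.21
```

    The result is close to the reported 97.22%; a gap of this size can arise from rounding of the inputs or from the averaging method used over entity types.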

  Copyright © 2016 Data Analysis and Knowledge Discovery   Tel/Fax:(010)82626611-6626,82624938   E-mail:jishu@mail.las.ac.cn