Data Analysis and Knowledge Discovery  2021, Vol. 5 Issue (8): 132-144    DOI: 10.11925/infotech.2096-3467.2020.1221
Extracting Knowledge Elements of Sci-Tech Literature Based on Artificial and Machine Features
Chai Qingfeng1,2, Shi Linyan2, Mei Shan2, Xiong Haitao2, He Huixin1
1College of Computer Science and Technology, Huaqiao University, Quanzhou 361021, China
2Tongfang Knowledge Network Technology Co., Ltd. (Beijing), Beijing 100192, China
Abstract  

[Objective] This paper merges the artificial and machine features of scientific and technological literature using deep learning methods, aiming to improve the efficiency of knowledge element extraction. [Methods] We constructed 26 artificial features based on the characteristics of this literature, mainly at the text, sentence and word levels. We then combined these features with Word2Vec, one-hot and other machine features using LSTM, CNN and BERT models to extract knowledge elements. [Results] The accuracy of knowledge element extraction with vertical feature merging reached 0.91, which is 6 percentage points higher than most traditional methods. [Limitations] The deep learning model needs to be optimized to process larger amounts of data. [Conclusions] The proposed method can effectively improve the results of knowledge element extraction.

Key words: Knowledge Element Extraction; Artificial Feature; Machine Feature; Sci-Tech Literature
Received: 06 December 2020      Published: 15 September 2021
ZTFLH:  G250  
Fund: National Social Science Fund of China (19BXW110)
Corresponding Authors: He Huixin ORCID: 0000-0002-1764-6727     E-mail: huixinhe@qq.com

Cite this article:

Chai Qingfeng, Shi Linyan, Mei Shan, Xiong Haitao, He Huixin. Extracting Knowledge Elements of Sci-Tech Literature Based on Artificial and Machine Features. Data Analysis and Knowledge Discovery, 2021, 5(8): 132-144.

URL:

https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/10.11925/infotech.2096-3467.2020.1221     OR     https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/Y2021/V5/I8/132

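Dimension 1 | Dimension 2 | Explanation | Calculation
paper_feature | sen_order | Order of the sentence within the full body text | -
paper_feature | sense_rate | Ratio of the sentence's order to the number of sentences in the article | (sen_order + 1) / paper_ct, where paper_ct is the total number of sentences in the full body text
paper_feature | paper_ct | Total number of sentences in the full body text of the paper containing the sentence | -
paper_feature | yeshu | Number of pages of the paper containing the sentence | -
paper_feature | zzgs | Number of Chinese and English authors of the paper containing the sentence | -
paper_feature | yxyz | Impact factor value of the paper containing the sentence | -
sense_feature | num_top_ci | Number of conclusion feature words in the sentence. The conclusion feature words are 1,831 manually compiled expressions, such as "本文推测" (this paper speculates) and "本研究中我们还发现" (in this study we also found) | -
sense_feature | chinese_word_len | Number of Chinese characters in the sentence | -
sense_feature | word_rate | Ratio of the number of Chinese characters in the sentence to the total number of Chinese characters in the full body text of the paper | $\mathrm{word\_rate} = \begin{cases} \dfrac{\mathrm{chinese\_word\_len}}{\mathrm{word\_ct\_all}}, & \mathrm{word\_ct\_all} > 0 \\ 0, & \mathrm{word\_ct\_all} \le 0 \end{cases}$, where word_ct_all is the total number of Chinese characters in the full body text of the paper containing the sentence
sense_feature | other_lange_len | Number of non-Chinese characters in the sentence | sen_len - chinese_word_len, where sen_len is the sentence length
sense_feature | word_rate_jielun | Ratio of the number of Chinese characters in a conclusion-paragraph sentence to the number of Chinese words in all sentences of the full text | $\mathrm{word\_rate\_jielun} = \begin{cases} \dfrac{\mathrm{chinese\_word\_len}}{\mathrm{word\_ct\_all\_jielun}}, & \mathrm{word\_ct\_all\_jielun} > 0 \\ 0, & \mathrm{word\_ct\_all\_jielun} \le 0 \end{cases}$, where $\mathrm{word\_ct\_all\_jielun} = \begin{cases} 10\,000, & \mathrm{word\_ct\_all} = 0 \\ \mathrm{word\_ct\_all}, & \text{otherwise} \end{cases}$
sense_feature | num_quotes | Number of quotation marks in the sentence | -
sense_feature | Isyin | Whether the sentence contains a citation | 1 if present, otherwise 0
sense_feature | sen_len | Sentence length | -
sense_feature | num_semic | Number of semicolons in the sentence | -
sense_feature | num_comma | Number of commas in the sentence | -
sense_feature | num_colon | Number of periods in the sentence | -
sense_feature | num_caesura | Number of enumeration commas ("、") in the sentence | -
word_feature | len_kword_same | Number of words shared by the word set $W_j$ and the Top 200 high-CHI words | $\lvert W_j \cap \mathrm{CHI\_W} \rvert$
word_feature | len_tword_same | Number of words shared by the set $W\_all_j$ (the sentence after Jieba word segmentation) and the set $W\_t_j$ of words in the title | $\lvert W\_all_j \cap W\_t_j \rvert$
word_feature | len_key_word_zwgjc_same | Number of words shared by the set $W_j$ and the set $W\_k_j$ of keywords defined by the article's authors | $\lvert W_j \cap W\_k_j \rvert$
jielun_feature | jielunci_first | Whether the sentence was located in the conclusion paragraph by matching a keyword at the beginning of the sentence | 1 if yes, 0 if no
jielun_feature | jielunci_in | Whether the sentence was located in the conclusion paragraph by matching a keyword in the middle of the sentence | 1 if yes, 0 if no
jielun_feature | order_jielun | Order of the sentence within the conclusion paragraph | -
jielun_feature | is_jielunduan | Whether the sentence is in the conclusion paragraph | 1 if yes, 0 if no
CHI_vec | sense_CHI | After Jieba word segmentation and removal of some stop words (stword), the CHI values of the words whose CHI value ranks in the Top 200 | -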
Description and Calculation of Artificial Features
LSTM_CNN Process of Knowledge Element Extraction System
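Data marking example
<annotated_full_text>=1 Clinical data <research_material>The patient was a 77-year-old man with a history of psoriasis of more than 30 years. He was diagnosed in 1974 and was hospitalized for systemic treatment without effect. Hot-spring therapy later cleared the lesions on his trunk and limbs, but they soon recurred, and he has received no further treatment since. On 9 May 2005, a B-mode ultrasound during a physical examination showed a suspicious space-occupying lesion in the liver (the patient had a history of hepatic hemangioma), so liver CT and liver MRI were performed. The CT examination on 11 May indicated that multiple giant hepatic hemangiomas were highly likely, and MRI was recommended for a definitive diagnosis. One week after the CT examination, the patient noticed that the psoriatic lesions he had had for 30 years had improved markedly. The MRI examination on 14 June showed multiple cavernous hemangiomas of the liver, with a maximum diameter of 13 cm. Two to three days after the MRI examination, the patient's psoriatic lesions improved further. On inquiry, he had routinely taken medication for geriatric conditions for many years, denied having recently added any new drug, and had not eaten any unusual food. Physical examination: the erythema, scales and scratch marks of varying severity on the patient's face, neck, trunk and limbs had essentially disappeared, leaving only the obvious pigmentation of the regression stage. The patient felt well and had no itching.</research_material>2 Discussion <research_conclusion>Psoriasis, commonly known in Chinese as niupixuan, is a common, chronic and relapsing erythematous scaly skin disease. It is itchy, and the skin lesions are mainly erythema covered with multiple layers of silvery-white scales. In the acute stage the lesions are bright red with a surrounding red halo, and scratching produces obvious punctate bleeding. From the stationary stage to the regression stage, the halo fades, the scales become finer, and the lesions shrink until they disappear, finally leaving pale white or pigmented macules. There is so far no radical cure for the disease; treatment mostly uses vitamins, antibiotics, corticosteroids, immunosuppressants, traditional Chinese medicine and physical therapy. Some drug therapies act quickly, but most patients relapse soon after stopping the drug, and some develop serious adverse drug reactions, heavy-metal poisoning, or induced diabetes, hypertension, gastric bleeding and so on. The physical therapy mainly advocated in recent years is short-wave ultraviolet photochemotherapy (PUVB), which avoids the toxic side effects of drug treatment but is still prone to relapse and cannot effect a cure.<research_prospect>The marked improvement of the psoriatic lesions in this case was unrelated to the treatments above. Apart from spontaneous recovery promoted by factors such as season, climate and mood, whether it was an unexpected therapeutic result of undergoing CT and MRI remains to be explored.</research_prospect></research_conclusion>This case is presented for observation by clinical colleagues.<research_question><research_object>Improvement of psoriatic lesions after CT and MRI examinations</research_object>: 1 case</research_question>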
Example of Data Mark
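To show how text annotated with paired tags like those in the example could be turned into labelled spans, here is a minimal Python sketch. It is an assumption about a possible preprocessing step, not the authors' pipeline. The function `extract_spans`, the tag names and the sample string are all hypothetical; the real tag names would be passed in through `tags`.

```python
import re


def extract_spans(text, tags):
    """Return {tag: [span, ...]} for every <tag>...</tag> pair found in text.

    Tags nested inside a span are stripped from the returned text.
    Limitation: a tag nested inside another tag of the SAME name is not handled.
    """
    spans = {}
    for tag in tags:
        name = re.escape(tag)
        pattern = re.compile(rf"<{name}>(.*?)</{name}>", re.DOTALL)
        # Remove any inner markup so that only the plain sentence text remains.
        spans[tag] = [re.sub(r"</?[^<>]+>", "", body) for body in pattern.findall(text)]
    return spans


if __name__ == "__main__":
    # Hypothetical miniature example that mimics the nesting in the marked-up document.
    sample = (
        "<conclusion>Drug therapy often relapses."
        "<prospect>The cause remains to be explored.</prospect></conclusion>"
        "<question><object>Lesion improvement</object>, 1 case</question>"
    )
    for tag, found in extract_spans(sample, ["conclusion", "prospect", "question", "object"]).items():
        print(tag, found)
```

Each extracted span could then be split into sentences and given a binary label (for example, 1 for sentences inside a conclusion span and 0 otherwise), which matches the 0/1 labels used in the conclusion-sentence example.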
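Sentence | True label | Predicted label
The authors modified the surface of a carbon powder electrode with self-made PMI to produce a modified electrode; this electrode selectively complexes silver ions, can enrich silver on the electrode surface, and has high sensitivity and selectivity. | 1 | 0
The results show that the pure carbon paste electrode cannot enrich Ag+ and no current peak appears; when the mass fraction of the modifier PMI is 2.0%, the peak current of Ag+ is the highest, which shows the PMI mass fraction. | 1 | 1
At present, the method commonly used to determine platinum content is stannous chloride colorimetry; with this method it is difficult to eliminate the interference of coexisting ions, and the procedure is tedious and time-consuming. | 0 | 0
An acetic acid-sodium acetate buffer solution was used to control the acidity of the system; the absorbance was highest at pH 5.3 with a dosage of 2 mL. | 1 | 1
The method has good selectivity and can be used to determine moroxydine hydrochloride in serum and human urine. | 1 | 1
The results show that HAc-NaAc works best. | 1 | 1
Previous studies have shown that inserting organic ligands capable of complexing heavy metal ions between LDH layers to form hybrids can improve the ability of LDHs to adsorb heavy metal ions. | 0 | 1
The h and k2 values of EDTA-LDH are markedly higher than those of Mg-Al LDH, indicating that the adsorption rate of Pb2+ on the EDTA-LDH hybrid is markedly higher than on Mg-Al LDH, which can be attributed to the enhancing effect of EDTA. | 1 | 1
At the same ce, the adsorption capacity of the EDTA-LDH nanohybrid is markedly higher than that of Mg-Al LDH, indicating that EDTA has a synergistic enhancing effect. | 1 | 1
In this paper, Mg-Al LDH and its ethylenediaminetetraacetic acid (EDTA)-LDH hybrid were synthesized and their adsorption of Pb2+ was investigated, with a view to providing a theoretical basis for developing efficient sewage treatment agents and remediation agents for contaminated soil. | 0 | 1
Moroxydine hydrochloride (Bingduling, ABOB), whose chemical name is N-N-(2-guanidino-ethylimino)-morpholine hydrochloride. | 0 | 0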
Example of Conclusion Sentence Extraction in Model 3
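The true and predicted labels in this example can be used to illustrate how P, R and F1 are obtained, assuming these denote the standard precision, recall and F1 score with label 1 (conclusion sentence) as the positive class. The sketch below uses only the 11 rows of the example table, so the resulting figures describe this excerpt and not the paper's evaluation set. The function name `precision_recall_f1` is hypothetical.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Precision, recall and F1 for one positive class, computed from paired label lists."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)  # true positives
    fp = sum(1 for t, p in pairs if t != positive and p == positive)  # false positives
    fn = sum(1 for t, p in pairs if t == positive and p != positive)  # false negatives
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Labels copied row by row from the example table (true label, predicted label).
y_true = [1, 1, 0, 1, 1, 1, 0, 1, 1, 0, 0]
y_pred = [0, 1, 0, 1, 1, 1, 1, 1, 1, 1, 0]

p, r, f1 = precision_recall_f1(y_true, y_pred)
print(f"P={p:.2f} R={r:.2f} F1={f1:.2f}")  # prints "P=0.75 R=0.86 F1=0.80"
```

On these rows there are 6 true positives, 2 false positives and 1 false negative, so P = 6/8, R = 6/7 and F1 = 12/15.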
P Values of Six Models with epoch
R Values of Six Models with epoch
F1 Values of Six Models with epoch
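Model | Characteristics | Shortcomings and improvements
Model 1 | Uses only the Word2Vec machine features of the embedding layer, with no feature fusion | Shortcoming: the features are of a single type; the P, R and F1 values are all around 0.85 and improve little as the epoch increases. Improvement: add artificial features and carry out deep learning after feature fusion.
Model 2 | Artificial features vertically fused with the traditional one-hot machine features of the embedding layer | Shortcoming: uses the traditional one-hot machine feature vector representation; the P, R and F1 values are around 0.90, but the performance drops slightly as the epoch increases. Improvement: replace the traditional one-hot representation with Word2Vec, as in Model 3, so that the features are vectorized more accurately.
Model 3 | Artificial features vertically fused with the Word2Vec machine features of the embedding layer; the training results keep improving as the epoch increases | Shortcoming: the artificial features cover several dimensions but all have a weight of 1, so differences in importance still need to be considered. Improvement: this model performs best among the six; later research could add a representation of differences in indicator importance.
Model 4 | Artificial features horizontally fused with the Word2Vec machine features of the embedding layer | Shortcoming: appending the artificial feature vector to the end of each word vector greatly increases the horizontal length of the vector, and the P, R and F1 values are poor, at around 0.85. Improvement: switch from horizontal fusion to vertical fusion, as in Model 3.
Model 5 | Uses only the machine features output by BERT, with no feature fusion | Shortcoming: the features are of a single type, the model is deeper, training is slow and the results are poor. Improvement: add artificial feature fusion; because of the vector length limit of the BERT pre-trained model, only horizontal fusion can be considered.
Model 6 | Artificial features horizontally fused with the machine features output by BERT | Shortcoming: the model is deeper and horizontal feature fusion greatly increases the vector length, yet the training results improve little. Improvement: reduce the model depth; the BERT vectors could be stored in advance and read directly later.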
Effect Comparison of Six Models
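The difference between the two fusion directions compared in this table can be sketched in a few lines of Python. Horizontal fusion follows the description given for Model 4: the artificial feature vector is appended to the end of every word vector. This page does not spell out vertical fusion, so the version below is only one plausible reading and should be treated as an assumption: the artificial features are added as extra rows under the word-vector matrix, zero-padded to the embedding width. The function names and the toy dimensions are hypothetical.

```python
def horizontal_fusion(word_vectors, artificial):
    """Append the artificial feature vector to the end of every word vector.

    Input shape (T, D) with K artificial features gives output shape (T, D + K).
    """
    return [list(vec) + list(artificial) for vec in word_vectors]


def vertical_fusion(word_vectors, artificial):
    """ASSUMED reading of vertical fusion: stack artificial features as extra rows.

    The K artificial features are cut into chunks of width D, the last chunk is
    zero-padded, and the chunks are appended below the T word vectors.
    Input shape (T, D) gives output shape (T + ceil(K / D), D).
    """
    dim = len(word_vectors[0])
    extra_rows = []
    for start in range(0, len(artificial), dim):
        chunk = list(artificial[start:start + dim])
        extra_rows.append(chunk + [0.0] * (dim - len(chunk)))
    return [list(vec) for vec in word_vectors] + extra_rows


if __name__ == "__main__":
    # Toy sizes: 3 words, embedding width 4, 6 artificial features (the paper uses 26).
    words = [[0.1, 0.2, 0.3, 0.4], [0.5, 0.6, 0.7, 0.8], [0.9, 1.0, 1.1, 1.2]]
    artificial = [1.0, 0.0, 3.0, 0.5, 2.0, 7.0]

    h = horizontal_fusion(words, artificial)
    v = vertical_fusion(words, artificial)
    print("horizontal:", len(h), "x", len(h[0]))
    print("vertical:  ", len(v), "x", len(v[0]))
```

Under this reading, horizontal fusion widens every time step by K values, which matches the table's remark that it greatly increases the vector length, while vertical fusion keeps the embedding width and only lengthens the sequence by a few rows.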