Data Analysis and Knowledge Discovery  2017, Vol. 1 Issue (4): 57-66     https://doi.org/10.11925/infotech.2096-3467.2017.04.07
Research Paper
Identifying Semantic Relations of Clusters Based on Linked Data
Cui Jiawang1,2, Li Chunwang1
1National Science Library, Chinese Academy of Sciences, Beijing 100190, China
2University of Chinese Academy of Sciences, Beijing 100049, China
Abstract

[Objective] This paper investigates models and techniques for revealing the semantic relations among subject terms within a cluster on the basis of linked data. [Methods] We searched Google Scholar, Springer, CNKI and other sources for literature on the topic, reviewed current research on cluster analysis and semantic relation discovery, constructed a linked-data-based model for revealing the semantic relations of clusters, and tested the model experimentally. [Results] The experiments indicate that linked data can effectively reveal semantic relations among subject terms and compensate for the semantic shortcomings of traditional co-word cluster analysis. [Limitations] Owing to the limits of the experimental data, the relations identified so far are restricted to hierarchical (broader/narrower), class-instance, and simple related relations, and the effect of linked data quality on the results was not considered. [Conclusions] The proposed model can effectively reveal semantic relations among subject terms and offers a new way to interpret and analyze co-word clustering results.

Key words: Linked Data    Co-word Cluster Analysis    Cluster    Semantic Relations Revealing Model
Received: 2017-02-16      Published: 2017-05-24
Chinese Library Classification (ZTFLH):  G25
Cite this article:
Cui Jiawang, Li Chunwang. Identifying Semantic Relations of Clusters Based on Linked Data[J]. Data Analysis and Knowledge Discovery, 2017, 1(4): 57-66.
Link to this article:
http://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/10.11925/infotech.2096-3467.2017.04.07      or      http://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/Y2017/V1/I4/57
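Figure: Schematic of association relations between subject-term nodes
Figure: Schematic of indirect association
Figure: Schematic of association through the nearest common ancestor node
Figure: Schematic of association through the nearest common descendant node
Figure: Framework for revealing cluster semantics based on linked data
Figure: Trend in the number of association paths as path length varies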
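The abstract and the figure captions describe association paths between subject terms in a linked-data graph, and a path count that varies with path length. The following is a minimal sketch of that idea in Python, not the authors' implementation: it enumerates simple paths between two terms in a small in-memory triple set and counts them by length. The triples, the shortened predicate names, and the function names `build_index`, `find_paths`, and `count_by_length` are all assumptions made for illustration, as is the choice to follow edges in both directions.

```python
from collections import Counter, defaultdict

# Hypothetical toy triples (subject, predicate, object); illustrative only,
# not extracted from DBpedia.
TRIPLES = [
    ("Cloning", "wikiPageWikiLink", "PCR"),
    ("Cloning", "wikiPageWikiLink", "Cloning_vector"),
    ("Cloning_vector", "wikiPageWikiLink", "PCR"),
    ("Cloning", "subject", "Category:Molecular_biology"),
    ("PCR", "subject", "Category:Molecular_biology"),
]


def build_index(triples):
    """Index every triple from both ends so a path may follow an edge either way."""
    index = defaultdict(list)
    for s, p, o in triples:
        index[s].append((p, "->", o))  # edge followed forwards
        index[o].append((p, "<-", s))  # edge followed backwards
    return index


def find_paths(index, source, target, max_len):
    """Enumerate simple paths from source to target with at most max_len edges."""
    results = []
    stack = [(source, [], {source})]
    while stack:
        node, path, seen = stack.pop()
        if node == target and path:
            results.append(path)
            continue
        if len(path) == max_len:
            continue
        for pred, direction, nxt in index[node]:
            if nxt not in seen:
                step = (node, pred, direction, nxt)
                stack.append((nxt, path + [step], seen | {nxt}))
    return results


def count_by_length(paths):
    """Return a plain dict mapping path length (in edges) to number of paths."""
    return dict(Counter(len(p) for p in paths))
```

Calling `find_paths(build_index(TRIPLES), "Cloning", "PCR", 2)` returns one direct path and two two-edge paths for this toy data. On a real dataset the number of paths typically grows quickly with `max_len`, so the bound is what keeps the search tractable.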
| Association path | Importance index | Type |
| --- | --- | --- |
| $\{\langle\text{Cloning}\rangle \xrightarrow{\text{http://dbpedia.org/ontology/wikiPageWikiLink}} \langle\text{PCR}\rangle\}$ | 0.001 | DR |
| $\{\langle\text{Cloning}\rangle \xrightarrow{\text{wikiPageWikiLink}} \langle\text{Cloning}\_\text{vector}\rangle \xrightarrow{\text{wikiPageWikiLink}} \langle\text{PCR}\rangle\}$ | 0.00000072 | IR |
| $\{\langle\text{Cloning}\rangle \xrightarrow{\text{wikiPageWikiLink}} \langle\text{Bisulfite}\_\text{sequencing}\rangle \xrightarrow{\text{wikiPageWikiLink}} \langle\text{PCR}\rangle\}$ | 0.00000072 | IR |
| $\{\langle\text{Cloning}\rangle \xrightarrow{\text{http://www.w3.org/2004/02/skos/core}\#\text{broader}} \langle\text{Category:Cloning}\rangle \xrightarrow{\text{http://www.w3.org/2004/02/skos/core}\#\text{broader}} \langle\text{Category:Biotechnology}\rangle \xleftarrow{\text{http://purl.org/dc/terms/subject}} \langle\text{PCR}\rangle\}$ | 0.00118999 | LCAR |
| $\{\langle\text{Cloning}\rangle \xrightarrow{\text{wikiPageWikiLink}} \langle\text{Molecular}\_\text{cloning}\rangle \xrightarrow{\text{http://purl.org/dc/terms/subject}} \langle\text{Category:Molecular}\_\text{biology}\rangle \xleftarrow{\text{http://purl.org/dc/terms/subject}} \langle\text{PCR}\rangle\}$ | 0.000260651 | LCAR |
| $\{\langle\text{Cloning}\rangle \xleftarrow{\text{http://purl.org/dc/terms/subject}} \langle\text{Category:Molecular}\_\text{biology}\rangle \xrightarrow{\text{http://purl.org/dc/terms/subject}} \langle\text{PCR}\rangle\}$ | 0.00720822 | LCDR |
| $\{\langle\text{Cloning}\rangle \xleftarrow{\text{rdf:type}} \langle\text{http://dbpedia.org/dbtax/Technique}\rangle \xrightarrow{\text{rdf:type}} \langle\text{PCR}\rangle\}$ | 0.00139680 | LCDR |

Table: Composite importance index results for selected association paths
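As a small illustration of how such a table might be used, the sketch below ranks a few of the rows above by their importance index and picks the highest-scoring path for each type. It is only a sketch under stated assumptions: the path labels are shortened (predicates omitted), the names `ROWS`, `rank_paths`, and `best_path_per_type` are hypothetical, and the way the index itself is computed is not shown on this page and is not reproduced here.

```python
# A few rows copied from the table: (shortened path label, importance index, type).
# Path labels omit predicates for brevity; this is an illustrative subset.
ROWS = [
    ("Cloning -> PCR", 0.001, "DR"),
    ("Cloning -> Cloning_vector -> PCR", 0.00000072, "IR"),
    ("Cloning -> Category:Cloning -> Category:Biotechnology <- PCR", 0.00118999, "LCAR"),
    ("Cloning -> Molecular_cloning -> Category:Molecular_biology <- PCR", 0.000260651, "LCAR"),
    ("Cloning <- Category:Molecular_biology -> PCR", 0.00720822, "LCDR"),
    ("Cloning <- dbtax/Technique -> PCR", 0.00139680, "LCDR"),
]


def rank_paths(rows):
    """Sort paths from highest to lowest importance index."""
    return sorted(rows, key=lambda row: row[1], reverse=True)


def best_path_per_type(rows):
    """Keep, for each path type, the path with the highest importance index."""
    best = {}
    for path, score, kind in rows:
        if kind not in best or score > best[kind][1]:
            best[kind] = (path, score)
    return best
```

For these rows, the top-ranked path overall is the LCDR path through `Category:Molecular_biology`, and the two-step IR path ranks last.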
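| No. | Property | Frequency | Meaning | Semantic relation |
| --- | --- | --- | --- | --- |
| 1 | http://dbpedia.org/ontology/wikiPageWikiLink | 172,300,574 | Link information from the corresponding Wikipedia page | Related |
| 2 | http://www.w3.org/1999/02/22-rdf-syntax-ns#type | 66,418,990 | Label (type) information of the resource | Class-instance |
| 3 | http://www.w3.org/2002/07/owl#sameAs | 40,637,907 | Points to equivalent resources | Equivalence |
| 4 | http://dbpedia.org/property/wikiPageUsesTemplate | 36,772,939 | Template used in RDF extraction | Related |
| 5 | http://dbpedia.org/ontology/wikiPageWikiLinkText | 23,809,294 | Text of Wikipedia hyperlinks | Related |
| 6 | http://purl.org/dc/terms/subject | 22,673,220 | Subject information of the resource | Class-instance |

Table: High-frequency DBpedia properties (partial)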
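The property table above can be read as a lookup from predicate URI to semantic relation type. A minimal Python sketch of that lookup follows. The mapping values are taken from the table, while the names `PROPERTY_RELATIONS`, `relation_for`, and `label_path`, and the `"unknown"` fallback for properties outside this partial list, are assumptions made for illustration.

```python
# Mapping copied from the table of high-frequency DBpedia properties (partial).
PROPERTY_RELATIONS = {
    "http://dbpedia.org/ontology/wikiPageWikiLink": "related",
    "http://www.w3.org/1999/02/22-rdf-syntax-ns#type": "class-instance",
    "http://www.w3.org/2002/07/owl#sameAs": "equivalence",
    "http://dbpedia.org/property/wikiPageUsesTemplate": "related",
    "http://dbpedia.org/ontology/wikiPageWikiLinkText": "related",
    "http://purl.org/dc/terms/subject": "class-instance",
}


def relation_for(property_uri, default="unknown"):
    """Return the semantic relation type for a property, or a fallback label."""
    return PROPERTY_RELATIONS.get(property_uri, default)


def label_path(predicates):
    """Label each predicate along an association path with its relation type."""
    return [relation_for(p) for p in predicates]
```

For example, `label_path` applied to a path that uses `wikiPageWikiLink` and then `dc/terms/subject` yields `["related", "class-instance"]`. A property that the partial table does not list, such as `skos:broader`, falls through to `"unknown"` rather than being guessed.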