Data Analysis and Knowledge Discovery  2019, Vol. 3 Issue (10): 37-46    DOI: 10.11925/infotech.2096-3467.2019.0252
An Interactive Analysis Framework for Multivariate Heterogeneous Graph Data Management System
Zihao Zhao1,2, Zhihong Shen1
1Computer Network Information Center, Chinese Academy of Sciences, Beijing 100190, China
2University of Chinese Academy of Sciences, Beijing 100049, China
Abstract  

[Objective] This paper proposes an open, scalable interactive analysis framework that shields applications from the differences among heterogeneous graph data models, management systems, interfaces and protocols, and provides online interactive analysis services for graph data. [Methods] By abstracting diverse analysis requirements and heterogeneous service interfaces, we design an open, scalable interaction protocol, and build on it an interactive framework that implements the interaction modules. [Results] The framework is well abstracted, effectively shields the heterogeneity of graph data management systems such as Neo4j and Jena, and provides a sound foundation for front-end applications. [Limitations] The framework still needs to be optimized and tuned for large-scale data. [Conclusions] The interactive analysis framework for heterogeneous knowledge graphs is of practical value and merits wider adoption.

Key words: Graph Data; Interactive Analysis; Interactive Framework
Received: 05 March 2019      Published: 25 November 2019
ZTFLH:  TP393  
Corresponding Authors: Zhihong Shen     E-mail: bluejoe@cnic.cn

Cite this article:

Zihao Zhao, Zhihong Shen. An Interactive Analysis Framework for Multivariate Heterogeneous Graph Data Management System. Data Analysis and Knowledge Discovery, 2019, 3(10): 37-46.

URL:

https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/10.11925/infotech.2096-3467.2019.0252     OR     https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/Y2019/V3/I10/37

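| Query language | Applicable model | Feature comparison |
| --- | --- | --- |
| TRIPLE | RDF model | Supports only simple graph-pattern queries; high learning cost. |
| RQL | RDF model | Introduces aggregation operations; syntax fairly similar to SQL; somewhat high learning cost. |
| SeRQL | RDF model | Fairly close to the SPARQL standard; low learning cost. |
| SPARQL | RDF model | Recommended by the W3C as the standard; SQL-like syntax; low learning cost. |
| GraphQL | Property graph model | API-style query language; flexible, but relatively high learning cost. |
| Gremlin | Property graph model | Turing complete and similar to a programming language; fairly flexible, but high learning cost; mainly used for traversal. |
| Cypher | Property graph model | SQL-like syntax; relatively mature; low learning cost. |
| PGQL | Property graph model | SQL-like syntax; relatively low learning cost, but currently has few users. |
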
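To make the gap between the two model families compared above more concrete, the sketch below phrases one request, "fetch the edges and neighbours of a node", once in Cypher and once in SPARQL. The function names, the `uid` node property and the idea of templating queries from Python are illustrative assumptions, not something taken from the paper; only the query syntax itself is standard.

```python
from typing import Dict, Tuple


def neighbours_cypher(uid: int) -> Tuple[str, Dict[str, int]]:
    """Build a parameterized Cypher query for a node's edges and neighbours."""
    # `uid` is a hypothetical node property; $uid is a Cypher query parameter.
    # The undirected pattern -[r]- matches both incoming and outgoing edges.
    query = "MATCH (n)-[r]-(m) WHERE n.uid = $uid RETURN r, m"
    return query, {"uid": uid}


def neighbours_sparql(node_uri: str) -> str:
    """Build a SPARQL query for the triples that touch a resource."""
    # RDF has no undirected pattern, so outgoing and incoming triples
    # are matched separately and combined with UNION.
    return (
        "SELECT ?p ?o WHERE { "
        f"{{ <{node_uri}> ?p ?o }} UNION {{ ?o ?p <{node_uri}> }} "
        "}"
    )
```

The same front-end action therefore needs two differently shaped queries, and two differently shaped result sets, depending on the backend.
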
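| Graph data management system | Data model | Supported query languages | Feature comparison |
| --- | --- | --- | --- |
| Neo4j | Property graph model | Cypher, Gremlin | Very mature, with a good ecosystem, but does not support data sharding. |
| Titan | Property graph model | Gremlin | Can attach storage back ends such as HBase; supports data sharding. |
| InfiniteGraph | Property graph model | Gremlin | Supports data sharding; the free edition supports only 1 million nodes. |
| Cosmos DB | Property graph model | Gremlin | Cloud-based; supports data sharding, but is not open source. |
| AllegroGraph | RDF model | SPARQL | Supports data sharding; the free edition supports 50 million triples. |
| Jena | RDF model | SPARQL | Fully open source, relatively mature, and convenient to develop with. |
| Virtuoso | RDF model | SPARQL | Implements RDF management on relational tables; performance is insufficient on large-scale data. |
| Neptune | Property graph / RDF | Gremlin / SPARQL | Cloud-based; supports data sharding, but is not open source. |
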
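The abstract describes the framework as shielding front-end applications from the heterogeneity of systems like those listed above. One common way to do that is an adapter layer, sketched here under stated assumptions: the class names, the in-memory data layouts and the result shape are all hypothetical stand-ins, and no real Neo4j or Jena API is called.

```python
from abc import ABC, abstractmethod
from typing import Dict, Iterable, List, Tuple


class GraphAdapter(ABC):
    """Hypothetical common interface that front-end code would call."""

    @abstractmethod
    def get_neighbours(self, node_id: str) -> List[Dict[str, str]]:
        """Return one {'edge': ..., 'node': ...} dict per adjacent edge."""


class PropertyGraphAdapter(GraphAdapter):
    """Stand-in for a property-graph backend: relationships stored as dicts."""

    def __init__(self, relationships: Iterable[Dict[str, str]]):
        self._rels = list(relationships)

    def get_neighbours(self, node_id: str) -> List[Dict[str, str]]:
        out = []
        for rel in self._rels:
            if rel["start"] == node_id:
                out.append({"edge": rel["type"], "node": rel["end"]})
            elif rel["end"] == node_id:
                out.append({"edge": rel["type"], "node": rel["start"]})
        return out


class RdfAdapter(GraphAdapter):
    """Stand-in for an RDF backend: (subject, predicate, object) triples."""

    def __init__(self, triples: Iterable[Tuple[str, str, str]]):
        self._triples = list(triples)

    def get_neighbours(self, node_id: str) -> List[Dict[str, str]]:
        out = []
        for s, p, o in self._triples:
            if s == node_id:
                out.append({"edge": p, "node": o})
            elif o == node_id:
                out.append({"edge": p, "node": s})
        return out
```

Code written against `GraphAdapter` receives the same result shape from either stand-in, which is the property the framework aims for across real backends.
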
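| Graph visualization and analysis system | Supported data sources | Feature comparison |
| --- | --- | --- |
| RelFinder | RDF datasets that support SPARQL | Offers some analysis capability, but supports visualization of RDF data only. |
| Gephi | File formats such as CSV and GML | Strong analysis capability; insufficiently integrated with mainstream data management systems. |
| Bloom | Neo4j | Visualization tool provided by Neo4j; analysis functions require combining it with the Cypher language. |
| Vis.js | Gephi / DOT language | JavaScript library, easy to deploy; cannot display large-scale graph data, and graph analysis functions are insufficient. |
| Alchemy.js | JSON | JavaScript library, easy to deploy; provides only node and edge visualization. |
| Node-centric RDF Graph Visualization | RDF | Supports RDF data only; displays high-degree nodes poorly. |
| PGV | RDF | Supports RDF data only; displays nodes in a ring layout; analysis functions are insufficient. |
| WebVOWL | RDF | Supports only RDF data files of up to 5 MB. |
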
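| Interface category | Interface name | Function | Input | Output |
| --- | --- | --- | --- | --- |
| 1 Connection | 1.1 Connect | Connect to the current data source and initialize it | | Metadata of the current data source (scale, communities, etc.) |
| 2 Browsing | 2.1 LoadGraph | Load the information of the current graph | Current position | All vertices and edges in the graph |
| 2 Browsing | 2.2 GetCommunityData | Load information about the mined communities | | Community outlines and the vertices they contain |
| 2 Browsing | 2.3 GetNodesInfo | Get the descriptive information of nodes | Node id | Node description |
| 2 Browsing | 2.4 GetNodeCategories | Get the node categories (labels) | | Categories and their descriptions |
| 3 Exploration | 3.1 GetNeighbours | Get a node's adjacent edges and neighbour nodes | Node id | The node's adjacent edges and neighbour nodes |
| 4 Entity matching | 4.1 FilterNodesByCategory | Filter a node set by node category | Array of node ids, specified category | The nodes in the array that belong to the specified category |
| 4 Entity matching | 4.2 Search | Search for nodes by keyword and result limit | Keyword, limit | Nodes whose specified field contains the keyword; the number of nodes returned does not exceed the limit |
| 5 Path query | 5.1 FindRelations | Find paths between the start and end nodes that do not exceed the maximum depth | Start and end node ids, maximum depth | Query task id |
| 5 Path query | 5.2 GetMoreRelations | Fetch more results from the buffer of a path query task | Query task id | The paths found by the query |
| 5 Path query | 5.3 StopFindRelations | Stop a query task | Query task id | Query task id and its status |
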
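The three path-query interfaces form a small task protocol: FindRelations returns a task id, GetMoreRelations pulls further results for that task, and StopFindRelations ends it. The following is a minimal in-memory sketch of that flow, not the authors' implementation. It assumes a directed adjacency dict and replaces the buffered search with a lazy Python generator; the class name, method signatures and the `"stopped"` status string are hypothetical.

```python
import itertools
from typing import Dict, Iterator, List


class RelationFinder:
    """Toy in-memory version of the three path-query interfaces."""

    def __init__(self, adjacency: Dict[str, List[str]]):
        self._adj = adjacency                      # node id -> successor ids
        self._tasks: Dict[int, Iterator[List[str]]] = {}
        self._ids = itertools.count(1)             # task id generator

    def _paths(self, node: str, end: str, max_depth: int,
               path: List[str]) -> Iterator[List[str]]:
        # Depth-first enumeration of simple paths with at most max_depth edges.
        if node == end:
            yield list(path)                       # copy: path is mutated later
            return
        if len(path) - 1 >= max_depth:
            return
        for nxt in self._adj.get(node, []):
            if nxt not in path:                    # keep paths cycle-free
                path.append(nxt)
                yield from self._paths(nxt, end, max_depth, path)
                path.pop()

    def find_relations(self, start: str, end: str, max_depth: int) -> int:
        """Register a lazy search and return its task id (no work done yet)."""
        task_id = next(self._ids)
        self._tasks[task_id] = self._paths(start, end, max_depth, [start])
        return task_id

    def get_more_relations(self, task_id: int, limit: int = 10) -> List[List[str]]:
        """Pull up to `limit` further paths; empty if exhausted or unknown."""
        task = self._tasks.get(task_id)
        if task is None:
            return []
        return list(itertools.islice(task, limit))

    def stop_find_relations(self, task_id: int) -> Dict[str, object]:
        """Discard the task and report its id and status."""
        task = self._tasks.pop(task_id, None)
        if task is not None:
            task.close()                           # release the generator
        return {"taskId": task_id, "status": "stopped"}
```

Handing back a task id instead of the full result lets a front end show the first paths quickly and abandon an expensive search; a server-side version would presumably run the search concurrently and fill a buffer, which this sketch does not attempt.
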
| Project name | Hosting address | Development language |
| --- | --- | --- |
| InteractiveGraphServer-Neo4j adapter | https://github.com/grapheco/InteractiveGraph-neo4j | Scala |
| InteractiveGraphServer-RDF adapter | https://github.com/grapheco/InteractiveGraph-RDF | Java, Scala |
| AppFrame and applications | https://github.com/grapheco/InteractiveGraph | TypeScript |
