Combining Vertex-centric Graph Processing with SPARQL for Large-scale RDF Data Analytics

Ibrahim Abdelaziz, Mohammad Razen Al-Harbi, Semih Salihoglu, Panos Kalnis

Research output: Contribution to journal > Article > peer-review

14 Scopus citations

Abstract

Modern applications, such as drug repositioning, require sophisticated analytics on RDF graphs that combine structural queries with generic graph computations. Existing systems support either declarative SPARQL queries or generic graph processing, but not both. We bridge the gap by introducing Spartex, a versatile framework for complex RDF analytics. Spartex extends SPARQL to support programs that seamlessly combine generic graph algorithms (e.g., PageRank, Shortest Paths) with SPARQL queries. Spartex builds on existing vertex-centric graph processing frameworks, such as GraphLab or Pregel. It implements a generic SPARQL operator as a vertex-centric program that interprets SPARQL queries and executes them efficiently using a built-in optimizer. In addition, any graph algorithm implemented in the underlying vertex-centric framework can be executed in Spartex. We present various scenarios where our framework significantly simplifies the implementation of complex RDF data analytics programs. We demonstrate that Spartex scales to datasets with billions of edges, and show that our core SPARQL engine is at least as fast as state-of-the-art specialized RDF engines. For complex analytical tasks that combine generic graph processing with SPARQL, Spartex is at least an order of magnitude faster than existing alternatives.
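To make the combination described in the abstract concrete, the following is a minimal sketch in plain Python, not Spartex's actual API or query syntax. It assumes a toy triple list, a hypothetical `match_predicate` helper standing in for a single SPARQL triple pattern, and a hypothetical `pagerank_vertex_centric` function that mimics Pregel-style supersteps (each vertex sends messages along its out-edges, then updates its value from its inbox). All data and names are illustrative assumptions.

```python
from collections import defaultdict

# Toy RDF triples (subject, predicate, object); all identifiers are made up.
TRIPLES = [
    ("drugA", "targets", "gene1"),
    ("drugB", "targets", "gene1"),
    ("drugB", "targets", "gene2"),
    ("gene1", "interactsWith", "gene2"),
    ("gene2", "interactsWith", "gene1"),
    ("gene2", "interactsWith", "gene3"),
    ("gene3", "interactsWith", "gene1"),
]


def match_predicate(triples, predicate):
    """Structural step: keep edges matching the pattern ?s <predicate> ?o."""
    return [(s, o) for s, p, o in triples if p == predicate]


def pagerank_vertex_centric(edges, supersteps=100, damping=0.85):
    """Pregel-style PageRank over a list of (source, target) edges.

    In each superstep every vertex sends rank / out_degree to each
    out-neighbour, then every vertex recomputes its rank from its inbox.
    Dangling vertices simply send nothing in this simplified sketch.
    """
    out_edges = defaultdict(list)
    vertices = set()
    for s, o in edges:
        out_edges[s].append(o)
        vertices.update((s, o))

    n = len(vertices)
    rank = {v: 1.0 / n for v in vertices}

    for _ in range(supersteps):
        inbox = defaultdict(float)
        # Message-passing phase: each vertex acts only on local state.
        for v in vertices:
            targets = out_edges.get(v, [])
            if targets:
                share = rank[v] / len(targets)
                for w in targets:
                    inbox[w] += share
        # Update phase: combine received messages into the new rank.
        rank = {v: (1 - damping) / n + damping * inbox[v] for v in vertices}
    return rank


# Structural query first, then a generic graph algorithm on its result.
gene_edges = match_predicate(TRIPLES, "interactsWith")
gene_rank = pagerank_vertex_centric(gene_edges)
```

Here the structural selection and the graph algorithm run as two separate steps in ordinary Python. The abstract's point is that Spartex expresses both inside one extended SPARQL program and executes them on the same vertex-centric engine.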
Original language: English (US)
Pages (from-to): 3374-3388
Number of pages: 15
Journal: IEEE Transactions on Parallel and Distributed Systems
Volume: 28
Issue number: 12
State: Published - Jun 27 2017
