Scale-Out Processing of Large RDF Datasets

Long Cheng, Spyros Kotoulas
IEEE Transactions on Big Data, 1(4):138-150, December 2015
  • Abstract
    Distributed RDF data management systems are becoming increasingly important with the growth of the Semantic Web. However, current methods hit performance bottlenecks in either data loading or querying when processing large amounts of data. In this work, we propose efficient methods for processing RDF using dynamic data re-partitioning to enable rapid analysis of large datasets. Our approach adopts a two-tier index architecture on each computation node: (1) a lightweight primary index, to keep loading times low, and (2) a series of dynamic, multi-level secondary indexes, calculated as a by-product of query execution, to reduce or eliminate inter-machine data movement for subsequent queries that contain the same graph patterns. In addition, we propose methods to replace some secondary indexes with distributed filters, so as to decrease memory consumption. Experimental results on a commodity cluster with 16 nodes show that the method exhibits good scale-out characteristics and can vastly improve loading speeds while remaining competitive in query performance. Specifically, our approach can load a dataset of 1.1 billion triples at a rate of 2.48 million triples per second and provide performance competitive with RDF-3X and 4store for expensive queries.
  • Project: DIAMOND, HAEC B08
  • Research Group: Knowledge-Based Systems
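
The two-tier index idea in the abstract can be illustrated with a toy, single-process sketch. This is not the authors' implementation: the `Cluster` class, the subject-hash placement, the `(predicate, key position)` naming of secondary indexes, and the sample triples are all assumptions made for illustration, and the distributed filters mentioned in the abstract are not modeled. The sketch only shows how an index built as a by-product of one join could let a later query with the same graph pattern run without further data movement.

```python
import zlib
from collections import defaultdict


class Cluster:
    """Toy stand-in for a cluster of computation nodes (hypothetical design)."""

    def __init__(self, num_nodes):
        self.num_nodes = num_nodes
        # Primary index: per node, predicate -> list of (subject, object).
        # Cheap to build, so loading stays fast.
        self.primary = [defaultdict(list) for _ in range(num_nodes)]
        # Secondary indexes: per node, (predicate, key position) -> key -> pairs.
        self.secondary = [dict() for _ in range(num_nodes)]
        self.moved = 0  # number of triples shipped between nodes so far

    def _owner(self, term):
        # Deterministic hash so runs are repeatable.
        return zlib.crc32(term.encode("utf-8")) % self.num_nodes

    def load(self, triples):
        # Assumption: triples are placed by hashing the subject.
        for s, p, o in triples:
            self.primary[self._owner(s)][p].append((s, o))

    def repartition(self, predicate, pos):
        """Build an index on `predicate` keyed by subject (pos=0) or object (pos=1)."""
        name = (predicate, pos)
        if all(name in node for node in self.secondary):
            return  # already built by an earlier query: nothing moves
        for node in self.secondary:
            node[name] = defaultdict(list)
        for src in range(self.num_nodes):
            for pair in self.primary[src].get(predicate, []):
                dst = self._owner(pair[pos])
                if dst != src:
                    self.moved += 1
                self.secondary[dst][name][pair[pos]].append(pair)

    def join(self, p1, p2):
        """Evaluate (?x p1 ?y) . (?y p2 ?z) with node-local joins on ?y."""
        self.repartition(p1, 1)
        self.repartition(p2, 0)
        results = []
        for node in self.secondary:
            left, right = node[(p1, 1)], node[(p2, 0)]
            for y in left.keys() & right.keys():
                results += [(x, y, z) for x, _ in left[y] for _, z in right[y]]
        return sorted(results)


TRIPLES = [
    ("alice", "knows", "bob"),
    ("carol", "knows", "bob"),
    ("dave", "knows", "erin"),
    ("bob", "worksAt", "acme"),
    ("erin", "worksAt", "initech"),
]

cluster = Cluster(num_nodes=4)
cluster.load(TRIPLES)
first = cluster.join("knows", "worksAt")
moved_after_first = cluster.moved
second = cluster.join("knows", "worksAt")  # reuses the secondary indexes
```

In this sketch the second call to `join` returns the same bindings as the first while `cluster.moved` stays unchanged, which mirrors the claim that secondary indexes remove data movement for repeated graph patterns.
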
@article{CK2015,
  author    = {Long Cheng and Spyros Kotoulas},
  title     = {Scale-Out Processing of Large {RDF} Datasets},
  journal   = {IEEE Transactions on Big Data},
  volume    = {1},
  number    = {4},
  publisher = {IEEE},
  year      = {2015},
  month     = {December},
  pages     = {138--150}
}