Robust and Skew-resistant Parallel Joins in Shared-nothing Systems

Long Cheng, Spyros Kotoulas, Tomas E. Ward, Georgios Theodoropoulos
Robust and Skew-resistant Parallel Joins in Shared-nothing Systems
Proc. 23rd ACM International Conference on Information and Knowledge Management (CIKM'14), 1399-1408, November 2014. ACM
  • Abstract
    The performance of joins in parallel database management systems is critical for data-intensive operations such as querying. Since data skew is common in many applications, poorly engineered join operations result in load imbalance and performance bottlenecks. State-of-the-art methods designed to handle this problem offer significant improvements over naive implementations. However, performance could be further improved by removing the dependency on global skew knowledge and broadcasting. In this paper, we propose PRPQ (partial redistribution & partial query), an efficient and robust join algorithm for processing large-scale joins over distributed systems. We present the detailed implementation and a quantitative evaluation of our method. The experimental results demonstrate that the proposed PRPQ algorithm is indeed robust and scalable under a wide range of skew conditions. Specifically, compared to the state-of-the-art PRPD method, we achieve 16% - 167% performance improvement and 24% - 54% less network communication under different join workloads.
  • Further Information: Link
  • Research Group: Knowledge-Based Systems
@inproceedings{CKWT2014,
  author    = {Long Cheng and Spyros Kotoulas and Tomas E. Ward and Georgios
               Theodoropoulos},
  title     = {Robust and Skew-resistant Parallel Joins in Shared-nothing
               Systems},
  booktitle = {Proc. 23rd {ACM} International Conference on Information and
               Knowledge Management (CIKM'14)},
  publisher = {ACM},
  year      = {2014},
  month     = {November},
  pages     = {1399--1408},
  doi       = {10.1145/2661829.2661888}
}