Framework for the Specification and Execution of parallel Clustering Algorithms

From International Center for Computational Logic

Talk by Alexander Krause
As sensor networks, logging, and other sources produce ever more data, the amount of data to be analyzed keeps growing. Data mining, and cluster analysis in particular, is used for this purpose. Cluster analysis aims to identify elements of a data set whose properties are similar to, or even match, those of other elements of the same data set. To keep the analysis process as flexible as possible, several execution environments are needed. Unfortunately, an implementation of an algorithm usually depends on a specific execution environment. This work evaluates an approach that uses the MapReduce model to specify platform-independent clustering algorithms and that exploits the parallelism of the underlying hardware. To this end, a specification language for platform-independent specification is presented, along with how such a specification can be transformed into platform-dependent code for a targeted execution environment. The functionality of this approach is demonstrated by concrete examples based on MapReduce and OpenCL implementations.
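
To illustrate how a clustering step can be phrased in the MapReduce model, the following is a minimal sketch in plain Python. It is not the specification language presented in the talk. It assumes k-means as the clustering algorithm (the abstract does not name one), and the function names `map_point`, `reduce_cluster`, and `kmeans_iteration` are hypothetical names chosen for this illustration.

```python
from collections import defaultdict
from math import dist  # Euclidean distance, Python 3.8+


def map_point(point, centroids):
    """Map step (hypothetical name): emit (index of nearest centroid, point)."""
    nearest = min(range(len(centroids)), key=lambda i: dist(point, centroids[i]))
    return nearest, point


def reduce_cluster(points):
    """Reduce step (hypothetical name): average all points of one cluster."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def kmeans_iteration(points, centroids):
    """One k-means iteration (assumed algorithm) as map, shuffle, reduce."""
    # Shuffle phase: group the mapped points by centroid index.
    groups = defaultdict(list)
    for key, value in (map_point(p, centroids) for p in points):
        groups[key].append(value)
    # Centroids that received no points are kept unchanged.
    return [
        reduce_cluster(groups[i]) if i in groups else centroids[i]
        for i in range(len(centroids))
    ]


points = [(0.0, 0.0), (0.0, 2.0), (10.0, 10.0), (12.0, 10.0)]
centroids = [(1.0, 1.0), (9.0, 9.0)]
new_centroids = kmeans_iteration(points, centroids)
```

Each call to `map_point` depends only on one point and the current centroids, and each call to `reduce_cluster` depends only on one group. This independence is what would let an execution environment, such as a MapReduce cluster or an OpenCL device, run these calls in parallel.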