Milko Marinov
Department of Computer Systems and Technologies, Faculty of Electrical Engineering, Electronics and Automation, University of Ruse
Bulgaria
e-mail: mmarinov@ecs.uni-ruse.bg
Abstract:
MapReduce is a widely used programming model for processing big data. A Bloom filter is a space-efficient probabilistic data structure that answers set-membership queries quickly, at the cost of possible false positives. This article discusses the main characteristics of Bloom filters, techniques for their realisation, and scenarios for their use when big data is analysed and processed. The study also presents an implementation of a Bloom filter in a distributed MapReduce framework.
Key words:
big datasets
Bloom filter
MapReduce framework
Section: