A Bloom Filter Application for Processing Big Datasets through MapReduce Framework

Milko Marinov

Department of Computer Systems and Technologies, Faculty of Electrical Engineering, Electronics and Automation, University of Ruse

Bulgaria

e-mail: mmarinov@ecs.uni-ruse.bg

Abstract:

MapReduce is a widely used programming model for processing big data. Bloom filters are space-efficient probabilistic data structures that quickly answer whether an element is a member of a set, at the cost of occasional false positive results. This article discusses the main characteristics of Bloom filters, techniques for implementing them, and scenarios for using them when big data is analysed and processed. The study also presents an implementation of a Bloom filter in a distributed MapReduce framework.
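The membership test described in the abstract can be illustrated with a minimal sketch. It is not the implementation presented in the paper: the class name BloomFilter, the parameters m (number of bits) and k (number of hash functions), the SHA-256 double-hashing scheme, and the helper map_filter are all assumptions chosen for illustration. The sketch shows the usual behaviour of a standard Bloom filter, where an added element is always reported as possibly present while an element that was never added may occasionally be reported as present too, and how a map step might use the filter to discard records early.

```python
import hashlib


class BloomFilter:
    """Illustrative Bloom filter with m bits and k hash functions (hypothetical names)."""

    def __init__(self, m: int, k: int) -> None:
        self.m = m
        self.k = k
        self.bits = bytearray((m + 7) // 8)  # bit array, all zeros initially

    def _positions(self, item: str):
        # Assumption: derive k positions by double hashing two halves of a SHA-256 digest.
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # force an odd step
        for i in range(self.k):
            yield (h1 + i * h2) % self.m

    def add(self, item: str) -> None:
        # Set every bit selected by the k hash positions.
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item: str) -> bool:
        # True means "possibly in the set"; False means "definitely not in the set".
        return all(
            self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item)
        )


def map_filter(records, bloom):
    """Hypothetical map-side step: emit only records whose key may be in the set."""
    for key, value in records:
        if bloom.might_contain(key):
            yield key, value


# Build the filter from a small key set, then filter a stream of (key, value) records.
bloom = BloomFilter(m=1024, k=3)
for key in ("alice", "bob"):
    bloom.add(key)

records = [("alice", 1), ("carol", 2), ("bob", 3)]
kept = list(map_filter(records, bloom))
```

In a distributed setting the filter would typically be built once and shipped to every mapper, so records that cannot match are dropped before the shuffle phase. Records that pass only because of a false positive still have to be eliminated by an exact check later.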

Key words:

big datasets

Bloom filter

MapReduce framework

Section:

Information Technologies

Topics:

Digital Signal Processing

Software Technologies and Programming

Presentation

InfoTech conference
