InfoTech conference

2022 IEEE International Conference on Information Technologies

A Bloom Filter Application for Processing Big Datasets through MapReduce Framework

Milko Marinov
Department of Computer Systems and Technologies, Faculty of Electrical Engineering, Electronics and Automation, University of Ruse

MapReduce is a widely used programming model for processing big data. Bloom filters are space-efficient probabilistic data structures that quickly answer whether an element is a member of a set, at the cost of occasional false positive results. This article discusses the main characteristics, implementation techniques and usage scenarios of Bloom filters in the analysis and processing of big data. It also presents an implementation of a Bloom filter in a distributed MapReduce framework.

Key words:
big datasets
Bloom filter
MapReduce framework
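
To make the membership test described in the abstract concrete, the following is a minimal Python sketch of a Bloom filter and of a map-side filtering step. It is not the implementation presented in the paper: the class `BloomFilter`, the function `filtering_mapper`, the SHA-256 double-hashing scheme and the record layout of `(key, value)` pairs are all assumptions made for illustration. The sizing uses the standard formulas m = -n ln p / (ln 2)^2 and k = (m / n) ln 2.

```python
import hashlib
import math


class BloomFilter:
    """Illustrative Bloom filter: a bit array of m bits probed by k hash positions."""

    def __init__(self, expected_items: int, false_positive_rate: float) -> None:
        # Standard sizing: m = -n ln p / (ln 2)^2 bits, k = (m / n) ln 2 hashes.
        n, p = expected_items, false_positive_rate
        self.m = max(1, math.ceil(-n * math.log(p) / math.log(2) ** 2))
        self.k = max(1, round(self.m / n * math.log(2)))
        self.bits = bytearray((self.m + 7) // 8)

    def _positions(self, item: str):
        # Assumed scheme: double hashing over two halves of one SHA-256 digest.
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # odd step, never zero
        for i in range(self.k):
            yield (h1 + i * h2) % self.m

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item: str) -> bool:
        # True means "possibly in the set"; False means "definitely not".
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))


def filtering_mapper(records, bloom):
    """Hypothetical map step: emit only pairs whose key may belong to the set."""
    for key, value in records:
        if key in bloom:
            yield key, value
```

In a MapReduce job, a filter like this would typically be built once from the smaller dataset and distributed to every mapper, so that records whose keys are definitely absent are dropped before the shuffle phase. Any false positives that pass the mapper still have to be eliminated by an exact check later in the job.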