pybloomer is a Python 3 compatible fork of pybloomfiltermmap by @axiak.
The goal of pybloomer
is simple: to provide a fast, simple, scalable, correct library for Bloom filters in Python.
This module implements a Bloom filter in Cython (ANSI C) that's fast and uses memory-mapped files for better scalability.
There are a couple reasons to use this module:
- It natively uses mmaped files.
- It is fast (see benchmarks).
- It natively does the set things you want a Bloom filter to do.
To install pybloomer
, use the Python 3 version of pip:
$ pip install pybloomer
Here’s a quick example:
>>> import pybloomer
>>> fruits = pybloomer.BloomFilter(capacity=10000000, error_rate=0.01, filename='/tmp/fruits.bloom')
>>> fruits.update(('apple', 'pear', 'orange', 'apple'))
>>> len(fruits)
3
>>> 'mike' in fruits
False
>>> 'orange' in fruits
True
To create an in-memory filter, simply omit the file location:
cake_ingredients = pybloomer.BloomFilter(capacity=1000, error_rate=0.1)
Caveat: in-memory filters cannot be persisted to disk.
Current docs are available at pybloomer.rtfd.io.
Suggestions, bug reports, and / or patches are welcome!
When contributing, you should set up an appropriate Python 3 environment and install the dependencies listed
in requirements-dev.txt
.
Package installation depends on a generated pybloomer.c
file, which requires Cython module to be in your current
environment.
See the LICENSE file. It's under the MIT License.