Cuckoo Filter Overview

Cuckoo Filter is a high-performance, space-efficient probabilistic data structure used for approximated set-membership queries. It serves a similar purpose to a Bloom filter, allowing you to ask, "Is item x in my set?" with a certain degree of uncertainty.

While Bloom filters are highly space-efficient, they have a significant limitation: they do not support the deletion of items. Standard workarounds, like counting Bloom filters, often require much more space.

Cuckoo filters solve this problem by providing the flexibility to add and remove items dynamically. Based on the principles of cuckoo hashing, a cuckoo filter is essentially a compact cuckoo hash table that stores a unique "fingerprint" for each key. This design allows cuckoo filters to be highly compact, often using less space than conventional Bloom filters for applications that require low false positive rates (e.g., < 3%).

This library provides a C++ implementation of the Cuckoo Filter algorithm.

Key Features

  • Dynamic Item Management: Supports adding and deleting items after initial construction.
  • High Space Efficiency: Often requires less memory than Bloom filters for the same false positive rate.
  • Fast Lookups: Contain operations are very fast.
  • Tunable False Positive Rate: The space-to-FPR ratio can be configured by adjusting the number of bits per item.

Original Paper

For a detailed explanation of the algorithm, performance characteristics, and formal proofs, please refer to the original paper:

"Cuckoo Filter: Practically Better Than Bloom" by Bin Fan, Dave Andersen, and Michael Kaminsky, published in proceedings of ACM CoNEXT 2014.

License

This project is licensed under the Apache License, Version 2.0. You can find the full license text in the repository.