Quick Start Guide
This guide will walk you through a minimal example of creating a Cuckoo Filter, adding items to it, and checking for their existence.
The Goal
We will create a filter with a capacity for 1 million items, insert numbers from 0 to 999,999, and then verify their presence. We will also check for non-existent items to measure the filter's false positive rate.
The full source code for this example is available in example/test.cc
.
Step-by-Step
1. Include the Header
First, include the main Cuckoo Filter header file in your C++ source.
#include "cuckoofilter.h"
#include <iostream>
#include <assert.h>
using cuckoofilter::CuckooFilter;
2. Instantiate the Filter
Next, create an instance of the CuckooFilter
. The filter is a template class that requires two main parameters:
ItemType
: The data type of the items you will store (e.g.,size_t
,uint64_t
,std::string
).bits_per_item
: The number of bits to use for each item's fingerprint. This value directly impacts the filter's memory usage and false positive rate. 12 bits is a common choice.
The constructor takes the maximum number of keys you expect to store.
size_t total_items = 1000000;
// Create a cuckoo filter for items of type size_t,
// using 12 bits for each item's fingerprint.
CuckooFilter<size_t, 12> filter(total_items);
3. Add Items
Use the Add()
method to insert items into the filter. It's good practice to check the return status, as an insertion can fail if the filter runs out of space.
size_t num_inserted = 0;
for (size_t i = 0; i < total_items; i++, num_inserted++) {
if (filter.Add(i) != cuckoofilter::Ok) {
std::cout << "Failed to insert item " << i << std::endl;
break;
}
}
4. Check for Existing Items
Use the Contain()
method to check if an item is in the filter. This method should always return cuckoofilter::Ok
for items that were successfully inserted.
// Check if all previously inserted items are in the filter.
for (size_t i = 0; i < num_inserted; i++) {
assert(filter.Contain(i) == cuckoofilter::Ok);
}
5. Measure False Positive Rate
A key characteristic of a Cuckoo Filter is its false positive rate. To measure this, we query for items that we know were never inserted. Some of these queries may incorrectly return Ok
—these are the false positives.
size_t total_queries = 0;
size_t false_queries = 0;
// Query for items that were never inserted.
for (size_t i = total_items; i < 2 * total_items; i++) {
if (filter.Contain(i) == cuckoofilter::Ok) {
false_queries++;
}
total_queries++;
}
// Output the measured false positive rate.
std::cout << "False positive rate is "
<< 100.0 * false_queries / total_queries << "%\n";
Compile and Run
You can compile and run this example using the provided Makefile
:
# Build the executable 'test'
make test
# Run it
./test
Expected Output
You should see an output similar to this, indicating a low false positive rate:
false positive rate is 0.1705%