Performance and Benchmarks

The repository includes a suite of benchmarks in the benchmarks/ directory to measure the performance, space efficiency, and false positive rate of the Cuckoo Filter and related data structures.

Building the Benchmarks

To build all benchmark executables, navigate to the benchmarks directory and run make.

cd benchmarks
make

This will produce several executables, including bulk-insert-and-query.exe, conext-figure5.exe, and conext-table3.exe.

Available Benchmarks

Bulk Insert and Query

Executable: bulk-insert-and-query.exe

This is the most comprehensive benchmark. It reports on bulk insertion and query rates for various filter configurations.

How to Run:

The benchmark takes one argument: the number of randomly generated items to insert.

# Run with 1,580,000 items
./bulk-insert-and-query.exe 1580000

Example Output:

The output is a table comparing different filter types:

CuckooX: Standard Cuckoo Filter with X bits per item.
SemiSortX: Cuckoo Filter with PackedTable (semi-sorting) and X bits per item.
SimdBlock8: A high-performance, SIMD-accelerated block Bloom filter.

                  Million    Find    Find    Find    Find    Find                       optimal  wasted
                 adds/sec      0%     25%     50%     75%    100%       ε  bits/item  bits/item   space
     Cuckoo12       23.78   37.24   35.04   37.17   37.35   36.35  0.131%      18.30       9.58   91.1%
   SemiSort13       11.63   17.55   17.08   17.14   17.54   22.32  0.064%      18.30      10.62   72.4%
      Cuckoo8       35.31   49.32   50.24   49.98   48.32   50.49  2.044%      12.20       5.61  117.4%
    SemiSort9       13.99   22.23   22.78   22.13   23.16   24.06  1.207%      12.20       6.37   91.5%
     Cuckoo16       27.06   36.94   37.12   35.31   36.81   35.10  0.009%      24.40      13.46   81.4%
   SemiSort17       10.37   15.70   15.84   15.78   15.55   15.93  0.004%      24.40      14.72   65.8%
   SimdBlock8       74.22   72.34   74.23   74.34   74.69   74.32  0.508%      12.20       7.62   60.1%

adds/sec: Insertion throughput in millions of items per second.
Find X%: Lookup throughput where X% of queries are for items present in the filter.
ε: Measured false positive rate.
bits/item: Actual bits of memory used per inserted item.
optimal bits/item: Theoretical minimum bits per item for the measured FPR.
wasted space: Percentage of space used beyond the theoretical optimum.

CoNEXT Paper Reproductions

These benchmarks are designed to reproduce specific results from the original CoNEXT 2014 paper.

conext-table3.exe: Reproduces results from "Table 3: Space efficiency and construction speed." It measures the number of items inserted until failure, bits per item, false positive rate, and construction speed.
conext-figure5.exe: Reproduces results from "Figure 5: Lookup performance when a filter achieves its capacity." It measures lookup throughput with varying percentages of queries on existing items.