Bifurcan: Impure Functional Data Structures
Bifurcan is a Java library providing high-quality implementations of both mutable and immutable data structures. These structures share a common API and are built on a set of core design principles:
- Efficient Random Access: Quickly access any element by its index.
- Efficient Inverted Indices: All collections (except lists) provide an
indexOf
method for fast lookups. - Efficient Splitting and Merging: Designed for parallel processing by allowing collections to be easily broken apart and recombined.
- Customizable Equality Semantics: Define your own hashing and equality logic for keys and values.
- Contiguous Memory Usage: Wherever possible, memory is laid out contiguously to improve iteration performance.
- High Performance: Aims to be equivalent to, or better than, existing alternatives on the JVM.
Rather than using the standard java.util
interfaces, Bifurcan provides its own functional interfaces: IList
, IMap
, and ISet
. Each update operation on a collection returns a new collection instance, preserving the original. For interoperability, every Bifurcan collection can be coerced to a read-only version of its standard Java counterpart via toList()
, toMap()
, or toSet()
.
Key Features
A Rich Collection of Data Structures
Bifurcan offers a comprehensive set of collections to cover a wide range of use cases:
- Lists:
List
(immutable) andLinearList
(mutable) implementing a relaxed-radix tree structure for near-constant time slices and concatenation. - Hash Maps:
Map
(immutable) andLinearMap
(mutable) with customizable equality and hashing.LinearMap
can be up to 20x faster for iteration thanjava.util.HashMap
. - Hash Sets:
Set
(immutable) andLinearSet
(mutable) built upon their map counterparts. - Sorted Maps:
SortedMap
(immutable red-black tree),IntMap
(a specialized sorted map for integer keys), andFloatMap
. - Graphs: Immutable implementations for
Graph
,DirectedGraph
, andDirectedAcyclicGraph
. - Rope: An immutable, tree-based sequence of Unicode characters with efficient indexing and UTF-8 encoding.
Core Concepts
Bifurcan is built around a few powerful ideas that enable its performance and flexibility.
-
Linear vs. Forked Collections: Bifurcan introduces a novel approach to mutability. Collections can be temporarily made mutable (
linear
) for efficient batch updates within a single scope, and then made immutable (forked
) for safe sharing. This offers the performance of mutable data structures with the safety of immutable ones. -
Virtual Collections: You can create collections that are not backed by in-memory data but by functions. This allows for the creation of massive, logically-defined collections with minimal memory footprint.
-
Splitting and Merging: The library is designed to facilitate parallel processing. Any collection can be split into smaller, independent pieces that can be processed on multiple cores and then efficiently merged back together.
Why Bifurcan?
While many functional data structure libraries exist for the JVM, Bifurcan stands out due to its unique features and relentless focus on performance. As shown in the detailed performance comparison, Bifurcan is equivalent to the best existing implementations for basic operations and significantly outperforms them in batch operations like union
, intersection
, and difference
.
This performance gain comes from complex, optimized algorithms that are validated by extensive generative testing, ensuring both speed and correctness.