Core Concepts: Splitting and Merging for Parallelism

Many modern data structure libraries provide "parallel" collections that bundle an execution model (like a thread pool) with the data structure itself. While convenient, this approach tightly couples data and execution, limiting flexibility.

Bifurcan takes a different approach by separating these concerns. It provides the fundamental tools for parallelism—efficient splitting and merging—but leaves the execution model up to you.

The `split()` Method

Every Bifurcan collection implements the split(k) method. This method breaks the collection into approximately k smaller, independent sub-collections.

import io.lacuna.bifurcan.IList;
import io.lacuna.bifurcan.List;

// Create a large list
IList<Long> largeList = List.from(1_000_000, i -> i);

// Split it into roughly 8 parts for processing
IList<? extends IList<Long>> parts = largeList.split(8);

System.out.println("Number of parts: " + parts.size());

Because of Bifurcan's underlying data structures (like Relaxed Radix Balanced Trees for List), this operation is extremely fast, often near-constant time.

Processing and Merging

Once a collection is split, you can process each part in parallel using any execution framework you choose (e.g., java.util.concurrent.ExecutorService, Project Loom virtual threads, etc.).

After processing, the results can be efficiently recombined using methods like:

concat(IList) for lists
merge(IMap, mergeFn), union(IMap), intersection(IMap), and difference(IMap) for maps
union(ISet), intersection(ISet), and difference(ISet) for sets

Example Workflow

Here is a conceptual example of a parallel map-reduce operation using Bifurcan's tools:

import io.lacuna.bifurcan.*;
import java.util.concurrent.*;
import java.util.stream.Collectors;

// 1. Create a large data set
IMap<String, Long> data = new Map<String, Long>().linear();
for (long i = 0; i < 1_000_000; i++) {
    data.put("key_" + i, i);
}
data = data.forked();

// 2. Split the data for parallel processing
IList<? extends IMap<String, Long>> parts = data.split(8);
ExecutorService executor = Executors.newFixedThreadPool(8);

// 3. Process each part in parallel (e.g., sum the values)
Callable<Long> processPart = (IMap<String, Long> part) -> {
    return part.values().stream().reduce(0L, Long::sum);
};

List<Future<Long>> futures = parts.stream()
    .map(part -> executor.submit(() -> processPart.call(part)))
    .collect(Collectors.toList());

// 4. Merge the results
long totalSum = 0;
for (Future<Long> future : futures) {
    totalSum += future.get();
}

executor.shutdown();

By providing efficient, fundamental operations for partitioning and recombination, Bifurcan gives you the flexibility to build complex parallel workflows without being tied to a specific execution model.

Core Concepts: Splitting and Merging for Parallelism

The split() Method

Processing and Merging

Example Workflow

The `split()` Method