Collections: Rope

A Rope is an immutable, tree-based data structure for representing sequences of characters, similar to a string. It is designed for efficient manipulations of very long strings.

Unlike Java's String, which uses UTF-16 internally, Bifurcan's Rope uses UTF-8 encoding at the leaves of its tree structure. This can lead to more compact storage for text that is primarily ASCII.

Features

  • Efficient Modifications: Operations like insertion, deletion, and concatenation are much faster than with standard Java Strings, which require creating new character arrays for every change. For a Rope, these operations typically involve creating a few small objects and reusing large portions of the existing structure.
  • Dual Indexing: A Rope can be efficiently indexed by both full Unicode code points and Java's UTF-16 code units.
  • CharSequence Compatibility: A Rope can be converted to a java.lang.CharSequence in constant time via toCharSequence(), allowing for easy integration with existing Java APIs.
  • Immutable: Ropes are fully immutable and thread-safe.

Basic Usage

import io.lacuna.bifurcan.Rope;

// Create a rope
Rope r1 = Rope.from("Hello, ");
Rope r2 = Rope.from("world!");

// Concatenate ropes efficiently
Rope r3 = r1.concat(r2);
System.out.println(r3); // "Hello, world!"

// Insert text
Rope r4 = r3.insert(7, "beautiful ");
System.out.println(r4); // "Hello, beautiful world!"

// Remove text
Rope r5 = r4.remove(7, 17); // remove "beautiful "
System.out.println(r5); // "Hello, world!"

// Slice a substring
Rope slice = r3.slice(0, 5);
System.out.println(slice); // "Hello"

When to Use a Rope

A Rope is most beneficial when you are performing many modifications on large strings, such as in a text editor, a compiler, or a document processing system. For simple, static strings, a standard Java String is usually sufficient.