API Overview

The API is split into two layers:

  1. High-Level API: Functions for common tasks like rotating pages, extracting text, or getting metadata. These handle the complexity of the PDF tree structure for you.
  2. Low-Level API: Functions that manipulate individual PDF objects (dictionaries, streams) directly. Use these for advanced modifications that the high-level functions do not cover.

Common Patterns

Most functions take a Doc object as their first argument and return a new Doc object. The same operation can be written as a function call or as a method call:

import pdfsyntax as pdf

doc = pdf.readfile("in.pdf")

# Functional style: pass the Doc as the first argument
rotated = pdf.rotate(doc, 90)

# Method style (equivalent): same operation, starting again from doc
rotated = doc.rotate(90)

The Doc Object

The Doc object is a named tuple containing the document state. You can generally treat it as a black box: it exposes methods that match the high-level API functions, so you rarely need to read its fields directly.
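
To illustrate why a named tuple fits the "function in, new Doc out" pattern, here is a minimal sketch that does not use pdfsyntax at all. ToyDoc, its rotations field, and this rotate function are hypothetical stand-ins invented for the example; the real Doc has its own fields and implementation.

```python
from typing import NamedTuple, Tuple


class ToyDoc(NamedTuple):
    """Hypothetical stand-in for a document state (NOT pdfsyntax's Doc)."""

    rotations: Tuple[int, ...]  # assumed field: rotation of each page, in degrees

    def rotate(self, degrees: int) -> "ToyDoc":
        # Method style simply delegates to the module-level function.
        return rotate(self, degrees)


def rotate(doc: ToyDoc, degrees: int) -> ToyDoc:
    """Return a new ToyDoc with every page rotated; the input is left unchanged."""
    new_rotations = tuple((r + degrees) % 360 for r in doc.rotations)
    # _replace builds a new tuple rather than mutating the existing one.
    return doc._replace(rotations=new_rotations)


original = ToyDoc(rotations=(0, 90))
by_function = rotate(original, 90)  # functional style
by_method = original.rotate(90)     # method style
```

Because named tuples are immutable, original still holds its initial state after both calls, and the two styles produce equal results. Under this assumption, keeping an earlier Doc around costs nothing extra and gives you a simple way to compare or roll back changes.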