CLI Usage

PDFSyntax installs a command-line tool, pdfsyntax, that lets you inspect PDF files and extract data from them without writing Python code.

Syntax

pdfsyntax COMMAND FILE [OPTIONS]

Or run it as a module:

python3 -m pdfsyntax COMMAND FILE [OPTIONS]
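
If you need to call the tool from a script, the module form above can be wrapped with the standard library's subprocess module. This is only a minimal sketch: build_command and run_pdfsyntax are hypothetical helper names, not part of PDFSyntax, and no specific OPTIONS values are assumed (extra arguments are passed through unchanged).

```python
import subprocess
import sys


def build_command(command, pdf_path, *options):
    """Return the argv list for `python -m pdfsyntax COMMAND FILE [OPTIONS]`.

    Hypothetical helper: it only mirrors the syntax line shown above.
    """
    # sys.executable keeps the call inside the current interpreter/virtualenv.
    return [sys.executable, "-m", "pdfsyntax", command, str(pdf_path), *options]


def run_pdfsyntax(command, pdf_path, *options):
    """Run the CLI and return whatever it wrote to standard output."""
    result = subprocess.run(
        build_command(command, pdf_path, *options),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit status
    )
    return result.stdout


# Example call (requires PDFSyntax to be installed):
#     print(run_pdfsyntax("overview", "my_document.pdf"))
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.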

Available Commands

Command     Description
overview    Prints general metadata and structure info.
browse      Generates an interactive HTML visualization of the PDF structure.
disasm      Dumps the internal file structure (objects, offsets) to the terminal.
text        Extracts text from the document with spatial layout preservation.
fonts       Lists all fonts used in the document.
compress    Applies lossless compression to the file.
hexdump     Canonical hex and ASCII dump of the file.

Example: Overview

$ pdfsyntax overview my_document.pdf

# Structure
Version: 1.4
Pages: 10
Revisions: 2
Encrypted: False
...
# Metadata
Title: Annual Report
Author: Jane Doe
...
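
The overview output shown above could be post-processed in a script. The sketch below assumes the output always looks like this example: a "# Section" heading followed by "Key: Value" lines. That layout is inferred from the sample only and is not a documented, stable format. parse_overview is a hypothetical helper name.

```python
def parse_overview(text):
    """Group `Key: Value` lines under the preceding `# Section` heading.

    Assumes the layout shown in the example output; not an official format.
    """
    sections = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("#"):
            # Start a new section, e.g. "# Structure" -> "Structure".
            current = sections.setdefault(line.lstrip("#").strip(), {})
        elif current is not None and ":" in line:
            # Split on the first colon only, so values may contain colons.
            key, _, value = line.partition(":")
            current[key.strip()] = value.strip()
        # Anything else (blank lines, "..." placeholders) is ignored.
    return sections


sample = """\
# Structure
Version: 1.4
Pages: 10
Encrypted: False
# Metadata
Title: Annual Report
"""
info = parse_overview(sample)
print(info["Structure"]["Pages"])  # values stay as strings
```

All values are kept as strings; convert them (for example with int) only where your script needs to.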