CLI Usage
PDFSyntax installs a command-line tool, pdfsyntax, for inspecting and extracting data from PDF files without writing Python code.
Syntax
pdfsyntax COMMAND FILE [OPTIONS]
Or run it as a module:
python3 -m pdfsyntax COMMAND FILE [OPTIONS]
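If you need to call the tool from a script, the module form above can be wrapped with the standard library. The following is a minimal sketch, not part of PDFSyntax itself: `build_command` and `run_pdfsyntax` are hypothetical helper names, and the sketch assumes the CLI writes its result to standard output and exits with a non-zero status on failure. It passes no `[OPTIONS]`, since none are described here.

```python
import subprocess
import sys

# Command names documented in this section.
COMMANDS = {"overview", "browse", "disasm", "text", "fonts", "compress", "hexdump"}


def build_command(command, pdf_path):
    """Return the argument list for `python -m pdfsyntax COMMAND FILE`."""
    if command not in COMMANDS:
        raise ValueError(f"unknown pdfsyntax command: {command!r}")
    # sys.executable keeps the call inside the current interpreter/virtualenv.
    return [sys.executable, "-m", "pdfsyntax", command, str(pdf_path)]


def run_pdfsyntax(command, pdf_path):
    """Run the CLI and return its standard output (hypothetical helper)."""
    result = subprocess.run(
        build_command(command, pdf_path),
        capture_output=True,  # collect stdout/stderr instead of printing them
        text=True,            # decode the output as str
        check=True,           # assumption: failure is signalled by exit status
    )
    return result.stdout
```

For example, `run_pdfsyntax("overview", "my_document.pdf")` would return whatever the overview command prints, provided PDFSyntax is installed in the same environment.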
Available Commands
| Command | Description |
|---|---|
| overview | Prints general metadata and structure info. |
| browse | Generates an interactive HTML visualization of the PDF structure. |
| disasm | Dumps the internal file structure (objects, offsets) to the terminal. |
| text | Extracts text from the document with spatial layout preservation. |
| fonts | Lists all fonts used in the document. |
| compress | Applies lossless compression to the file. |
| hexdump | Canonical hex and ASCII dump of the file. |
Example: Overview
$ pdfsyntax overview my_document.pdf
# Structure
Version: 1.4
Pages: 10
Revisions: 2
Encrypted: False
...
# Metadata
Title: Annual Report
Author: Jane Doe
...
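The sample output above is laid out as `# Section` headers followed by `Key: Value` lines, so it can be turned into a nested dictionary with a few lines of Python. This is only a sketch under that assumption: the layout is inferred from the example, not from a documented output format, and `parse_overview` is a hypothetical helper name.

```python
def parse_overview(text):
    """Group `Key: Value` lines under the most recent `# Section` header."""
    sections = {}
    current = None
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("# "):
            # Start (or reuse) the dictionary for this section.
            current = sections.setdefault(line[2:].strip(), {})
        elif current is not None and ":" in line:
            # Split on the first colon only, so values may contain colons.
            key, _, value = line.partition(":")
            current[key.strip()] = value.strip()
        # Anything else (blank lines, "...") is ignored.
    return sections


# Excerpt of the example output shown above.
sample = """\
# Structure
Version: 1.4
Pages: 10
# Metadata
Title: Annual Report
"""

info = parse_overview(sample)
print(info["Structure"]["Pages"])  # prints 10
```

All values are kept as strings; converting `Pages` to an integer or `Encrypted` to a boolean is left to the caller, since the exact set of fields is not specified here.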