Quick Start Guide

This guide walks through a minimal "hello world" example of fetching congressional data: downloading the raw data for bills and resolutions, then processing it into structured files.

Prerequisites

Ensure you have completed all the steps in the Installation guide and have activated your Python virtual environment.

Step 1: Download Bill Status Bulk Data

The first step is to download the raw legislative data from GovInfo.gov. The govinfo command does this when pointed at the BILLSTATUS bulk data collection.

Run the following command in your terminal:

usc-run govinfo --bulkdata=BILLSTATUS

This command downloads ZIP files containing XML data for all bills in all congresses and stores the XML under the data/ directory, organized by congress number, bill type, and bill. For example, the raw file for H.R. 1 in the 117th Congress is:

data/117/bills/hr/hr1/fdsys_billstatus.xml

It also creates a cache/ directory that stores downloaded pages and metadata, so unchanged files are not downloaded again on later runs.
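To confirm that the download produced files, a short script can count the XML files per bill type. This is a minimal sketch, not part of the project: the function name count_xml_by_bill_type is hypothetical, and it assumes only that files land somewhere under data/<congress>/bills/<bill type>/ as described above.

```python
from collections import Counter
from pathlib import Path


def count_xml_by_bill_type(data_dir="data", congress="117"):
    """Count XML files under data/<congress>/bills/, grouped by bill type.

    Hypothetical helper: the directory layout is assumed from this guide.
    """
    bills_dir = Path(data_dir) / congress / "bills"
    counts = Counter()
    if not bills_dir.is_dir():
        # Nothing downloaded yet for this congress.
        return counts
    for xml_path in bills_dir.rglob("*.xml"):
        parts = xml_path.relative_to(bills_dir).parts
        if len(parts) >= 2:
            # The first component below bills/ is the bill type (e.g. "hr").
            counts[parts[0]] += 1
    return counts


if __name__ == "__main__":
    for bill_type, total in sorted(count_xml_by_bill_type().items()):
        print(f"{bill_type}: {total} XML files")
```

Run it from the directory that contains data/. An empty result suggests the download has not finished or was written somewhere else.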

Step 2: Process the Raw Data into JSON and XML

Now that you have the raw XML files from the government, you can process them into a more developer-friendly format using the bills command.

usc-run bills

The bills command scans the data/ directory for the BILLSTATUS files you just downloaded, parses them, and generates two structured output files for each bill:

data.json: A clean, comprehensive JSON file.
data.xml: A backward-compatible XML file in the GovTrack.us format.
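After the bills command finishes, it can be useful to check that every downloaded bill received both output files. The sketch below is an illustration rather than a project feature: find_incomplete_bills is a hypothetical helper, and it assumes the per-bill layout this guide describes, with fdsys_billstatus.xml, data.json, and data.xml side by side in one directory.

```python
from pathlib import Path

# Output files this guide says the bills command writes for each bill.
EXPECTED_OUTPUTS = ("data.json", "data.xml")


def find_incomplete_bills(data_dir="data"):
    """Map each bill directory to the expected output files it lacks.

    Hypothetical helper: a bill directory is assumed to be any directory
    that contains a raw fdsys_billstatus.xml file.
    """
    root = Path(data_dir)
    incomplete = {}
    if not root.is_dir():
        return incomplete
    for raw_file in sorted(root.rglob("fdsys_billstatus.xml")):
        bill_dir = raw_file.parent
        missing = [
            name for name in EXPECTED_OUTPUTS
            if not (bill_dir / name).is_file()
        ]
        if missing:
            incomplete[bill_dir] = missing
    return incomplete


if __name__ == "__main__":
    for bill_dir, missing in find_incomplete_bills().items():
        print(f"{bill_dir}: missing {', '.join(missing)}")
```

No output means every bill directory that has a raw file also has both processed files.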

Step 3: Inspect the Output

After the bills command completes, your data directory will be populated with the processed files. For example, the data for H.R. 1 in the 117th Congress would be located at:

data/117/bills/hr/hr1/
├── data.json
├── data.xml
└── fdsys_billstatus.xml

data.json contains the structured bill data.
data.xml contains the same data in the legacy GovTrack.us XML format.
fdsys_billstatus.xml is the original raw XML file downloaded in Step 1.
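A quick way to explore a processed bill is to load its data.json and list the top-level fields. The following is a minimal sketch using only the Python standard library. The helper names load_bill and summarize are hypothetical, and the code assumes the file holds a single JSON object. It makes no assumption about which fields are present.

```python
import json
from pathlib import Path


def load_bill(bill_dir):
    """Load data.json from one bill directory (hypothetical helper)."""
    with open(Path(bill_dir) / "data.json", encoding="utf-8") as f:
        return json.load(f)


def summarize(bill):
    """Return each top-level field name mapped to the type name of its value."""
    return {key: type(bill[key]).__name__ for key in sorted(bill)}


if __name__ == "__main__":
    # Example path taken from this guide: H.R. 1 in the 117th Congress.
    bill_dir = Path("data/117/bills/hr/hr1")
    if (bill_dir / "data.json").is_file():
        for field, type_name in summarize(load_bill(bill_dir)).items():
            print(f"{field}: {type_name}")
    else:
        print(f"No data.json found in {bill_dir}; run Steps 1 and 2 first.")
```

Printing field names and types first gives an overview of the structure before you write code that depends on specific fields.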

Congratulations! You have successfully downloaded and processed your first set of congressional data.