Quick Start Guide
This guide provides a minimal "hello world" example to get you started with fetching congressional data. We will download and process data for bills and resolutions.
Prerequisites
Ensure you have completed all the steps in the Installation guide and have activated your Python virtual environment.
Step 1: Download Bill Status Bulk Data
The first step is to download the raw legislative data from GovInfo.gov. The govinfo command is used for this, specifically targeting the BILLSTATUS bulk data collection.
Run the following command in your terminal:
usc-run govinfo --bulkdata=BILLSTATUS
This command will download ZIP files containing XML data for all bills in all congresses and place them into the data/ directory, structured by congress number and bill type. For example:
data/117/bills/hr/BILLSTATUS-117hr1.xml
It also creates a cache/ directory to store downloaded pages and metadata to avoid re-downloading unchanged files in the future.
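To confirm that the download produced files, a short script can tally the raw XML files per congress and bill type. This is a minimal sketch, not part of the toolkit: the function name count_billstatus_files is hypothetical, and it assumes the data/<congress>/bills/<bill type>/ layout described above. It matches any XML file whose name contains "billstatus" (case-insensitive), at any depth below the bill type directory.

```python
from collections import Counter
from pathlib import Path


def count_billstatus_files(data_dir="data"):
    """Count raw bill status XML files per (congress, bill type).

    Assumption: files live under <data_dir>/<congress>/bills/<bill_type>/,
    either directly or inside a per-bill subdirectory.
    """
    counts = Counter()
    root = Path(data_dir)
    if not root.is_dir():
        return counts  # nothing downloaded yet

    for xml_path in root.rglob("*.xml"):
        # Keep only files that look like bill status data.
        if "billstatus" not in xml_path.name.lower():
            continue
        parts = xml_path.relative_to(root).parts
        # Expected shape: (<congress>, "bills", <bill_type>, ..., <file>)
        if len(parts) >= 4 and parts[1] == "bills":
            counts[(parts[0], parts[2])] += 1
    return counts


if __name__ == "__main__":
    for (congress, bill_type), total in sorted(count_billstatus_files().items()):
        print(f"congress {congress}, type {bill_type}: {total} file(s)")
```

Run it from the directory that contains data/. If it prints nothing, no matching files were found.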
Step 2: Process the Raw Data into JSON and XML
Now that you have the raw XML files from the government, you can process them into a more developer-friendly format using the bills command.
usc-run bills
This script scans the data/ directory for the BILLSTATUS files you just downloaded, parses them, and generates two structured output files for each bill:
data.json: A clean, comprehensive JSON file.
data.xml: A backward-compatible XML file in the GovTrack.us format.
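Because each bill should end up with both output files, one quick sanity check is to look for directories that contain data.json but no data.xml next to it. The following is a sketch under that assumption only. The helper name find_incomplete_bills is hypothetical, and it searches data/ recursively, so it does not depend on the exact directory layout.

```python
from pathlib import Path


def find_incomplete_bills(data_dir="data"):
    """Return directories that hold data.json but no sibling data.xml."""
    root = Path(data_dir)
    if not root.is_dir():
        return []

    incomplete = []
    for json_path in root.rglob("data.json"):
        # Both output files are expected to sit in the same directory.
        if not (json_path.parent / "data.xml").is_file():
            incomplete.append(json_path.parent)
    return sorted(incomplete)


if __name__ == "__main__":
    for bill_dir in find_incomplete_bills():
        print(f"missing data.xml: {bill_dir}")
```

An empty result means that every data.json found has a matching data.xml. It does not prove that every downloaded bill was processed.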
Step 3: Inspect the Output
After the bills command completes, your data directory will be populated with the processed files. For example, the data for H.R. 1 in the 117th Congress would be located at:
data/117/bills/hr/hr1/
├── data.json
├── data.xml
└── fdsys_billstatus.xml
data.json contains the structured bill data.
data.xml contains the same data in the legacy XML format.
fdsys_billstatus.xml is the original raw XML file downloaded in Step 1.
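A simple way to inspect data.json is to load it with Python's standard json module and list its top-level keys. This guide does not document the schema, so the sketch below makes no assumption about field names. The helper top_level_keys is hypothetical, and the path is the H.R. 1 example shown above.

```python
import json
from pathlib import Path


def top_level_keys(json_path):
    """Load a processed bill file and return its top-level keys, sorted."""
    with open(json_path, encoding="utf-8") as handle:
        bill = json.load(handle)
    if not isinstance(bill, dict):
        raise ValueError(f"{json_path} does not contain a JSON object")
    return sorted(bill)


if __name__ == "__main__":
    # Example path from this guide; adjust to any bill you processed.
    path = Path("data/117/bills/hr/hr1/data.json")
    if path.is_file():
        for key in top_level_keys(path):
            print(key)
    else:
        print(f"{path} not found; run the bills command first")
```

Printing the keys first gives a quick view of what a bill record contains before you write code against specific fields.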
Congratulations! You have successfully downloaded and processed your first set of congressional data.
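To repeat both steps from a script instead of typing them by hand, the two commands can be chained with Python's subprocess module. This is only a sketch: run_quick_start is a hypothetical helper, it uses exactly the two commands from this guide, and it assumes usc-run is on your PATH (that is, the virtual environment is active).

```python
import subprocess

# The two commands from this guide, in the order they must run.
COMMANDS = [
    ["usc-run", "govinfo", "--bulkdata=BILLSTATUS"],
    ["usc-run", "bills"],
]


def run_quick_start(run=subprocess.run):
    """Run the download step, then the processing step.

    check=True makes subprocess.run raise CalledProcessError if a command
    exits with a non-zero status, so processing never starts after a
    failed download.
    """
    for command in COMMANDS:
        run(command, check=True)
```

Calling run_quick_start() from the project directory runs both steps in order. The run parameter exists only so the function can be exercised without invoking the real commands.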