Overview

This project provides a suite of Python tools designed to collect, structure, and simplify data about the United States Congress. Originally developed by GovTrack.us and the Sunlight Foundation, it is now a community-run project dedicated to providing public domain data on bills, amendments, roll call votes, nominations, and more.

The tools download official data from sources such as Congress.gov's bulk data repository and GovInfo.gov, then convert it into JSON and XML.

Key Features

This toolkit can collect a wide variety of congressional data, including:

  • Bills and Amendments: Downloads official bulk bill status data and converts it into a more user-friendly format.
  • Roll Call Votes: Scrapes House and Senate roll call votes.
  • Presidential Nominations: Scrapes presidential nominations in Congress.
  • Committee Meetings: Fetches upcoming committee meeting schedules.
  • Official Documents: Fetches documents from GovInfo.gov, which holds bill text, statutes, and other official documents. The fetcher can be limited to newly updated files.
  • Statutes at Large: Processes the official compilation of all laws and resolutions.
  • Historical Data: Includes tools to import historical vote data from VoteView and historical bill data from the Adler & Wilkerson Congressional Bills Project.
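
The "download only newly updated files" behavior mentioned for the GovInfo.gov fetcher can be pictured with a small sketch. This is not the project's own code: it assumes the remote source exposes a last-modified stamp per file, and the names select_updated, record_fetched, and the state file are hypothetical. The idea is to remember the stamp seen at the last fetch and skip any file whose stamp has not changed.

```python
import json
from pathlib import Path


def _load_state(state_path):
    """Read the saved name -> timestamp map, or start empty on the first run."""
    if state_path.exists():
        return json.loads(state_path.read_text(encoding="utf-8"))
    return {}


def select_updated(remote_lastmod, state_file):
    """Return the names whose remote stamp differs from the recorded one.

    remote_lastmod is a dict of file name -> last-modified string
    (a hypothetical shape, chosen for illustration).
    """
    seen = _load_state(Path(state_file))
    return [
        name
        for name in sorted(remote_lastmod)
        if seen.get(name) != remote_lastmod[name]
    ]


def record_fetched(names, remote_lastmod, state_file):
    """Save the stamps of files that were just downloaded successfully."""
    state_path = Path(state_file)
    seen = _load_state(state_path)
    for name in names:
        seen[name] = remote_lastmod[name]
    state_path.write_text(
        json.dumps(seen, indent=2, sort_keys=True), encoding="utf-8"
    )
```

A caller would pass the current remote listing to select_updated, download only the returned names, and then call record_fetched for those that succeeded, so a failed download is retried on the next run.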

Data Output

The scrapers cache downloaded pages in a top-level cache directory and output structured data into a top-level data directory.

For each data object (like a bill or vote), two primary files are generated:

  • data.json: A detailed JSON representation of the data.
  • data.xml: An XML version that maintains backward compatibility with the format historically provided by GovTrack.us.
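
As a rough illustration of consuming this output, the sketch below walks a data directory and loads every data.json it finds. The helper name iter_data_json is hypothetical, and the code makes no assumption about how subdirectories are organized or which keys each object contains, since those depend on the type of data collected.

```python
import json
from pathlib import Path


def iter_data_json(data_root):
    """Yield (path, parsed_object) for each data.json found under data_root."""
    for path in sorted(Path(data_root).rglob("data.json")):
        with path.open(encoding="utf-8") as f:
            yield path, json.load(f)


if __name__ == "__main__":
    # "data" is the top-level output directory described above.
    for path, obj in iter_data_json("data"):
        # Print only the top-level keys; their meaning varies by object type.
        keys = sorted(obj) if isinstance(obj, dict) else type(obj).__name__
        print(path, keys)
```

Searching recursively keeps the sketch independent of the exact directory layout, at the cost of scanning the whole tree on each call.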

Project Goals

This project aims to provide a reliable, open-source way to access congressional data. By publishing this information in simple, structured formats, it lets developers, journalists, researchers, and the public build tools and study the legislative process.

For more information on the broader context of congressional data, see the Congressional Data Coalition.