Congressional Bills Project

Codebooks and crosswalks


The Variables Codebook

The format of the bills data has changed. In particular, instead of dummy variables for committee of referral, committee membership, etc., we now have single 'array' variables. The dummy variables can easily be recreated in Excel using a search formula such as:

=IF( ISERROR( SEARCH( "102",H2 ) )=TRUE, 0, 1 )

Here '102' is the committee number you are interested in and H2 is the cell you are searching. If you put this formula in a different column, it will populate that column with a 0 or 1 depending on whether 102 is present in H2.
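For users working outside Excel, the same dummy can be sketched in Python. This is a minimal illustration, not the project's own code. It assumes each array value is a string of numeric committee codes separated by some non-digit character; the function name and the sample values are hypothetical. Unlike SEARCH, which matches substrings, this version matches whole codes only, so '102' does not match inside a longer number such as '1102'.

```python
import re

def committee_dummy(array_value, committee_code):
    """Return 1 if committee_code appears as a whole code in array_value, else 0."""
    # Assumption: codes are runs of digits separated by any non-digit characters.
    codes = re.findall(r"\d+", str(array_value))
    return 1 if str(committee_code) in codes else 0

# Hypothetical values of an array variable for three bills.
referrals = ["102;134", "1102", ""]
dummies = [committee_dummy(value, 102) for value in referrals]
print(dummies)  # [1, 0, 0]
```

If the actual delimiter or code format differs, only the regular expression should need to change.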

Committee names can be found here.

The Database of Congressional Historical Statistics provides the values for most of the biographical variables (such as party or state code).

Important Bills [updated 4/20/2017]. This filter is based on the presence of certain words in a bill's title (below) and can be used to exclude bills that are arguably of minor importance. For example, bills to name buildings are fairly common and make up a large proportion of the laws that are passed. We do a good job of excluding these bills. Other bills transfer small plots of land (or buildings) from the federal government to local governments. We try to identify these with a couple of terms ("land exchange" and "exchange of land"), but we also know that there are quite a few bills that "transfer" land that we haven't included, because the "transfer" key term ends up producing too many false positives. Users of this variable will probably want to use the private bill filter as well (or major topic = 99).

=IF(OR(ISNUMBER(SEARCH({"medal","coin","name","technical correction","stamp","land exchange","suspend temporarily","extend the temporary","boundar","exchange of land"},AA2))),1,0)

We then reviewed all bills with "designate" in their titles and tagged those that named buildings.
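The keyword search in the formula above can be sketched in Python as follows. This is an illustrative equivalent, not the code used to build the variable. The term list is copied from the formula; the function name and the sample titles are hypothetical. As with SEARCH, matching is case-insensitive and returns 1 when any term is found. The manual review of "designate" titles is not reproduced.

```python
# Key terms copied from the Excel formula above.
MINOR_TERMS = (
    "medal", "coin", "name", "technical correction", "stamp",
    "land exchange", "suspend temporarily", "extend the temporary",
    "boundar", "exchange of land",
)

def matches_minor_term(title):
    """Return 1 if any key term occurs in the title (case-insensitive), else 0."""
    lowered = title.lower()
    return 1 if any(term in lowered for term in MINOR_TERMS) else 0

# Hypothetical titles, for illustration only.
print(matches_minor_term("A bill to authorize a land exchange in Nevada"))  # 1
print(matches_minor_term("A bill to amend the Clean Air Act"))  # 0
```

Because the match is on substrings, a term such as "name" also matches "rename" or "named", and "boundar" matches both "boundary" and "boundaries".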

The Topics Codebook

We label bill titles using the topic coding system of the Policy Agendas Project/Comparative Agendas Project. In 2014 the PAP system was modified and the codebook updated, and we have altered the major and minor topic labels of congressional bills to reflect these changes. The earlier bill codes (oldMajor, oldMinor) are included through the 112th Congress.

The topic coding system is mutually exclusive (only one topic for each bill title). Thus researchers should not assume that every bill relating to 'air pollution' (for example) will be coded as '705.' In addition, topics are based on titles 'as introduced,' and bill titles (and substance) can change (we use short descriptions for the pre-1973 period).

We strive for 90% interannotator reliability at the major topic level, and 80% at the subtopic level during the training process. Annotators train for an academic quarter and then we assign bills to individual annotators.


Committee Membership. The Policy Agendas committee codes we use are different from those used in the Committee membership database maintained by Professor Charles Stewart at MIT. This Committees Crosswalk facilitates efforts to integrate information from the two datasets.

ICPSR - NOMINATE crosswalk. The ICPSR numbers used in some other datasets are notoriously confusing. Two members may have the same ICPSR number, and the same member may have been assigned different numbers in different Congresses. Nevertheless, users may still want to integrate the valuable data contained in those datasets. This Member ID crosswalk can be used to associate old ICPSR numbers with those used in the bills project: the revised member IDs developed by Keith Poole and Howard Rosenthal as part of the NOMINATE project.
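A lookup of this kind might be sketched in Python as below. The layout is an assumption: the sketch supposes that each crosswalk row pairs a Congress and an old ICPSR number with a revised member ID, which may not match the actual file. All names and ID values are hypothetical. Keying on the pair rather than on the old number alone is one way to handle numbers that are shared or that change between Congresses.

```python
# Hypothetical crosswalk rows: (congress, old_icpsr, revised_member_id).
crosswalk_rows = [
    (95, 1001, 50001),
    (96, 1001, 50002),  # same old number, different member
    (96, 2002, 50001),  # same member, different old number
]

# Key on (congress, old number), since the old number alone may be ambiguous.
lookup = {(congress, old_id): new_id for congress, old_id, new_id in crosswalk_rows}

def revised_id(congress, old_icpsr):
    """Return the revised member ID, or None if the pair is not in the crosswalk."""
    return lookup.get((congress, old_icpsr))

print(revised_id(96, 1001))  # 50002
print(revised_id(97, 1001))  # None
```

Returning None for unmatched pairs makes it easy to count records that failed to link before merging.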














© 2004, University of Washington