For this week’s KUNC story on lobbyists making the big bucks, I spent considerable time working with data provided by the Colorado Secretary of State via FTP.
Initially, I’m looking at lobbyists, their clients and their income. I’m also working on building some interactive webpages using the data.
To do this, I used Jupyter Notebook, Python and Pandas. Ben Welsh, of the Los Angeles Times, offers some great tutorials for such analysis, and I took his online course and a NICAR course from him earlier this year.
One of the big advantages of this method of analysis is the documentation. You can check my work at this GitHub repo, and look at the .ipynb files for the code-heavy details.
Or check out these methodology bullet points:
- Names of lobbyists and clients are often in two or three different fields, so they’ve been combined.
- There’s a frustrating combination of ALL UPPERCASE and traditional Title Case in these databases. I’ve switched it all to Title Case, but that’s created its own issues with names that have additional upper-case letters in them.
- These names are also inconsistent. This data is entered by humans, so there are misspellings and often different ways of referring to the same company or firm. To rectify this, I compared lists of lobbyist names and client names, then used code to standardize them – just for fiscal year 2017, the year my KUNC story is looking at. (True confession: There are more than 1,000 clients, and every time I go through the list I find new duplicates.)
- A major caveat with this data is that several lobbying firms report income from lots of clients and also report what they pay the firm’s own lobbyists. That means some income is counted twice in the overall income file. So I went through the list of lobbyists and identified those being paid by firms that had also reported income from clients. I removed these individual lobbyists and their incomes from the analysis.
- One lobbyist reported her income as an individual and as an LLC. I removed the duplicates.
- Another big caveat: Many lobbyists hire each other as subcontractors. When lobbyists pay subcontractors, that takes money out of their gross income. Subtracting those payments, along with lobbying firm payments to their principals, brought the total 2017 income down from about $39 million in gross income to $30 million in net income.
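For readers who want a feel for these cleanup steps, here is a minimal sketch in plain Python. It is a simplified illustration, not the code from the notebooks (which use Pandas): the function names, the alias table and the sample names and dollar figures are all hypothetical.

```python
# Hypothetical alias table, built by hand: variant spelling -> canonical name.
ALIASES = {"Acme Corp": "Acme Corporation"}


def clean_name(*parts, aliases=ALIASES):
    """Combine name fields, normalize case and map known variants."""
    # Join the non-empty fields and collapse repeated whitespace.
    joined = " ".join(" ".join(p for p in parts if p).split())
    # Title Case. Note this flattens internal capitals ("MCDONALD" -> "Mcdonald").
    titled = joined.title()
    return aliases.get(titled, titled)


def drop_double_counted(rows, firm_paid):
    """Remove lobbyists whose pay is already counted in their firm's income."""
    return [r for r in rows if r["lobbyist"] not in firm_paid]


def net_income(rows, subcontractor_payments):
    """Gross income minus what lobbyists paid out to subcontractors."""
    gross = sum(r["income"] for r in rows)
    return gross - sum(subcontractor_payments)


# Made-up sample records; the real files have many more columns and rows.
rows = [
    {"lobbyist": clean_name("BIG", "FIRM", "LLC"),
     "client": clean_name("ACME  CORP"), "income": 1000},
    {"lobbyist": clean_name("Jane", "", "Doe"),
     "client": clean_name("Big Firm Llc"), "income": 400},
    {"lobbyist": clean_name("SAM", "ROE"),
     "client": clean_name("Acme Corporation"), "income": 500},
]

# Jane Doe is paid by Big Firm, which already reports its client income.
deduped = drop_double_counted(rows, firm_paid={"Jane Doe"})
total = net_income(deduped, subcontractor_payments=[200])
print(total)
```

The alias table is the labor-intensive part: each entry comes from reading the name lists by eye, which is why new duplicates keep turning up. The Title Case step also shows where names with internal capitals go wrong.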
There are more stories in this data that I’m pursuing. If you have ideas or feedback, leave a comment or email fish (at) copolitics.co