22 September 2012

Federal Register API/R Package Ideas?

The other day Critical Juncture put up an API for the Federal Register. I thought it would be great if there were a package that could use this API to download data directly into R (much like the excellent WDI package).

This would make it easier to analyse things like:

  • the frequency of regulations issued on a particular issue over a given period of time,

  • the text of the actual regulations.

The nice people over at Critical Juncture tweeted me showing interest in the idea and wondering what would be useful.

I was thinking that the package could have commands such as getFedRegister and getMultiFedRegister that would do pretty much what the API is already set up to do, except download the data into an R object rather than straight to JSON or CSV.
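To make the idea concrete, here is a very rough sketch of what a getFedRegister command might look like, assuming the RCurl and RJSONIO packages. The function name comes from the idea above, but the endpoint URL pattern and the argument are illustrative assumptions, not a finished design:

```r
# Hypothetical sketch only: the endpoint URL pattern and the shape of the
# returned data are assumptions, not an actual package design.
library(RCurl)
library(RJSONIO)

getFedRegister <- function(doc.number) {
  # Build the (assumed) URL for a single Federal Register document
  url <- paste("https://www.federalregister.gov/api/v1/articles/",
               doc.number, ".json", sep = "")
  raw <- getURL(url)  # download the JSON
  fromJSON(raw)       # parse into an R list for further cleaning
}
```

A getMultiFedRegister version could then loop over a vector of document numbers and bind the results into a single data frame for analysis.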

More Ideas?

Any other ideas for things that might be useful? Just leave them in the comments at my Tumblr site.

17 September 2012

Over at the miscellaneous blog . . .

Posts I recently put up at my miscellaneous things blog:

  1. Dutch election results mapped!
  2. Research on optical illusions and primary visual cortex size.

Create Beamer/knitr Lecture Slideshow with Bash, Explain the Script with knitr

Setting up a beamer slideshow is tedious. Creating new slideshows with the same header/footer/style files every week for your course lectures is very very tedious.

To solve this problem I created a simple bash shell script. When you run the script in your terminal it asks whether you want to create a "Lecture" or "Seminar" and what number you want it to have. Then it does the rest.
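The gist of the script is something like the sketch below. The real script prompts interactively; in this sketch the type and number are taken as arguments with defaults, and the Templates directory name is an illustrative assumption:

```shell
#!/bin/bash
# Sketch of the idea only; file and directory names are assumptions,
# not the actual script.
Type="${1:-Lecture}"    # "Lecture" or "Seminar"
Number="${2:-1}"        # which lecture/seminar this is

Dir="${Type}${Number}"
mkdir -p "$Dir"

# Copy the shared header/footer/style files into the new directory
[ -d Templates ] && cp Templates/* "$Dir"/

echo "Created $Dir"
```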

You can find the script and all of the necessary files here.

To create the README file I used knitr version 0.8's new engine='bash' option. This allows you to knit bash code into your Markdown file the same way you would R code. It's pretty simple. See the R Markdown file for more details.
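A minimal example of what such a chunk looks like in an R Markdown file (the echo command is just a placeholder):

````markdown
```{r engine='bash'}
# This chunk is run by bash rather than R when the file is knitted
echo "Hello from bash"
```
````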

Please feel free to take and modify the files. Also, if you can help streamline them that would be great.

Oh, a kind of related tip: if you want a bash command to show up over more than one line in your knitted document, place a backslash (\) at the end of the line.
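For example, this is one logical echo command displayed across two lines:

```shell
# One command wrapped over two lines with a trailing backslash
echo "a long command that continues" \
    "on the next line"
```

When knitted, the command shows up on two lines but runs as one.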

The beamer theme I use is based on something I hammered together a while ago. See this post for more details.

6 September 2012

Graphically Comparing Confidence Intervals From Different Models

In a recent paper on Federal Reserve inflation forecast errors (summary blog post, paper) I wanted a way to easily compare the coefficients for a set of covariates (a) estimated from different types of parametric models using (b) matched and non-matched data.

I guess the most basic way to do this would be to have a table of columns showing point estimates and confidence intervals from each estimation model. But making meaningful comparisons with this type of table would be tedious.

What I ended up doing was creating a kind of stacked caterpillar plot. Here it is:

[Figure: Comparing 95% Confidence Intervals (Gandrud & Grafström)]

I think this plot lets you clearly and quickly compare the confidence intervals estimated from the different models. I didn't include the coefficient point estimates because I was most interested in comparing the ranges. The dots added too much clutter.

I have a link to the full replication code at the end of the post, but these are the basic steps:

  1. I estimated the models using MatchIt and Zelig as per Ho et al. (2007). I created new objects from the results of each model.

  2. I used the confint command to find the 95% confidence intervals.

  3. I did some cleaning up and rearranging of the confidence intervals, mostly using Hadley Wickham's melt function in the reshape package. The basic idea is that to create the plots I needed a data set with columns for the coefficient names, the upper and lower confidence interval bounds, what parametric model the estimates are from, and whether the data set was matched or not. I removed the Intercept and sigma2 estimates for simplicity.

  4. I made the graph using ggplot2. The key aesthetic decisions that I think make it easier to read are: (a) making the lines a bit thicker and (b) making the bands transparent. I liked making the bands transparent and stacking them rather than showing different lines for each set of estimates because this halved the number of lines in the plot. Makes it much crisper.
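Assuming the cleaned-up confidence intervals end up in a long-format data frame, step 4 might look roughly like this sketch. The object and column names here (CIData, Variable, Lower, Upper, and a Model column identifying both the parametric model and whether the data were matched) are illustrative stand-ins, not the actual replication code:

```r
# Sketch only: CIData and its column names are hypothetical stand-ins for
# the real cleaned-up data set described in step 3.
library(ggplot2)

ggplot(CIData, aes(x = Lower, xend = Upper,
                   y = Variable, yend = Variable,
                   colour = Model)) +
  geom_segment(size = 2, alpha = 0.4) +  # thicker, transparent, stacked bands
  xlab("\n95% Confidence Interval") + ylab("") +
  theme_bw()
```

Stacking the transparent segments on the same line for each variable, rather than giving each set of estimates its own line, is what halves the number of lines in the plot.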

The full code for replicating this figure is on GitHub. Note: this code depends on objects that are created as the result of analyses run using other source code files, also on the GitHub site.