## 26 April 2012

### Graphing Predicted Legislative Violence with Zelig & ggplot2

Update 18 January 2013: This example works for Zelig versions before 4. For version 4 and later you will likely have to use Zelig's simulation.matrix command to extract the matrix of expected values.
In my previous post I briefly mentioned an early draft of a working paper (HERE) I've written that looks into the possible causes of violence between legislators (like the violence shown in this picture from the Turkish Parliament).
 From The Guardian

In this post I'm going to briefly discuss how I used Zelig's rare events logistic regression (relogit) and ggplot2 in R to simulate and plot the legislative violence probabilities that are in the paper. In this example I am plotting simulated probabilities at fitted values of three variables:

• Age of democracy (Polity IV > 5)
• A dichotomous electoral proportionality variable where 1 is high proportionality, 0 otherwise (see here for more details)
• Governing parties' majority (as a % of total legislative seats. Data is from DPI.)

#### Background

I used King and Zeng's rare events logistic regression, which is included in the R package Zelig, to study incidents of legislative violence because (a) I was interested in a dichotomous outcome (whether or not a legislature had an incident of violence in a given year) and (b) fortunately, legislative violence is fairly rare. There are only 88 incidents in my data set spanning 1981 to Spring 2011 and even fewer (72) when I restricted the sample to 1981-2009, because there is limited data on many of my independent variables after 2009.

#### Why ggplot2?

If you are familiar with the Zelig package, you'll know that it already includes a capability to both simulate quantities of interest (for me it's probabilities of violence given various values of the covariates) and plot the results from these simulations with uncertainty estimates.

To do this, first run the basic Zelig model, then use setx() to set the range of covariate fitted values you are interested in predicting probabilities for (all others are set to their means by default). Then use sim() to simulate the quantities of interest. Finally, just use plot() on the Zelig object that sim() creates. (See the full code at the end of the post.)
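To make the fit, set, and simulate steps concrete, here is a minimal base-R sketch of roughly what they amount to. It is not the Zelig code from the paper: it assumes simulated toy data, hypothetical variable names (`violence`, `dem_age`, `majority`), and an ordinary logit from glm() standing in for relogit.

```r
# Base-R analogue of the zelig() -> setx() -> sim() workflow.
# Assumptions: toy data, hypothetical variable names, plain logit (not relogit).
library(MASS)  # for mvrnorm()

set.seed(1234)

# Toy data: a fairly rare binary outcome and two covariates
n <- 2000
toy <- data.frame(
  dem_age  = sample(0:85, n, replace = TRUE),
  majority = runif(n, 30, 90)
)
toy$violence <- rbinom(n, 1, plogis(-2 - 0.03 * toy$dem_age + 0.005 * toy$majority))

# 1. Fit the model (the zelig() step)
fit <- glm(violence ~ dem_age + majority, data = toy, family = binomial)

# 2. Fitted values: dem_age from 0 to 85, majority held at its mean (the setx() step)
ages  <- 0:85
x_mat <- cbind(1, ages, mean(toy$majority))

# 3. Draw 1,000 coefficient vectors from their estimated sampling distribution
#    and compute the expected probability at every fitted value (the sim() step)
betas <- mvrnorm(1000, mu = coef(fit), Sigma = vcov(fit))
eta   <- betas %*% t(x_mat)
ev    <- 1 / (1 + exp(-eta))  # rows are simulations, columns are ages

dim(ev)
```

The resulting matrix has one row per simulation and one column per fitted value of democratic age, which is the same shape as the matrix of expected values discussed later in this post.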

However these plots are . . . not incredibly visually appealing.  Here is an example with various ages of democracy:

Plus, if you are not using base R plots in the rest of your paper, these types of plots will clash.

I used ggplot2 graphs in the rest of the paper so I wanted a way to plot simulated probabilities with ggplot2.  Basically I wanted this:

#### Using ggplot2 and Zelig Simulation Output

Once you have the Zelig object returned from sim() it is simply a matter of extracting the simulation results. The default is to run 1,000 simulations for each fitted value. Zelig stores these in qi$ev in the Zelig object. In this example the fitted values of democratic age (fitted at years 0 through 85) are in a Zelig simulation object that I called Model.DemSim. To extract the simulations of the predicted probabilities use:

    # Extract expected values from simulations
    Model.demAge.e <- (Model.DemSim$qi)

Now turn the object Model.demAge.e into a data frame and use melt() from reshape2 to reshape the data so that you can use it in ggplot2.

    # Create data.frame
    Model.demAge.e <- data.frame(Model.demAge.e$ev)

    # Melt data
    Model.demAge.e <- melt(Model.demAge.e, measure = 1:86)

Since the numbers in variable actually mean something (years of democracy) the final cleanup stage is to remove the “X” prefixes attached to variable.

    # Remove 'X' from variable
    Model.demAge.e$variable <- as.numeric(gsub("X", "", Model.demAge.e$variable))
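To see what the reshaping steps do, here is a small self-contained sketch on a made-up matrix. The object names (`ev_toy`, `ev_df`, `ev_long`) are hypothetical, and I am assuming the matrix of expected values has no column names, in which case data.frame() labels the columns X1, X2, and so on.

```r
library(reshape2)

set.seed(42)

# Hypothetical stand-in for the expected values:
# 5 simulations (rows) at 3 fitted values (columns), no column names
ev_toy <- matrix(runif(15, 0, 0.2), nrow = 5, ncol = 3)

# data.frame() names unnamed matrix columns X1, X2, X3
ev_df <- data.frame(ev_toy)
names(ev_df)

# Wide to long: one row per simulation draw per fitted value
ev_long <- melt(ev_df, measure.vars = 1:3)

# Strip the "X" prefix. Assumption: column 1 corresponds to fitted value 0,
# so subtract 1 to recover the value itself
ev_long$variable <- as.numeric(gsub("X", "", ev_long$variable)) - 1

head(ev_long)
```

Note the `- 1` in the last step. If your fitted values start at 0 and the columns are labelled from X1, the stripped numbers will be one higher than the years they stand for. Whether you need the shift depends on how your version of Zelig names the columns, so it is worth checking names() before melting.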
Now we can use Model.demAge.e as the source of data for geom_point() and stat_smooth() in ggplot2! You might want to drop simulation results outside the 2.5 and 97.5 percentiles to keep only the middle 95%. The red bars in the Zelig base plots represent the middle 95%. Right now I prefer keeping all of the simulation results and simply changing the alpha (transparency) of the points. This allows us to see all of the results, both outliers and those within the widely accepted, but still somewhat arbitrary, 95% bounds.
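Both options can be sketched as follows. This is a minimal illustration on made-up simulation draws, not the plot code from the paper: the data frame `sims`, the helper `plot_sims()`, the alpha level, and the axis labels are all assumptions.

```r
library(ggplot2)

set.seed(2012)

# Hypothetical long-format simulations: 1,000 draws at each of 86 fitted values
sims <- data.frame(variable = rep(0:85, each = 1000))
sims$value <- plogis(rnorm(nrow(sims), mean = -2 - 0.03 * sims$variable, sd = 0.4))

# Helper (hypothetical) so that both options share the same plot layers
plot_sims <- function(df) {
  ggplot(df, aes(x = variable, y = value)) +
    geom_point(alpha = 0.05) +   # low alpha: dense regions appear darker
    stat_smooth() +
    xlab("Age of democracy (years)") +
    ylab("Simulated probability of violence")
}

# Option 1: keep every draw and let transparency show where they concentrate
p_all <- plot_sims(sims)

# Option 2: keep only the middle 95% of draws at each fitted value
lower <- ave(sims$value, sims$variable,
             FUN = function(v) quantile(v, 0.025, names = FALSE))
upper <- ave(sims$value, sims$variable,
             FUN = function(v) quantile(v, 0.975, names = FALSE))
sims_95 <- sims[sims$value >= lower & sims$value <= upper, ]
p_95 <- plot_sims(sims_95)
```

Printing `p_all` or `p_95` draws the plot. The trimming is done separately within each fitted value, so every year of democratic age keeps its own middle 95% rather than a band computed over the pooled draws.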

Here is the full code for completely reproducing the last plot above (which is also in the working paper). The last thing to mention is that subsetting the data with complete.cases() to keep only observations with full data on all variables is a crucial step to take before running zelig().
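As a minimal sketch of the complete.cases() step, assuming a toy data frame with hypothetical variable names (not the paper's data set):

```r
# Hypothetical toy data with missing values; variable names are illustrative only
leg <- data.frame(
  violence  = c(0, 1, 0, 0, NA, 0),
  dem_age   = c(10, 3, NA, 45, 20, 85),
  high_prop = c(1, 0, 1, NA, 0, 1),
  majority  = c(55.2, 61.0, 48.9, 70.3, 52.1, 66.7),
  notes     = c(NA, "a", "b", NA, "c", "d")  # not used in the model
)

# Variables that enter the model
model_vars <- c("violence", "dem_age", "high_prop", "majority")

# Keep only the model variables, and only rows with no missing values on them
leg_complete <- leg[complete.cases(leg[, model_vars]), model_vars]

nrow(leg_complete)  # 3
```

Selecting the model variables before calling complete.cases() means a row is not dropped just because a column the model never uses, such as `notes` here, happens to be missing.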

## 22 April 2012

### Causes of Legislative Violence

Update (November 2013): Following really helpful comments from a number of people, I've refined the theory further. Please see the most updated version of the paper at SSRN.
Update (6 May 2012): I've updated the framework a little since I first wrote this post. I've refined it to focus on majoritarian vs. consensual systems rather than fairness. Please see the most updated version of the paper at SSRN.

If you've ever wondered why physical fights sometimes break out between legislators, like the one in the above picture from the Ukrainian parliament, you might be interested in a working paper that I just put together. The working paper is called:

"Two Sword Lengths: Losers' Consent and Violence in National Legislatures".

The title refers to the rumor that in the UK House of Commons the government and opposition benches are two sword lengths apart to prevent actual duels.