ggplot2 graph of counts in different categories from a check-all-that-apply survey question

Recently, I needed to make a graph showing the number of survey respondents in each of a set of categories. The data came from a ‘check all that apply’ survey question (not my favorite), so each respondent could belong to anywhere from one to four categories. The data were in wide format, meaning there was one column per category.

Example data (yes, I know the categories don’t make sense as answers to a check all that apply question, but the point is the same).

Example done in RStudio with R 3.3.2.

library(reshape2)
library(ggplot2)
library(plyr)

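df = data.frame("ID" = seq(1, 5, 1),
     "A" = c(NA, "About once a day", NA, "About once a day", "About once a day"),
     "B" = c("About once a week", NA, NA, "About once a week", "About once a week"),
     "C" = c(NA, "About once a month", NA, NA, "About once a month"),
     stringsAsFactors = TRUE) #the default in R 3.3.2; stated explicitly because R 4.0.0 and later default to FALSE, and droplevels below needs factors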

My first step was to reshape the data to a long format, with one column for category.

longdf = reshape(df, varying = c("A", "B", "C"), v.names = "Category",
     timevar = "Category", times = c("A", "B", "C"), idvar = "ID", direction = "long", sep = "_")
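To make the long layout concrete, here is a minimal base R sketch that stacks the three answer columns by hand instead of calling reshape. It is only an illustration, not the code I used for the real data: the names answer_cols, long and Column are made up for this example, and the strings are kept as plain characters rather than factors.

```r
# Same example data, with strings kept as plain characters
df <- data.frame(
  ID = 1:5,
  A = c(NA, "About once a day", NA, "About once a day", "About once a day"),
  B = c("About once a week", NA, NA, "About once a week", "About once a week"),
  C = c(NA, "About once a month", NA, NA, "About once a month"),
  stringsAsFactors = FALSE
)

answer_cols <- c("A", "B", "C")  # hypothetical helper name

# Stack the answer columns: one row per respondent-column pair
long <- data.frame(
  ID = rep(df$ID, times = length(answer_cols)),
  Column = rep(answer_cols, each = nrow(df)),
  Category = unlist(df[answer_cols], use.names = FALSE),
  stringsAsFactors = FALSE
)

nrow(long)                  # 15 rows: 5 respondents x 3 columns
sum(is.na(long$Category))   # 7 rows are unchecked boxes (NA)
```

Every unchecked box shows up as its own row with a missing Category, which is why the rows still need cleaning up before counting.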

Then I dropped the extra level left over from the reshape. This takes two steps. First, subset the data to keep only the rows that do not have the unwanted level; in this case it’s "", an artifact of the reshaped data frame. (NOTE: I’m sure there is a cleaner way to reshape that doesn’t give you this level, but this worked and I was in a hurry.) The subset also discards rows where Category is NA, because subset keeps only rows where the condition is TRUE. Second, after the subset, explicitly drop the now-unused level with droplevels.

longdf2 <- subset(longdf, Category != "")
longdf2$Category = droplevels(longdf2$Category)
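If it isn’t obvious why the second step is needed, here is a small sketch on made-up toy data (toy and kept are hypothetical names, not part of the survey data). It shows that subset removes the rows but the factor still remembers the empty level until droplevels is called.

```r
# Toy factor with an unwanted "" level and one missing value (made-up data)
toy <- data.frame(
  Category = factor(c("", "About once a day", NA, "About once a week", ""))
)

# Drops the two "" rows and also the NA row
kept <- subset(toy, Category != "")
nrow(kept)

# The "" level is still listed even though no row uses it
levels(kept$Category)

# Explicitly remove levels that no longer occur in the data
kept$Category <- droplevels(kept$Category)
levels(kept$Category)
```

Without the droplevels step, the leftover level can still show up later, for example as a zero count in table().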

Now that the data are in the right format it’s time to make the graph.

ggplot(longdf2, aes(x = Category, fill = Category)) + #set x axis to Category, no y needed, use fill = Category to get color by Category
  geom_bar() + #bar graph of counts
  theme_bw() + #set theme to black and white, grey gridlines
  ggtitle("Title") + #title
  labs(y = "Number of respondents", x = "") + #labs on x and y axes
  theme(plot.title = element_text(size = 22)) + #title font size
  scale_fill_manual(values = c("cyan1", "cyan4", "cornsilk3")) + #manual set colors
  theme(axis.title = element_text(face = "bold", size = 22)) + #axis title text options, bold, size
  theme(axis.text.x = element_text(angle = 50, hjust = 1)) + #axis text options for x axis only, rotate labels 50 degrees and right-justify so that they don't overlap
  theme(axis.text = element_text(color = 1, size = 18)) + #axis text option, color and size
  theme(legend.position = "none") #remove legend

And there you have it! A bar graph of the number of respondents in each category.
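One optional tweak: ggplot2 places the bars in the order of the factor levels, and the unnamed colors in scale_fill_manual are matched in that same order. By default the levels are alphabetical, so "About once a month" lands before "About once a week". Here is a minimal sketch of setting the order by hand, using a short made-up vector (answers and ordered_answers are hypothetical names); the same idea should apply to longdf2$Category before plotting.

```r
# Made-up answers, stored as a factor with default (alphabetical) levels
answers <- factor(c("About once a week", "About once a day",
                    "About once a month", "About once a day",
                    "About once a week"))
levels(answers)  # day, month, week

# Re-create the factor with the levels in the order the bars should appear
ordered_answers <- factor(
  answers,
  levels = c("About once a day", "About once a week", "About once a month")
)
levels(ordered_answers)  # day, week, month

# The counts are unchanged, only their order differs
table(ordered_answers)
```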

 
