Graphs can provide an excellent way to emphasize a point and to quickly and efficiently show important information. Sadly, poor graphs can be a good way to waste space in an article, take up time in a presentation, and waste a lot of ink all while providing little to no information.

Excel has made it possible to make all sorts of graphs. However, just because a graph looks like a spider web or like something you can eat for dessert doesn’t mean you should use it.

This discussion shows five options for graphing Likert scale data, notes best and common practice for each, and provides the R code for each graph. The approaches come from a list I have compiled of methods that people I have worked with use to graph and interpret Likert scales within their organizations.

Likert scales usually have 5 or 7 response options. However, the exact number, and whether there should be an odd or even number of responses, is a topic for another psychometric discussion. A typical Likert scale is:

1 Strongly Agree

2 Agree

3 Neutral

4 Disagree

5 Strongly Disagree
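
When working with responses like these in R, it can help to store them as an ordered factor so that the labels and their order travel with the data. The following is a minimal sketch and not part of the original analysis; the names `responses`, `likert.labels`, and `responses.f` are hypothetical and used only for illustration.

```r
# Hypothetical example responses coded 1-5 (illustration only)
responses <- c(1, 2, 2, 3, 5, 4, 1)

# Labels in the same order as the scale shown above
likert.labels <- c("Strongly Agree", "Agree", "Neutral",
                   "Disagree", "Strongly Disagree")

# Ordered factor: keeps the labels and the ordering of the options
responses.f <- factor(responses, levels = 1:5,
                      labels = likert.labels, ordered = TRUE)

# Counts per option, including options with zero responses
table(responses.f)
```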

For example purposes I generated some random discrete data formatted as a Likert scale. I created three groups to show the extremes of Likert scale responses: one polarized, one concentrated on the neutral option, and one uniform.

set.seed(1234)
library(e1071)  # provides rdiscrete()

# Response probabilities for the three groups (one column per group):
# polarized, concentrated on neutral, and uniform
probs <- cbind(c(.4, .2/3, .2/3, .2/3, .4),
               c(.1/4, .1/4, .9, .1/4, .1/4),
               c(.2, .2, .2, .2, .2))
my.n <- 100

# Simulate my.n responses per group; column 1 = group, column 2 = response
raw <- NULL
for (i in 1:ncol(probs)) {
  raw <- rbind(raw, cbind(i, rdiscrete(my.n, probs = probs[, i], values = 1:5)))
}

# Group means with normal-approximation 95% confidence limits
grp.mean <- tapply(raw[, 2], raw[, 1], mean)
grp.se <- sqrt(tapply(raw[, 2], raw[, 1], var) / tapply(raw[, 2], raw[, 1], length))
z <- qnorm(1 - .05/2, 0, 1)
r <- data.frame(cbind(as.numeric(row.names(grp.mean)),
                      grp.mean,
                      grp.mean + grp.se * z,
                      grp.mean - grp.se * z))
# Column 3 is the upper limit and column 4 is the lower limit
names(r) <- c("group", "mean", "ul", "ll")

# Counts of each response (rows) by group (columns)
gbar <- tapply(raw[, 2], list(raw[, 2], raw[, 1]), length)
sgbar <- data.frame(cbind(c(1:max(unique(raw[, 1]))), t(gbar)))
sgbar.likert <- sgbar[, 2:6]
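
If the *e1071* package is not available, a similar data set can be simulated with base R alone. This is a minimal sketch under that assumption, not the code used for the graphs in this post; `sim.raw` and `sim.counts` are hypothetical names, and because `sample()` draws differently from `rdiscrete()` the simulated values will not match the ones above. Using `table()` on factors with fixed levels also reports a count of 0, rather than `NA`, when a response option is never chosen.

```r
set.seed(1234)

# Same three probability profiles as above (one column per group)
probs <- cbind(c(.4, .2/3, .2/3, .2/3, .4),
               c(.1/4, .1/4, .9, .1/4, .1/4),
               c(.2, .2, .2, .2, .2))
my.n <- 100

# Draw my.n responses per group with base R's sample()
sim.raw <- do.call(rbind, lapply(1:ncol(probs), function(i) {
  data.frame(group = i,
             response = sample(1:5, my.n, replace = TRUE, prob = probs[, i]))
}))

# Cross-tabulate; fixing the factor levels keeps all five options
sim.counts <- table(factor(sim.raw$response, levels = 1:5),
                    factor(sim.raw$group, levels = 1:ncol(probs)))
sim.counts
```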

**Diverging Stacked Bar Chart**

Diverging stacked bar charts are often the best choice when visualizing Likert scale data. There are various ways to produce these graphs, but I have found the easiest approach uses the *HH* package. Many graphs can be produced with this package; I have provided three here.

require(grid)
require(lattice)
require(latticeExtra)
require(HH)

# Response counts only (drop the group column)
sgbar.likert <- sgbar[, 2:6]

# 1. Default horizontal chart
likert(sgbar.likert,
       main = 'Example Diverging Stacked Bar Chart for Likert Scale',
       sub = "Likert Scale")

# 2. Vertical bars with the legend on the right
likert(sgbar.likert, horizontal = FALSE,
       aspect = 1.5,
       main = "Example Diverging Stacked Bar Chart for Likert Scale",
       auto.key = list(space = "right", columns = 1,
                       reverse = TRUE, padding.text = 2),
       sub = "Likert Scale")

# 3. Horizontal bars with a blue palette and a labeled x-axis
likert(sgbar.likert,
       auto.key = list(between = 1, between.columns = 2),
       xlab = "Percentage",
       main = "Example Diverging Stacked Bar Chart for Likert Scale",
       BrewerPaletteName = "Blues",
       sub = "Likert Scale")

**Mean Value**

Often researchers will simply take each response option and interpret it as a real number. Using this approach makes it very convenient to calculate the mean value and standard deviation (and confidence intervals). This is particularly useful when working with non-analytical clients. However, this is a controversial issue. Taking this approach requires a lot of statistical assumptions that may not be correct. For starters the response options really need to be equidistant from each other. For example, is the distance from *Strongly Agree* to *Agree* the same distance from *Disagree* to *Strongly Disagree*? This may be true for one question but it might not be true for all questions on a questionnaire. Different questions and question wording are quite likely going to have different distributions. Furthermore, confidence intervals require normality assumptions which may also be incorrect.

What makes matters worse is that if there are 50 respondents and 25 of them mark *Strongly Disagree* and 25 mark *Strongly Agree*, then the mean will be 3, implying that, on average, the results are neutral. But that clearly does not adequately describe the data. Ultimately, it is up to the statistician to work with the client to choose a method that appropriately conveys the message and that both parties can agree upon.
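
The 50-respondent example can be checked directly. The sketch below is illustrative only; the vector `polarized` is a hypothetical name, and the confidence limits use the same normal approximation as the code earlier in this post.

```r
# 25 respondents mark Strongly Agree (1), 25 mark Strongly Disagree (5)
polarized <- c(rep(1, 25), rep(5, 25))

# The mean lands on the Neutral value even though nobody chose it
mean(polarized)

# Normal-approximation 95% confidence limits around that mean
se <- sqrt(var(polarized) / length(polarized))
mean(polarized) + c(-1, 1) * qnorm(1 - .05/2) * se

# The counts show what the mean hides: two spikes and an empty middle
table(factor(polarized, levels = 1:5))
```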

plot.new()
par(mfrow = c(1, 1))

# Group means on the 1-5 response scale; axes are drawn manually below
plot(r$group, r$mean, type = "o", cex = 1, col = "blue", pch = 16,
     ylim = c(1, 5), lwd = 2,
     ylab = "Mean Value", xlab = "Group",
     main = "Likert Scale Mean Values Example",
     cex.sub = .60, xaxt = "n", yaxt = "n")
axis(1, at = (1:3), tcl = -0.7, lty = 1, lwd = 0.8, labels = TRUE)
axis(2, at = (1:5), labels = TRUE, tcl = -0.7, lty = 1)
abline(h = c(1:5), col = "grey")

# Confidence limits around the means
lines(r$group, r$ll, col = 'red', lwd = 2)
lines(r$group, r$ul, col = 'red', lwd = 2)
legend("topright", c("Mean", "Confidence Interval"),
       col = c('blue', 'red'), title = "Legend", lty = 1, lwd = 2, inset = .05)

**Pies and Multiple Pies**

A table is nearly always better than a dumb pie chart; the only worse design than a pie chart is several of them, for then the viewer is asked to compare quantities located in spatial disarray both within and between charts (…). Given their low density and failure to order numbers along a visual dimension, pie charts should never be used. — Edward Tufte

Pie charts are notoriously poor at conveying the information that was intended. As far as pie charts go, I don’t ever use them. There are far better ways to visualize data. However, I have heard people give reasons for using them that are somewhat justified and are generally based on the ‘eye candy’ argument. But as far as creating a graph that both provides information and looks good, a 3-D pie chart is probably not the best choice. I debated whether I should even include the R code for the example, but to provide full disclosure, here’s the code.

# Response counts for group 1
my.table <- table(raw[, 2][raw[, 1] == 1])
names(my.table) <- c("Strongly Agree", "Agree", "Neutral", "Disagree", "Strongly Disagree")
# Label each slice with its name and count
labl <- paste(names(my.table), "\n", my.table, sep = "")
pie(my.table, labels = labl, main = "Example Pie Chart of Likert Scale")

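plot.new()
num.groups <- length(unique(raw[, 1]))
par(mfrow = c(1, num.groups))
for (j in 1:num.groups) {
  my.table <- table(raw[, 2][raw[, 1] == j])
  names(my.table) <- c("Strongly Agree", "Agree", "Neutral", "Disagree", "Strongly Disagree")
  # Rebuild the labels so each pie shows its own group's counts
  labl <- paste(names(my.table), "\n", my.table, sep = "")
  pie(my.table, labels = labl,
      main = paste("Example Pie Chart of\nLikert Scale Group ", j))
}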

library(plotrix)  # provides pie3D()

# Uses my.table as left by the loop above (the last group)
slices <- my.table
names(my.table) <- c("Strongly Agree", "Agree", "Neutral", "Disagree", "Strongly Disagree")
labl <- paste(names(my.table), "\n", my.table, sep = "")
pie3D(slices, labels = labl, explode = 0.1, main = "3D Pie Chart Example")

**Grouped Bar Chart**

This is a nice approach when you want to look at each group and highlight a particular Likert response option. Here it is easy to see that in Group 2 the *Neutral* option is by far the most common response.

par(mfrow = c(1, 1))
barplot(gbar, beside = TRUE, col = cm.colors(5),
        main = "Example Bar Chart of Counts by Group",
        xlab = "Group", ylab = "Frequency")
# names(my.table) holds the five response labels set in the pie chart code
legend("topright", names(my.table), col = cm.colors(5),
       title = "Legend", lty = 1, lwd = 15, inset = .1)

**Divided Bar Chart**

This isn’t a bad approach, and it is quite similar to the diverging stacked bar chart. It shows the stacked percentage for each category.

library(ggplot2)
library(reshape2)  # provides melt()

names(sgbar) <- c("group", "Strongly Agree", "Agree", "Neutral", "Disagree", "Strongly Disagree")
# Reshape to long format: one row per group and response category
mx <- melt(sgbar, id.vars = 1)
names(mx) <- c("Group", "Category", "Percent")
ggplot(mx, aes(x = Group, y = Percent, fill = Category)) +
  geom_bar(stat = "identity")
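
The chart above stacks counts, which coincide with percentages only because each group has 100 respondents. With unequal group sizes, the counts can be converted to row percentages first. A minimal sketch, with a hypothetical `counts` matrix standing in for real data:

```r
# Hypothetical counts: rows are groups, columns are response options
counts <- matrix(c(20,  5,  5,  5, 15,
                    2,  3, 40,  3,  2,
                   30, 30, 30, 30, 30),
                 nrow = 3, byrow = TRUE,
                 dimnames = list(group = 1:3,
                                 response = c("Strongly Agree", "Agree", "Neutral",
                                              "Disagree", "Strongly Disagree")))

# Row percentages: each group's responses sum to 100
pct <- 100 * prop.table(counts, margin = 1)
round(pct, 1)
```

The resulting `pct` matrix can then be reshaped and plotted the same way as the counts.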

Thanks for the nice post. I have been interested in ways of visualizing likert items and have created an R package to analyze and visualize such items. You can find a brief overview here: http://jason.bryer.org/likert/

-Jason

I recently had a few posts on Likert scale visualization myself. I prefer a variation on the Diverging Stacked Bar: the Net-Stacked Likert.

I feel that including the neutral responses actually makes it harder to tell whether or not a question elicited a directional response because I’m not good at comparing size/area with offsets on both sides. Additionally, it makes it very difficult for me to quickly judge how much total positive (or negative) response there is. So I exclude the neutrals in almost all cases. My post here:

http://blog.jasonpbecker.com/blog/2012/07/10/ranked-likert-scale-visualization/

I’m going to take a look under the hood at Jason Bryer’s package. I’m really interested in how to better generalize my custom functions so I can be a bit more DRY across projects.

Readers of this post will find this 2011 article of interest: http://www.amstat.org/sections/srms/proceedings/y2011/Files/300784_64164.pdf

Naomi, thanks for posting that link. Your work on the HH package is greatly appreciated. The link provides some great information beyond what I’ve provided on plotting Likert scales.

Thank you for this post. It is very useful. I was looking around for R packages built specifically for Likert-type data, but the only one I found was one by Monash University, and I am not able to install it (it’s not available on CRAN and the RStudio installer says it is not a bin package). In any case, I found your post useful. Might as well use this instead of looking for a specific package. Thanks again!

After reading this, I noticed that a Likert scale can also have 4 response options, not necessarily 5 or 9. Correct?

That’s correct. Likert scales generally range anywhere from 2 to 10 response options. Whether the number of responses is odd or even (i.e., having a neutral middle point or not) is a different psychometric discussion.

What about cluster analysis with likert scale data? How could it be done?

You would use Latent Class Models. I have a pdf write up that I need to re-post as more of a blog post at http://statistical-research.com/latent-class-model/ that is a very basic example.

Is it possible to define a different colour scheme for each bar in diverging stacked bar charts done with HH (likert)?

Not that I’m aware of in the HH package. You can certainly change the global color scheme to any set of colors you want. Changing the color scheme for each of the three groups (in the example above) would probably make the graph more difficult to read, since sharing the same colors makes each group directly comparable and means you only need one legend to identify the categories.

An option could be to combine three distinct plots but use par(mfrow=c(3,1)) and only create one group per plot. This way you can stack the plots on top of each other to make them look like one continuous graph, and you can handle each plot any way you want.
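
As a rough illustration of the stacking idea with base graphics, the sketch below draws one horizontal stacked bar per panel, each with its own palette. The `counts` matrix and the palettes are hypothetical. Note that par(mfrow=...) only affects base graphics; lattice-based plots, such as those produced by likert() in *HH*, ignore it and would instead need to be positioned when printed, for example with the `split` and `more` arguments of lattice's print method.

```r
# Hypothetical counts: rows are groups, columns are response options 1-5
counts <- rbind(c(40, 7, 6, 7, 40),
                c(3, 2, 90, 3, 2),
                c(20, 20, 20, 20, 20))

# A different (arbitrary) five-color palette for each group
palettes <- list(heat.colors(5), terrain.colors(5), cm.colors(5))

# Three panels stacked vertically, with tighter margins
old.par <- par(mfrow = c(3, 1), mar = c(2, 4, 2, 1))
for (g in 1:nrow(counts)) {
  # One stacked horizontal bar per panel
  barplot(matrix(counts[g, ], ncol = 1), horiz = TRUE,
          col = palettes[[g]], main = paste("Group", g))
}
par(old.par)  # restore the previous layout
```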

Thanks, Wesley, for the advice regarding different colour schemes for bars in a likert graph done with HH.

How about QQ-plots of 5-point Likert scale data to test normality of the data using Stata? It seems weird, as the Likert scale generates discrete data and the normal distribution is continuous. However, I generated some plots that look like a staircase. My question now is: how do you interpret these graphs?

You would think pasting in your exact code would work. Nooooo, welcome to the pedantic world of R where learning is a well kept secret. What a stupid language – where the simple is absurdly complex.

require(grid)
require(lattice)
require(latticeExtra)
require(HH)
sgbar.likert <- sgbar[, 2:6]
likert(sgbar.likert,
       main = 'Example Diverging Stacked Bar Chart for Likert Scale',
       sub = "Likert Scale")
likert(sgbar.likert, horizontal = FALSE,
       aspect = 1.5,
       main = "Example Diverging Stacked Bar Chart for Likert Scale",
       auto.key = list(space = "right", columns = 1,
                       reverse = TRUE, padding.text = 2),
       sub = "Likert Scale")
likert(sgbar.likert,
       auto.key = list(between = 1, between.columns = 2),
       xlab = "Percentage",
       main = "Example Diverging Stacked Bar Chart for Likert Scale",
       BrewerPaletteName = "Blues",
       sub = "Likert Scale")