Graphs can be an important feature of an analysis. A well-designed graph can make summary statistics much more readable and easier to interpret. It also makes reports and articles look more professional.
There are many software packages available for designing great graphs and charts, and which one to use is a constant discussion I have with others. Some packages out there simply make a graph. You can even use something like the default Microsoft Excel or PowerPoint, but then you end up with an “Excel Graph”. Whenever I see one of those graphs, sadly, my knee-jerk reaction is to scrutinize the analysis behind it more closely. So there is something to be said for having professional-looking graphs. That said, given some time and work, I agree that Excel can create some attractive graphs.
There are a lot of packages out there, and trying to learn every one of them is almost impossible. My suggestion is to find one that produces high-quality graphs and stick with it. There are many commercial software packages, the most common being SPSS, SAS, and Minitab. There’s even Photoshop/Illustrator for those who want absolute control over a completely custom graph. Here I’ll highlight a few open source packages that I have used for various purposes.
1) R – This is, of course, a statistical analysis package, but it can also produce some very high-quality graphs. It will produce just about every graph imaginable, and if it doesn’t already do the graph you need, chances are you can write a function or package to do it. R graphs are built from the ground up, so an element generally appears in the graph only if you ask for it.
2) Gretl – This is a full-featured statistical software package that handles a handful of graphs. It’s specifically designed for economics; it performs general statistical tests quite easily and even handles iterative estimation (e.g. Monte Carlo). A review of the software package is available in the Journal of Statistical Software. The graphs are quick and easy to make, and there is a command line option for more control. The graphs are produced with gnuplot and are of high enough quality to be used in publications. The range of graphs is limited, but the ones that it does are pretty good: there’s the standard boxplot, scatterplot, and Q-Q plot.
3) Orange – I was working with several large datasets a few years ago and looked into using Orange. This is a very powerful data mining package, and it has a graphing module that produces some very good graphs. It covers the standard graphs, but its real focus is on data mining graphs and plots. These include hierarchical clustering plots and scatterplots, as well as a survey plot for identifying correlations between two variables.
4) GGobi – I was first introduced to GGobi back in about 2000. It is a great way to visualize multi-dimensional data. It can be a bit mind-boggling at times trying to wrap your head around a five-dimensional space being projected onto a two-dimensional plane. I’ve also used it to help spot outliers in spaces of four or more dimensions and simply to gain a better understanding of the data.
5) gnuplot – This is strictly a graphing program, and it does a good job of creating graphs. It will make a wide variety of them and has many built-in demos that can be easily modified and rerun on a new dataset. It also makes it really easy to rotate three-dimensional scatterplots by simply clicking and dragging the image. Here is an example from gnuplot.
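For anyone who wants to try the click-and-drag rotation, here is a minimal sketch of one way to set up a three-dimensional scatterplot for gnuplot. It is not one of gnuplot's built-in demos; it is a small Python script that only writes the input files. The data are random numbers made up for illustration, and the file names scatter3d.dat and scatter3d.gp and the helper functions make_points and gnuplot_script are my own assumptions, not anything gnuplot requires.

```python
import random
from pathlib import Path


def make_points(n=200, seed=1):
    """Return n made-up (x, y, z) points; z depends loosely on x and y."""
    rng = random.Random(seed)
    points = []
    for _ in range(n):
        x = rng.gauss(0, 1)
        y = rng.gauss(0, 1)
        z = 0.5 * x - 0.3 * y + rng.gauss(0, 0.5)
        points.append((x, y, z))
    return points


def gnuplot_script(data_file):
    """Build a short gnuplot script that draws a 3D scatterplot of data_file."""
    return "\n".join([
        'set xlabel "x"',
        'set ylabel "y"',
        'set zlabel "z"',
        # splot is gnuplot's 3D plotting command; columns 1, 2, 3 are x, y, z.
        f'splot "{data_file}" using 1:2:3 with points pointtype 7 title "sample data"',
        # Keep the plot window open until Enter is pressed in the terminal.
        'pause -1 "Press Enter to close"',
    ]) + "\n"


if __name__ == "__main__":
    data_file = "scatter3d.dat"  # hypothetical file name
    rows = ("{:.4f} {:.4f} {:.4f}".format(*p) for p in make_points())
    Path(data_file).write_text("\n".join(rows) + "\n")
    Path("scatter3d.gp").write_text(gnuplot_script(data_file))
```

After running the script, something like `gnuplot scatter3d.gp` should open the plot, assuming gnuplot is installed with an interactive terminal (such as wxt or qt). Dragging with the mouse should then rotate the point cloud.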
These are just a few of the software packages that I have used with success. There are many graphing packages out there, and I would be interested to hear about the programs that others have used to create graphs and charts.