## Brief Thoughts on Data Entry: Excel Should Not to Be Used

Over the years I have worked with many people who do the field work to collect data. They have their process and it works for them. However, their work rarely extends beyond data collection. This often means data is collected in Excel. If an observation meets a specific criterion then they are colored yellow, another criterion and they are blue.  If there was a problem with the observation then it is highlighted in bold. This is a very easy approach. It takes very little preparation and one can sling data around like it’s nobody’s business.

When analyzing the data dealing with this on a small scale usually isn’t too difficult. However, when working on a data warehouse sized dataset this becomes an extremely tedious task (if not impossible).  The data scrubbing and preparation can be extremely difficult.  Take for example a simple yes or no question.  Suppose there are four people are entering data about whether soil samples contain a specific contaminate.  Each person will begin entering their data and one person enters ‘Y’/’N’, another enters ‘Yes’/’No’, yet another enters ‘y’/’n’.  Worse yet the fourth person enters 1 for yes and 2 for no.  All of the sudden there are a total of eight codes when there should really only be two.

Solutions

So some extra work in advance.  It’s well worth the time spent to prepare for data collection and have the data entered cleanly rather than trying to make sense of messy data after the fact.

There are a number of solutions that will work depending on the needs of those entering the data.  I mention a few here

• Custom developed web application.  It take some work and potentially additional development costs but it can be completely customized to the project.  I have often taken this approach because it allows me to have a completely customized application for the job.
• Open source client software.  There are many open source software packages that can be used for data entry.  An example of this is EpiData or EpiInfo.
• Commercial software like Qualtrics.  It’s a little pricier but these software packages offer a lot of features.
• Worst case: use SurveyMonkey

## Y2K38: Our Own Mayan Calendar…Again

It’s not quite the end of the world as we know it.  We made it through December 21, 2012 unscathed. It’s not going to be the last time we will make it through such a pseudo-calamity.  After all we have built our own end of the world before (e.g. Y2K).

Next up January 19, 2038.  We’ve unknowingly made this the date that time will stop on all (well just 32-bit systems) computers.  Similar to the Y2K problem this is a problem due to the lack of available integers.  On January 19, 2038 3:14:07 (GMT) it is going to be exactly 2,147,483,647 seconds since the beginning of time (Unix epoch is January 1, 1970).  32-bit systems will only support 2,147,483,647 ($2^{32-1}-1$). It is 32-1 minus because the indexing begins at 0) as its largest integer and since Unix time is measured in seconds that means 2147483647 + 1 will generate an error and fail to calculate the correct date and time.  The most obvious solution will be to upgrade to 64-bit hardware.  This upgrade should buy us some more time and we won’t have to worry about another upgrade for another 9223372036854775808 seconds (Sunday, December 4, 292277026596).  However, since years are often still computed in 32-bit int (even in 64-bit systems) we’re going to loose roughly 290 millions years.  Ultimately, we’ll need to start worrying about an upgrade somewhere in the year 2147483647 (or later if the years were originally configured to start at year 1900).  Let’s hope by then someone can figure out how to deal with really big numbers.

Fortunately for users of R (and some other programs)  this is not necessarily an issue and there are ways around the maximum integer constraint problem. We’re not going to first encounter this problem in 2038. However, it’s possible that it has already exhibited itself in some software that needs to calculate future dates beyond January 19, 2038. An example would be if one needs a predictive time series model extending beyond January 19, 2038.

If you are on a 32-bit system you can give the following lines of code a try as a proof of concept:


as.POSIXlt( as.Date("1970-01-01") )+2147483647

>[1] "2038-01-19 03:14:07 UTC"

as.POSIXlt( as.Date("1970-01-01") )+2147483648

[1] "2038-01-19 03:14:08 UTC"



This can be extended even farther and it can be seen that the year is only limited by four digits.  After the year 9999-12-31 it becomes 0000-01-01


as.POSIXlt( as.Date("9999-12-31") )+86399

>[1] "9999-12-31 23:59:59 UTC"

as.POSIXlt( as.Date("9999-12-31") )+86401

>[1] "0000-01-01 00:00:01 UTC"



In the end it sounds like the system administrator for the Mayan calendar was probably just laid off during the very first financial crises and it was just long overdue for all of the software and hardware updates.

## True Significance of a T Statistic

The example is more of a statistical exercise that  shows the true significance and the density curve of simulated random normal data.  The code can be changed to generate data using either a different mean and standard deviation or a different distribution altogether.

This extends the idea of estimating pi by generating random normal data to determine the actual significance level. First, there is a simple function to calculate a pooled t statistic. Then repeat that function N times. The second part of the R code will produce a density graph of the t statistic (normalized to 1). By changing the distribution, mean, standard deviation one can see what happens to the true significance level and the density graph.  Just keep that in mind the next time someone assumes a standard normal distribution for hypothesis testing when it’s really a different distribution altogether and they are set on alpha = 0.05

###
#Function to calculate the t statistic
###
tstatfunc=function(x,y){
m <- length(x) n <- length(y) spooled <- sqrt(((m-1)*sd(x)^2+(n-1)*sd(y)^2)/(m+n-2)) tstat <- (mean(x)-mean(y))/(spooled*sqrt(1/m+1/n)) return(tstat) } ### #Set some constants ### alpha <- .05 m <- 15 n <- 15 N <- 20000 n.reject <- 0 ### #Iterate N times ### for (i in 1:N){ x <- rexp(m,1) y <- rexp(n,1) t.stat <- tstatfunc(x,y) if (abs(t.stat)>qt(1-alpha/2,n+m-2)){
n.reject <- n.reject+1 } } true.sig.level <- n.reject/N true.sig.level ### #Function to simulate t-statistic vector ### tsim=function(){ tstatfunc(rnorm(m,mean=0,sd=1), rnorm(n,mean=0,sd=1)) } ### #Set up the values to graph ### tstat.vec <- replicate(10000, tsim()) dens.y <- density(tstat.vec)$y/max(density(tstat.vec)$y) dens.x <- density(tstat.vec)$x ### #Graph the density for each ### plot( NULL, NULL, xlim=c(-5,5),ylim=c(0,1), main="Simulated and True Density" ) lines(dens.x,dens.y, lty=1, col=3, lwd=3) curve( dt(x,df=n+m-2)/max(dt(x,df=n+m-2)),add=TRUE ) [/sourcecode] ## Estimating Pi Recently I’ve been working on some jackknife and bootstrapping problems. While working on those projects I figured it would be a fun distraction to take the process and estimate pi. I’m sure this problem has been tackled countless times but I have never bothered to try it using a Monte Carlo approach. Here is the code that can be used to estimate pi. The idea is to generate a 2 by 2 box and then draw a circle inside of it. Then start randomly selecting points inside the box. This example uses 10,000 iterations using a sample size of 500. An interesting follow-up exercise would be to change the sample size, mean, and standard deviation to and see what it does to the jackknife variance. library(bootstrap); nsims <- 10000; size <- 500; alpha <- .05; theta <- function(x){sd(x)} pi.estimate <- function(nsims,size,alpha){ out <- as.data.frame(NA); call <- match.call(); pi.sim <- matrix(NA, nrow=nsims); for(i in 1:nsims){ x <- runif(size,-1,1); y <- runif(size,-1,1); radius <- -1; x.y <- cbind(x,y); d.origin <- sqrt(x^2+y^2); x.y.d <- cbind(x.y,d.origin); est.pi <- 4*length(x.y.d[,3][x.y.d[,3]<1])/ length(x.y.d[,3]) pi.sim[i] <- est.pi; } pi.mean <- mean(pi.sim); jk <- jackknife(c(pi.sim), theta); jk.me <- jk$jack.se*qnorm(1-alpha/2); pi.mean <- pi.mean; pi.se <- jk$jack.se; jk.me <- jk.me; pi.me.u <- pi.mean+jk.me; pi.me.l <- pi.mean-jk.me; return(list(pi.mean=pi.mean, pi.se=pi.se,pi.me.u=pi.me.u,pi.me.l=pi.me.l, call=call)); } pi.estimate(nsims,size,alpha); [/sourcecode] ## Mean Value from Grouped Data Occasionally, I will get requests from clients to calculate the mean. Most of the time it’s a simple request but from time-to-time the data was originally from grouped data. A common approach is to take the midpoint of each of the groups and just assume that all respondents within that group average out to the midpoint. The biggest problem I have run into has been trying to decide what to do with two extremes. What does one do with the age ’65+’ category or the ‘over$100,000’ category.  Since there is really no mid-point one must make a (somewhat arbitrary) determination on what value to use for those categories.  Over the years I have toyed around with various approaches but have found using this Monte Carlo approach has worked the best and seems to be most defensible.  In this example I use a sample of voters who were asked their age and were given categorical response options. The final deliverable will be the average age and standard deviation.

 Group Frequency 18 to <30 25 30 to <45 43 45 to <60 42 60+ 31

I have written some code that I have traditionally used for this type of work but have since found that the LearnBayes package contains the necessary functions as well.  For the purpose of making this code usable in other settings I have combined some of the code I have written with some of the functions from the LearnBayes package.  This way I have been able to modify some of the algorithms and graphs to work under a variety of settings.

I welcome any comment on other approaches people have taken to solve similar problems using grouped, categorical data.

###Note that this function is also available through the VGAM package.
###However, it is displayed here as part of the example.
library(graphics);

## Plotting Likert Scales

Graphs can provide an excellent way to emphasize a point and to quickly and efficiently show important information. Sadly, poor graphs can be a good way to waste space in an article, take up time in a presentation, and waste a lot of ink all while providing little to no information.

Excel has made it possible to make all sort of graphs. However, just because the graph looks like a spider web or like something you can eat for dessert doesn’t mean you should use it.

This discussion here will show five options on how to graph Likert scale data, will show best/common practice for graphing, and will provide the R code for each graph. These graphing approaches are based on a list that I have compiled that the different people that I have worked with have used to graph and interpret Likert scales within their organization.

Likert scales usually have 5 or 7 response options. However, the exact number and whether there should be an odd or even number of responses is a topic for another psycometric discussion. A typical Likert scale is:

1 Strongly Agree
2 Agree
3 Neutral
4 Disagree
5 Strongly Disagree

For example purposes I generated some random discrete data that is formatted as a Likert scale.   I have created three examples to show the extremities of Likert scale responses.


set.seed(1234)
library(e1071)
probs <- cbind(c(.4,.2/3,.2/3,.2/3,.4),c(.1/4,.1/4,.9,.1/4,.1/4),c(.2,.2,.2,.2,.2))
my.n <- 100
my.len <- ncol(probs)*my.n
raw <- matrix(NA,nrow=my.len,ncol=2)
raw <- NULL
for(i in 1:ncol(probs)){
raw <- rbind(raw, cbind(i,rdiscrete(my.n,probs=probs&#91;,i&#93;,values=1:5)))
}

r <- data.frame( cbind(
as.numeric( row.names( tapply(raw&#91;,2&#93;, raw&#91;,1&#93;, mean) ) ),
tapply(raw&#91;,2&#93;, raw&#91;,1&#93;, mean),
tapply(raw&#91;,2&#93;, raw&#91;,1&#93;, mean) + sqrt( tapply(raw&#91;,2&#93;, raw&#91;,1&#93;, var)/tapply(raw&#91;,2&#93;, raw&#91;,1&#93;, length) ) * qnorm(1-.05/2,0,1),
tapply(raw&#91;,2&#93;, raw&#91;,1&#93;, mean) - sqrt( tapply(raw&#91;,2&#93;, raw&#91;,1&#93;, var)/tapply(raw&#91;,2&#93;, raw&#91;,1&#93;, length) ) * qnorm(1-.05/2,0,1)
))
names(r) <- c("group","mean","ll","ul")

gbar <- tapply(raw&#91;,2&#93;, list(raw&#91;,2&#93;, raw&#91;,1&#93;), length)

sgbar <- data.frame( cbind(c(1:max(unique(raw&#91;,1&#93;))),t(gbar)) )

sgbar.likert<- sgbar&#91;,2:6&#93;

&#91;/sourcecode&#93;

<strong>Diverging Stacked Bar Chart</strong>

Diverging stacked bar charts are often the best choice when visualizing Likert scale data.  There are various ways to produce these graphs but I have found the easiest approach uses the <em>HH </em>package.  There are many graphs that can be produced using this package. I have provided three approaches here.

require(grid)
require(lattice)
require(latticeExtra)
require(HH)
sgbar.likert<- sgbar&#91;,2:6&#93;
likert(sgbar.likert,
main='Example Diverging Stacked Bar Chart for Likert Scale',
sub="Likert Scale")

likert(sgbar.likert, horizontal=FALSE,
aspect=1.5,
main="Example Diverging Stacked Bar Chart for Likert Scale",
auto.key=list(space="right", columns=1,
sub="Likert Scale")

likert(sgbar.likert,
auto.key=list(between=1, between.columns=2),
xlab="Percentage",
main="Example Diverging Stacked Bar Chart for Likert Scale",
BrewerPaletteName="Blues",
sub="Likert Scale")

&#91;/sourcecode&#93;
<p style="text-align: center;"><a href="http://statistical-research.com/wp-content/uploads/2012/12/divergingstacked_h.likert.png"><img class="aligncenter  wp-image-859" title="Diverging Stacked Bar Chart for Likert Scale" src="http://statistical-research.com/wp-content/uploads/2012/12/divergingstacked_h.likert.png" alt="" width="452" height="256" /></a></p>
<p style="text-align: center;"><a href="http://statistical-research.com/wp-content/uploads/2012/12/divergingstacked_v.likert.png"><img class="aligncenter  wp-image-860" title="Diverging Stacked Bar Chart (Vertical) for Likert Scale" src="http://statistical-research.com/wp-content/uploads/2012/12/divergingstacked_v.likert.png" alt="" width="502" height="284" /></a></p>
<p style="text-align: center;"><a href="http://statistical-research.com/wp-content/uploads/2012/12/divergingstacked_h_blue.likert.png"><img class="aligncenter  wp-image-861" title="Diverging Stacked Bar Chart for Likert Scale Using Blue Tones" src="http://statistical-research.com/wp-content/uploads/2012/12/divergingstacked_h_blue.likert.png" alt="" width="452" height="256" /></a></p>
<strong>Mean Value</strong>

Often researchers will simply take each response option and interpret it as a real number.  Using this approach makes it very convenient to calculate the mean value and standard deviation (and confidence intervals). This is particularly useful when working with non-analytical clients.  However, this is a controversial issue. Taking this approach requires a lot of statistical assumptions that may not be correct. For starters the response options really need to be equidistant from each other. For example, is the distance from <em>Strongly Agree</em> to <em>Agree</em> the same distance from <em>Disagree</em> to <em>Strongly Disagree</em>? This may be true for one question but it might not be true for all questions on a questionnaire. Different questions and question wording are quite likely going to have different distributions. Furthermore, confidence intervals require normality assumptions which may also be incorrect.

What makes matters worse is that if there are 50 respondents and 25 of them mark <em>Strongly Disagree</em> and 25 of them mark <em>Strongly Agree</em> then the mean will be 3 implying that, on average, the results are neutral.  But that clearly does not adequately describe the data. Ultimately, it is up to the statistician to work with the client to use an appropriate method that appropriately conveys the message and that both parties can agree upon.

plot.new(); par(mfrow=c(1,1));

plot(r$group,r$mean, type="o", cex=1, col="blue", pch=16,
ylim=c(1,5), lwd=2,
, ylab="Mean Value", xlab="Group"
, main=paste("Likert Scale Mean Values Example")
, cex.sub=.60
, xaxt = "n", yaxt = "n");
axis(1, at=(1:3), tcl = -0.7, lty = 1, lwd = 0.8, labels=TRUE)
axis(2, at=(1:5), labels=TRUE, tcl = -0.7, lty = 1)
abline(h=c(1:5), col="grey")
lines(r$group,r$ll, col='red', lwd=2)
lines(r$group,r$ul, col='red', lwd=2)

legend("topright", c("Mean","Confidence Interval"),
col=c('blue','red'), title="Legend",
lty=1, lwd=2,
inset = .05)



Pies and Multiple Pies

A table is nearly always better than a dumb pie chart; the only worse design than a pie chart is several of them, for then the viewer is asked to compare quantities located in spatial disarray both within and between charts (…). Given their low density and failure to order numbers along a visual dimension, pie charts should never be used. — Edward Tufte

Pie charts are notoriously difficult to convey the information that was intended.  As far as pie charts go I don’t ever use them.  There are far better ways to visualize data.  However, I have heard some people give a reason for using them that are somewhat justified and generally are based on the ‘eye candy’ argument.  But as far as creating a graph that both provides information and looks good a 3-D pie chart is probably not the best choice.  I debated whether I should even include the R code for the example but to provide full disclosure here’s the code.


my.table <- table(raw&#91;,2&#93;&#91;raw&#91;,1&#93;==1&#93;)

names(my.table) <- c("Strongly Agree","Agree","Neutral","Disagree","Strongly Disagree")
labl <- paste(names(my.table), "\n", my.table, sep="")
pie(my.table, labels=labl, main="Example Pie Chart of Likert Scale")

&#91;/sourcecode&#93;
<p style="text-align: center;"><a href="http://statistical-research.com/wp-content/uploads/2012/12/pie.likert.png"><img class="aligncenter  wp-image-852" title="Pie Chart for Likert Scale" src="http://statistical-research.com/wp-content/uploads/2012/12/pie.likert.png" alt="" width="420" height="262" /></a></p>

plot.new()
num.groups <- length(unique(raw&#91;,1&#93;))
par(mfrow=c(1,num.groups))
for(j in 1:num.groups){
my.table <- table(raw&#91;,2&#93;&#91;raw&#91;,1&#93;==j&#93;)
pie(my.table, labels=labl, main=paste("Example Pie Chart of\nLikert Scale Group ", j))
}

&#91;/sourcecode&#93;
<p style="text-align: center;"> <a href="http://statistical-research.com/wp-content/uploads/2012/12/piemulti.likert1.png"><img class="aligncenter  wp-image-858" title="Multiple Pie Charts for Likert Scale" src="http://statistical-research.com/wp-content/uploads/2012/12/piemulti.likert1.png" alt="" width="502" height="284" /></a></p>

library(plotrix)
slices <- my.table
names(my.table) <- c("Strongly Agree","Agree","Neutral","Disagree","Strongly Disagree")
labl <- paste(names(my.table), "\n", my.table, sep="")
pie3D(slices,labels=labl,explode=0.1,
main="3D Pie Chart Example")

&#91;/sourcecode&#93;
<p style="text-align: center;"><a href="http://statistical-research.com/wp-content/uploads/2012/12/pie3d.likert.png"><img class="aligncenter  wp-image-855" title="3D Pie Chart for Likert Scale" src="http://statistical-research.com/wp-content/uploads/2012/12/pie3d.likert.png" alt="" width="420" height="300" /></a></p>
<strong>Grouped Bar Chart</strong>

This is a nice approach when wanting to look at each group and highlight any particular Likert response option.  Here it is easy to see that in Group 2 the <em>Neutral </em>option is by far the most common response.

par(mfrow=c(1,1))
barplot(gbar, beside=T, col=cm.colors(5), main="Example Bar Chart of Counts by Group",xlab="Group",ylab="Frequency")

legend("topright", names(my.table),
col=cm.colors(5), title="Legend",
lty=1, lwd=15,
inset = .1)



Divided Bar Chart

This isn’t a bad approach and quite similar to the diverging stacked bar chart.  This approach shows the stacked percent for each category.

library(ggplot2)
library(reshape2)

names(sgbar) <- c("group","Strongly Agree","Agree","Neutral","Disagree","Strongly Disagree") mx <- melt(sgbar, id.vars=1) names(mx) <- c("Group","Category","Percent") ggplot(mx, aes(x=Group, y=Percent, fill=Category)) + geom_bar(stat="identity") [/sourcecode]

## Power Analysis and the Probability of Errors

Power analysis is a very useful tool to estimate the statistical power from a study. It effectively allows a researcher to determine the needed sample size in order to obtained the required statistical power. Clients often ask (and rightfully so) what the sample size should be for a proposed project. Sample sizes end up being a delicate balance between the amount of acceptable error, detectable effect size, power, and financial cost. A lot of factors go into this decision. This example will discuss one approach.

In more specific terms power is the probability that a statistical test will reject the null hypothesis when the null hypothesis is truly false. What this means is that when power increases the probability of making a Type II error decreases. The probability of a Type II error is denoted by $\beta$ and power is calculated as $power = 1 - \beta$.

In order to calculate the probability of a Type II error a researcher needs to know a few pieces of information $\mu$, $\sigma^2$, $n$, and $\alpha$ (probability of a Type I error). Normally, if a researcher already knows the population mean ($\mu$) and variance ($\sigma^2$) there is no need to take a sample to estimate them. However, we can set it up so we can look at a range of possible unknown population means and variances to see what the probability of a Type II error is for those values.

The following code shows a basic calculation and the density plot of a Type II error.

$P\left(X>38| \mu,\sigma^2 \right)=P\left(Z>\frac{38-50}{10}\right)=P\left(Z>-1.2\right)=.8849303$

my.means = c(38,50);
my.sd = c(10,10);

X = seq(my.means[1]-my.sd[1]*5,my.means[1]+my.sd[1]*5, length=100);
dX = dnorm(X, mean=my.means[1], sd=my.sd[1]); #True distribution
dX = dX/max(dX);

X.2 = seq(my.means[2]-my.sd[2]*5,my.means[2]+my.sd[2]*5, length=100);
dX.2 = dnorm(X.2, mean=my.means[2], sd=my.sd[2]); #Sampled distribution
dX.2 = dX.2/max(dX.2);

plot(X, dX, type="l", lty=1, lwd=2, xlab="x value",
ylab="Density", main="Comparison of Type II error probability",
xlim=c(min(X,X.2),max(X,X.2)))
lines(X.2,dX.2, lty=2)
abline(v=my.means, lty=c(1,2));
legend("topleft", c("Sample","Type II Error"), lty=c(2,1), col=c("black","black"), lwd=2, title=expression(paste("Type II Error")))

x = (my.means[1]-my.means[2])/my.sd[2];
1-pnorm(x);

[1] 0.8849303



However, there is a more direct way to determine necessary power and the Type II error calculation using R by setting up a range of sample sizes. Often the mean and variance for power analysis are established through a pilot study or other preliminary research on the population of interest.

R has several convenient functions to calculate power for many different types of tests and designs.  For this example power.anova.test will be used for demonstration purposes.  However, power.t.test could have been used as well.

For each of these functions there are varying values that need to be supplied however they generally involve effect size (difference in group means), sample size, alpha level, measure of variability and power.  In this example the within variability is estimated by using the MSE from the ANOVA table and the between variability is estimated from the variance of group means.  Otherwise, three different alpha levels are measured over a range of 50 possible n’s.


set.seed(1234);
x = matrix(NA,nrow=200,ncol=2);
x[,1]= rbind(rep(1,100),rep(2,100))
x[,2]= rbind(rnorm(100,23,20),rnorm(100,7,20));
x=data.frame(x); names(x) = c("group","value");
groupmeans = as.matrix(by(x$value,x$group,mean));
x.aov = summary(aov(x$value ~ as.factor(x$group)))

nlen = 50
withinvar = 470; ##From MSE in ANOVA
raw.pwr = matrix(NA, nrow=nlen, ncol=5)
for(i in 1:nlen){
pwr.i10 = power.anova.test(groups=length(groupmeans), n=1+i,
between.var=var(groupmeans), within.var=withinvar, sig.level=.10)
pwr.i05 = power.anova.test(groups=length(groupmeans), n=1+i,
between.var=var(groupmeans), within.var=withinvar, sig.level=.05)
pwr.i01 = power.anova.test(groups=length(groupmeans), n=1+i,
between.var=var(groupmeans), within.var=withinvar, sig.level=.01)
power10 = pwr.i10$power power05 = pwr.i05$power
power01 = pwr.i01\$power
raw.pwr[i,1] = power10
raw.pwr[i,2] = power05
raw.pwr[i,3] = power01
raw.pwr[i,5] = 1+i
}

plot(raw.pwr[,5], raw.pwr[,1], type="n", ylim=c(0,1),
ylab="Power",xlab="Replicates (sample size per group)",
main=expression(paste("Power Analysis: 2 Tx Groups for ", alpha, "=.10, .05, .01"))
, sub=paste("Within var=",withinvar,"; Between var=", round(var(groupmeans)),2), cex.sub=.75
)
lines(raw.pwr[,5], raw.pwr[,1], type="l", lty=1, lwd=2, col="blue")
lines(raw.pwr[,5], raw.pwr[,2], type="l", lty=2, lwd=2, col="red")
lines(raw.pwr[,5], raw.pwr[,3], type="l", lty=3, lwd=2, col="green")
abline(h=seq(.1:.9, by=.1), col="lightgrey", lty=2); abline(h=.9, col=1, lty=1);
abline(v=c(10,20,30,40), lty=2, col="lightgrey")
abline(v=raw.pwr[,5][round(raw.pwr[,1],2)==.90][1]); text(raw.pwr[,5][round(raw.pwr[,1],2)==.90][1]-2.5, 0, paste("n: ",raw.pwr[,5][round(raw.pwr[,1],2)==.90][1]));
legend("bottomright", c(".10",".05",".01"), lty=c(1,2,3), col=c("blue","red","green"), lwd=2, title=expression(paste(alpha, " levels")))



This graph shows what the power will be at a variety of sample sizes. In this example to obtain a power of 0.90 ($\alpha=0.10$) a sample of size 23 (per group) is needed.  So that will be a total of 46 observations.  It’s then up to the researcher to determine the appropriate sample size based on needed power, desired effect size, $\alpha$ level, and cost.

There is no real fixed standard for power. However, 0.8 and 0.9 are often used. This means that the probability of a Type II error is 0.2 and 0.1, respectively. But it really comes down to whether the researcher is willing to accept a Type I error or a Type II error. For example, it’s probably better to erroneously have a healthy patient return for a follow-up test than it is to tell a sick patient they’re healthy.