# Random Sequence of Heads and Tails: For R Users

By | October 10, 2013

Rick Wicklin on the SAS blog made a post today on how to tell if a sequence of coin flips were random.  I figured it was only fair to port the SAS IML code over to R.  Just like Rick Wicklin did in his example this is the Wald-Wolfowitz test for randomness.  I tried to match his code line-for-line.

```flips = matrix(c('H','T','T','H','H','H','T','T','T','T','T','T','T','H','H','H','T','H','T','H','H','H','T','H','H','H','T','H','T','H'))

RunsTest = function(flip.seq){
u = unique(flip.seq) # unique value (should be two)

d = rep(-1, nrow(flip.seq)*ncol(flip.seq)) # recode as vector of -1, +1
d[flip.seq==u] = 1

n = sum(d > 0) # count +1's
m = sum(d < 0) # count -1's

dif = c(ifelse(d < 0, 2, -2), diff( sign(d) )) # take the lag and find differences

R = sum(dif==2 | dif==-2) # count up the number of runs

ww.mu = 2*n*m / (n+m) + 1 # get the mean
ww.var = (ww.mu-1)*(ww.mu-2)/(n+m-1) # get the variance
sigma = sqrt(ww.var) # standard deviation

# compute test statistics
if((n+m) > 50){
Z  = (R-ww.mu) / sigma
} else if ((R-ww.mu) < 0){
Z = (R-ww.mu+0.5) / sigma
} else {
Z = (R-ww.mu-0.5)/sigma
}

pval = 2*(1-pnorm(abs(Z))) # compute a two-sided p-value

ret.val = list(Z=Z, p.value=pval)
return(ret.val)
}

runs.test = RunsTest(flips)
runs.test

> runs.test
\$Z
 -0.1617764

\$p.value
 0.8714819

```
Category: Uncategorized

## 6 thoughts on “Random Sequence of Heads and Tails: For R Users”

1. Rick Wicklin

Nice. I often convert programs from R to SAS/IML. The languages are similar enough that most computations port with minimal effort, as you’ve shown here.

By typing
runs Test in R
into a search engine, I found at least two R packages that support the runs test. Searching for “Wald-Wolfowitz test” might give more. Best wishes!

2. Carl Witthoft

Well… Define “random.” You guys are checking that it fits a binomial, but you failed to use the critical word “fair” to define your coin. Ask any mathematician — he’ll tell you that an unfair coin can do anything; similarly tests with actual coins show a bias because they are not perfectly symmetric (center of mass and all that). So “true random” might require fitting to the expected outcome of a Not_Fair coin.
(BTW, my captcha was (?)*9 = 72, which based on HHGTTG should be more like 10.3 than 8 🙂 )

3. Owe

I run into a little problem: After changing ret.val = pval and replicating i get a nearly uniform distribution of p-values. What went wrong (assuming that the sampler is random)

loops = 1000
mcRes = rep(0, loops)
flips= matrix(sample(c(“H”,”T”), 1000*loops, T), ncol=1000)
for(iter in 1:loops)
{
mcRes[iter] = RunsTest(matrix(flips[iter,]))
}

summary(mcRes)

4. Owe

ps: summary(mcRes)
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.0007974 0.2688000 0.5321000 0.5145000 0.7565000 0.9999000

5. Dale

This is screaming out for an application of the “rle” (run length encoding) function. Especially so considering the ‘number of runs’ is explicitly mentioned. From a quick search, it doesn’t look like SAS IML has an “rle” function, which might explain why it wasn’t used in the original code.

1. Carl Witthoft

Yes! My favorite R-function! So much so that (shameless plug) I published a version which finds linear sequences as well as runs. See cgwtools::seqle at CRAN