Random Sequence of Heads and Tails: For R Users

Posted on October 10, 2013October 10, 2013 by Clint Stevenson

Rick Wicklin on the SAS blog made a post today on how to tell if a sequence of coin flips were random. I figured it was only fair to port the SAS IML code over to R. Just like Rick Wicklin did in his example this is the Wald-Wolfowitz test for randomness. I tried to match his code line-for-line.

flips = matrix(c('H','T','T','H','H','H','T','T','T','T','T','T','T','H','H','H','T','H','T','H','H','H','T','H','H','H','T','H','T','H'))

RunsTest = function(flip.seq){
  u = unique(flip.seq) # unique value (should be two)
  
  d = rep(-1, nrow(flip.seq)*ncol(flip.seq)) # recode as vector of -1, +1
  d[flip.seq==u[1]] = 1
  
  n = sum(d > 0) # count +1's
  m = sum(d < 0) # count -1's
  
  dif = c(ifelse(d&#91;1&#93; < 0, 2, -2), diff( sign(d) )) # take the lag and find differences
  
  R = sum(dif==2 | dif==-2) # count up the number of runs
  
  ww.mu = 2*n*m / (n+m) + 1 # get the mean
  ww.var = (ww.mu-1)*(ww.mu-2)/(n+m-1) # get the variance
  sigma = sqrt(ww.var) # standard deviation
  
  # compute test statistics
  if((n+m) > 50){
    Z  = (R-ww.mu) / sigma
  } else if ((R-ww.mu) < 0){
    Z = (R-ww.mu+0.5) / sigma
  } else {
    Z = (R-ww.mu-0.5)/sigma
  }
  
  
  pval = 2*(1-pnorm(abs(Z))) # compute a two-sided p-value 
  
  ret.val = list(Z=Z, p.value=pval)
return(ret.val)
}

runs.test = RunsTest(flips)
runs.test


> runs.test
$Z
[1] -0.1617764

$p.value
[1] 0.8714819

6 replies on “Random Sequence of Heads and Tails: For R Users”

Rick Wicklin says:

October 11, 2013 at 8:13 am

Nice. I often convert programs from R to SAS/IML. The languages are similar enough that most computations port with minimal effort, as you’ve shown here.

By typing
runs Test in R
into a search engine, I found at least two R packages that support the runs test. Searching for “Wald-Wolfowitz test” might give more. Best wishes!

Reply
Carl Witthoft says:

October 11, 2013 at 9:35 am

Well… Define “random.” You guys are checking that it fits a binomial, but you failed to use the critical word “fair” to define your coin. Ask any mathematician — he’ll tell you that an unfair coin can do anything; similarly tests with actual coins show a bias because they are not perfectly symmetric (center of mass and all that). So “true random” might require fitting to the expected outcome of a Not_Fair coin.
(BTW, my captcha was (?)*9 = 72, which based on HHGTTG should be more like 10.3 than 8 🙂 )

Reply
Owe says:

October 12, 2013 at 8:12 am

I run into a little problem: After changing ret.val = pval and replicating i get a nearly uniform distribution of p-values. What went wrong (assuming that the sampler is random)

loops = 1000
mcRes = rep(0, loops)
flips= matrix(sample(c(“H”,”T”), 1000*loops, T), ncol=1000)
for(iter in 1:loops)
{
mcRes[iter] = RunsTest(matrix(flips[iter,]))
}

summary(mcRes)

Reply
Owe says:

October 12, 2013 at 8:13 am

ps: summary(mcRes)
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.0007974 0.2688000 0.5321000 0.5145000 0.7565000 0.9999000

Reply
Dale says:

October 14, 2013 at 2:04 am

This is screaming out for an application of the “rle” (run length encoding) function. Especially so considering the ‘number of runs’ is explicitly mentioned. From a quick search, it doesn’t look like SAS IML has an “rle” function, which might explain why it wasn’t used in the original code.

Reply
1. Carl Witthoft says:
  
  October 14, 2013 at 9:57 am
  
  Yes! My favorite R-function! So much so that (shameless plug) I published a version which finds linear sequences as well as runs. See cgwtools::seqle at CRAN
  
  Reply