Tuesday, September 4, 2012

What is the Probability of Getting (k) Heads in a Row for (n) Consecutive Tosses?

I asked myself a fun question after reading a post on QuantNet. What are the odds of getting two, four, or six heads after five, ten, or a hundred consecutive tosses of a fair coin? It seemed like a fun high school leveled math problem and with some quick python I was able to generate a pretty graph to answer this question.


My approach to this problem is actually quite brutish because of the tricky counting. You cannot assume that the probability at each step in the sequence is identical. Even though the inherent probability of the fair coin is still 0.5, the sequences are not completely independent due to causality. Here is a quick demonstration for counting two heads out of five tosses to illustrate this point.

Possible Outcomes

Toss 1 H T
Toss 2 HH
TH
HT
TT
Toss 3 HHH
HTH
THH
TTH
HHT
HTT
THT
TTT
Toss 4 HHHH
HHTH
HTHH
HTTH
THHH
THTH
TTHH
TTTH
HHHT
HHTT
HTHT
HTTT
THHT
THTT
TTHT
TTTT
Toss 5 HHHHH
HHHTH
HTHHH
HHHTT
HHTTH
HTHTH
THHHT
THTHH
HHTTT
HTTHT
THHTT
THTTH
TTHTH
HTTTT
TTHTT
TTTTH
HHHHT
HHTHH
THHHH
HHTHT
HTHHT
HTTHH
THHTH
TTHHH
HTHTT
HTTTH
THTHT
TTHHT
TTTHH
THTTT
TTTHT
TTTTT

The strikethroughs are the outcomes that have been eliminated because they have previously yielded to heads in a row. The highlighted sequences are the outcomes that can successfully produce two heads on that toss. Using the example count for a sequence of two heads out of three tosses, on the third toss there is one possibility of producing a sequence with two consecutive heads (instead of three) and there are six possible outcomes (instead of eight). The differences derives from the first possible successful sequencing of two heads in a row from the second toss.

This counting was performed for the sequences of three and four heads in a row. The results are summarized in the tables below, along with the sequence of two heads in a row.

Two Heads in a Row
Toss 1 Toss 2 Toss 3 Toss 4 Toss 5 Toss 6
Successes 0 1 1 2 3 5
Failures 2 3 5 8 13 21
Outcomes 2 4 6 10 16 26

Three Heads in a Row
Toss 1 Toss 2 Toss 3 Toss 4 Toss 5 Toss 6 Toss 7
Successes 0 0 1 1 2 4 7
Failures 2 4 7 13 24 44 81
Outcomes 2 4 8 14 26 48 88

Four Heads in a Row
Toss 1 Toss 2 Toss 3 Toss 4 Toss 5 Toss 6 Toss 7 Toss 8
Successes 0 0 0 1 1 2 4 8
Failures 2 4 8 15 29 56 108 208
Outcomes 2 4 8 16 30 58 112 216

Here is the elegance that I miss about high school mathematics. You have probably already noticed that the successes, the failures, and the outcomes are expressible as generalized fibonacci numbers. The successes for the sequence involving two heads in a row are the well-known Fibonacci number. The successes for the sequence involving three heads in a row are the lesser known tribonacci numbers. The successes for the sequence involving three heads in a row are the almost unknown tetranacci numbers. These high order sequences can be recursively expressed with n as number of tosses, k as the number of heads in a row, and for k > 2 as:

\[Fib_{k}=F_{k}^{(n)}=\sum_{i=1}^{n} F_{k-i}^{(n)}\]
Then to calculate the probability of success for k heads in a row at the nth toss would be:

\[probability_n=1-\frac{Fib_{k}(n+k)}{2 \times Fib_{k}(n-1+k)}\]
To calculate the total probability of seeing k heads in a row after n tosses would be:

\[probability=1-\frac{Fib_{k}(1+k)}{2 \times Fib_{k}(1-1+k)} \times \frac{Fib_{k}(2+k)}{2 \times Fib_{k}(2-1+k)} \times \frac{Fib_{k}(3+k)}{2 \times Fib_{k}(3-1+k)} \ldots \times \frac{Fib_{k}(n+k)}{2 \times Fib_{k}(n-1+k)}\]
This result is readily translated into python to produce the pretty graph at the beginning of the post. The code is presented below:
from decimal import Decimal # function for generalized fibonacci sequences def Fib(n,k): if n < k-1: return 0 elif n == k-1: return 1 else: f =[0] * (n+1) for i in xrange(k-1): f[i] = 0 f[k-1] = 1 for i in xrange(k,n+1): a = 0 for j in xrange(k+1): a += f[i-j] f[i] = a return f[n] # function to calculate failures/outcomes def ProbHnFail(n,k): b = 2*Fib(n-1+k,k) # number of outcomes if n < k: a = b else: a = Fib(n+k,k) # number of failures if a == 0: a = b p = (Decimal(a)/Decimal(b)) return p # calculates and plots probabilities import numpy as np Tosses = 100 t1 = np.arange(1, Tosses+1, 1) H2=[] for x in t1: y = ProbHnFail(x,2) H2.append(y) a = 1-np.cumprod(H2) H3=[] for x in t1: y = ProbHnFail(x,4) H3.append(y) b = 1-np.cumprod(H3) H4=[] for x in t1: y = ProbHnFail(x,6) H4.append(y) c = 1-np.cumprod(H4) import matplotlib.pyplot as plt fig = plt.figure() ax = fig.add_subplot(111) ax.plot(t1,a, t1,b, t1,c) ax.set_ylim([0,1.01]) ax.grid(True) ax.legend(('HH','HHHH','HHHHHH'), 'best') ax.set_xlabel('Number of Consecutive Tosses') ax.set_ylabel('Probability of Appearance') ax.set_title('Probability after n-Tosses') plt.show()
Happy flipping!

No comments:

Post a Comment