Tuesday, August 19, 2008

HTH or HTT?

Assuming we've got a long random string composed of the letters H and T, and only those two letters, which of the patterns "HTH" or "HTT" (overlap not allowed) do you think will occur more frequently?

For example, if overlap is not allowed, in the string "HTHTHTTH", both "HTH" and "HTT" occur once. However, if we allow for overlap, then "HTH" occurs twice and "HTT" once.

So, the question is, which of the patterns "HTH" or "HTT" (without overlap) do you think will occur more frequently? Will "HTH" occur at a higher, the same or lower frequency than "HTT".

You must give an answer.

To find out the answer run the Simulation application written in Java.

This question is an adaptation of the question asked by Peter Donnelly in his TED presentation on "How juries are fooled by statistics."

6 Comments:

OpenID stephanetavera said...

Some ruby guru will probably make a fool of the following code, but here it is :

class String
def count_occur(pattern)
self.scan(pattern).size
end
end

len = 1e5

10.times do
chain=Array.new(len){rand>=0.5?'H': 'T'}.to_s
puts [/HTH/, /HTT/].map{|pat|
"#{pat.inspect} -> #{chain.count_occur(pat)}" }.join(',')
end

I vote for HTT :

/HTH/ -> 9726,/HTT/ -> 12628
/HTH/ -> 10041,/HTT/ -> 12462
/HTH/ -> 10125,/HTT/ -> 12323
/HTH/ -> 9988,/HTT/ -> 12534
/HTH/ -> 9857,/HTT/ -> 12592
/HTH/ -> 10026,/HTT/ -> 12499
/HTH/ -> 10183,/HTT/ -> 12418
/HTH/ -> 10079,/HTT/ -> 12364
/HTH/ -> 10093,/HTT/ -> 12402
/HTH/ -> 9987,/HTT/ -> 12531

11:15 AM  
Blogger Ceki said...

And when overlap is allowed?

12:40 PM  
OpenID stephanetavera said...

class String
def count_occur(expression, overlap)
expression = '(?=(' + expression + '))' if overlap
pattern = Regexp.new(expression)
self.scan(pattern).size
end
end

len = 1e5

def show(chain, expressions, overlap)
expressions.map{|expr|
"#{expr} -> #{chain.count_occur(expr, overlap)}" }.join(',')
end

10.times do
chain=Array.new(len){rand>=0.5?'H': 'T'}.to_s
expressions = ['HTH', 'HTT']
puts "no_overlap = #{show(chain, expressions, false)}"
puts "overlap = #{show(chain, expressions, true)}"
end

1:49 PM  
Blogger Ceki said...

Ruby is sure more succinct than Java which, by common sense, should contribute to better legibility. On the other hand, I must say that I did not find the Ruby version more legible, because 1) of bad indentation 2) I don't know Ruby, none of which is of course Ruby's fault.

2:09 PM  
OpenID stephanetavera said...

The fact that ruby code is shorter than java code is only one of the reason I enjoy this language so much.
You shouldn't forget dynamic typing, meta-programming possibilities, open classes, ...
Yes, brevity can lead to ugliness. (From my small experience, ruby encourages you to write readable code in many ways).
For 2), it's your fault ;-) : Give it a try !

5:33 PM  
Blogger Ceki said...

Most of the problems that really interest me have to do with data structures, and Java is good enough a language for their implementation. So, I don't feel the need to try other languages.

6:35 PM  

Post a Comment

<< Home