Michiel Van Bel wrote:
Michiel Van Bel wrote:
  
Frederik Delaere wrote:
  
    
Michiel Van Bel wrote:
  
    
      
Okay, my statistics are a bit rusty (and I do not really know how to 
google for it), so perhaps you guys might have a hint:

1) say you have X objects (in a linear row)
2) Y of those X objects (Y <= X naturally) have some kind of special status
3) Z is a number between 0 and Y
4) What are the odds that Z objects with the special status are located 
next to each other?


For example:
X=50,Y=20,Z=10
You have 50 pencils, and 20 of them are red, 30 of them are blue.
What are the odds that (if you lay those 50 pencils down randomly) there 
are 10 red pencils next to each other?

Anyone any hints?
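
One quick way to get a feel for the answer is to simulate it. The sketch
below shuffles a row of X pencils, Y of them red, and counts how often a
run of at least Z adjacent red pencils shows up. Reading "Z next to each
other" as "a run of at least Z" is an assumption, and the function names
are made up for this illustration.

```python
import random


def has_run(flags, z):
    """Return True if flags contains at least z consecutive True values."""
    run = 0
    for flag in flags:
        run = run + 1 if flag else 0
        if run >= z:
            return True
    return False


def estimate_run_probability(x, y, z, trials=20000, seed=0):
    """Monte Carlo estimate of P(some run of >= z special objects).

    x: total objects, y: special objects, z: required run length (z >= 1).
    """
    rng = random.Random(seed)
    row = [True] * y + [False] * (x - y)
    hits = 0
    for _ in range(trials):
        rng.shuffle(row)  # one uniformly random arrangement
        if has_run(row, z):
            hits += 1
    return hits / trials


if __name__ == "__main__":
    print(estimate_run_probability(50, 20, 10))
```

The estimate is only as good as the number of trials, so very rare events
need many shuffles before they show up at all.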


  
    
      
        
ask in the forums on unibet.com ?

  
    
      
Okay, if I follow Marijn's reasoning, I get that the chance is 
approx. (N-Z) * (Y!) / ((N-Z)!), where N is the total number of objects 
(X above).
This should be great, except for the fact of course that N and Y are 
pretty big numbers (N: 30000 genes, Y: 10 to 10000 GO terms), and Z is 
pretty small (10 or so).
30000! is a rather large number actually :-/
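
Factorials of this size are usually handled in log space instead of being
written out. A minimal sketch, assuming Python's math.lgamma is available
(lgamma(n + 1) equals ln(n!)); the helper names are made up for this
illustration, and this only shows how to tame the big numbers, not
whether the formula above is the right one.

```python
from math import exp, lgamma, log


def log_factorial(n):
    """Natural log of n!, computed without building the huge integer."""
    return lgamma(n + 1)


def log_binomial(n, k):
    """Natural log of the binomial coefficient C(n, k)."""
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)


if __name__ == "__main__":
    # Number of decimal digits of 30000! is about log10(30000!) + 1.
    print(log_factorial(30000) / log(10))
    # Ratios of factorials become differences of logs.
    print(exp(log_binomial(10, 3)))
```

Ratios such as Y! / (N-Z)! then become differences of two log values, and
exp() is only applied at the very end, if at all.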

  
    

Okay, Elisabeth came up with a very good remark:
the chance that 10 red pencils are next to each other is the same as 
that of any other random combination of red and blue pencils.
  
Eh?!
The clue is, of course, that the 10 pencils can be lying at any given spot of the 50 free positions.
I think you need to calculate it like that.
First calculate all possibilities of 10 spots next to each other. Assume these 10 are red.
Then fill all other spots with the remaining pencils.

Now you only have to calculate somehow the overlap between the solutions.
E.g. if the first 10 are red and the last 10 are red, that configuration will be counted twice.
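
The double counting can be handled with inclusion-exclusion. One way to
sketch it, which is a swapped-in standard counting argument and not
necessarily what anyone in this thread had in mind: count the
arrangements with NO run of Z or more red pencils by distributing the Y
red pencils over the (X-Y)+1 gaps around the blue ones, at most Z-1 per
gap, and subtract from the total. Function names are made up for this
illustration, and "next to each other" is again read as "a run of at
least Z".

```python
from fractions import Fraction
from math import comb


def arrangements_without_run(x, y, z):
    """Count rows of x pencils, y of them red, with no run of >= z reds.

    The y reds are split over the (x - y) + 1 gaps around the blue
    pencils, each gap holding at most z - 1 reds; inclusion-exclusion
    over the gaps that break this limit gives the alternating sum.
    """
    blue = x - y
    gaps = blue + 1
    total = 0
    j = 0
    while j <= gaps and y - j * z >= 0:
        total += (-1) ** j * comb(gaps, j) * comb(y - j * z + blue, blue)
        j += 1
    return total


def run_probability(x, y, z):
    """Exact probability of at least one run of >= z reds (z >= 1)."""
    return 1 - Fraction(arrangements_without_run(x, y, z), comb(x, y))


if __name__ == "__main__":
    print(float(run_probability(50, 20, 10)))
```

Because Python integers are unbounded, the same call also works for
X = 30000, although the intermediate binomials get very long.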

I dunno.
Sofie
...

*grmbl*
And with numbers as big as these, it is indeed probably better to just 
take the background frequency Y/X
Although it still doesn't really "feel" right to me... but that is 
always the case with statistics
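
To see how far the background frequency is from the exact value, here is
a small sketch comparing the two for ONE fixed window of Z adjacent
positions. Treating "take the background frequency" as "use (Y/X)^Z per
window" is an assumption on my part, and the function names are made up
for this illustration.

```python
from fractions import Fraction


def window_all_special_exact(x, y, z):
    """Exact chance that one fixed window of z positions is all special.

    Positions are filled without replacement, so the factors are
    y/x, (y-1)/(x-1), ..., (y-z+1)/(x-z+1).
    """
    p = Fraction(1)
    for i in range(z):
        p *= Fraction(y - i, x - i)
    return p


def window_all_special_background(x, y, z):
    """Same chance when every position is independently special with Y/X."""
    return (y / x) ** z


if __name__ == "__main__":
    x, y, z = 50, 20, 10
    print(float(window_all_special_exact(x, y, z)))
    print(window_all_special_background(x, y, z))
```

Multiplying either value by the number of windows, X-Z+1, gives an upper
bound on the chance of a run anywhere in the row, since overlapping
windows get counted more than once.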

see this for example:   http://en.wikipedia.org/wiki/Monty_Hall_problem
Still makes no real sense to me :-(


  

-- 
Sofie Van Landeghem
PhD Student
VIB Department of Plant Systems Biology, Ghent University
Bioinformatics and Evolutionary Genomics
Technologiepark 927, 9052 Gent, BELGIUM
Tel: +32 (0)9 331 36 95                        fax:+32 (0)9 3313809
Website: http://bioinformatics.psb.ugent.be