Michiel Van Bel wrote:
Elisabeth Wischnitzki wrote:
Michiel Van Bel wrote:
Michiel Van Bel wrote:
Frederik Delaere wrote:
Michiel Van Bel wrote:
Okay, my statistics are a bit rusty (and I do not really know what to 
google for), so perhaps you guys have a hint:

1) say you have X objects (in a linear row)
2) Y of those X objects (Y <= X naturally) have some kind of special status
3) Z is a number between 0 and Y
4) What are the odds that Z objects with the special status are located 
next to each other?


For example:
X=50,Y=20,Z=10
You have 50 pencils, and 20 of them are red, 30 of them are blue.
What are the odds that (if you lay those 50 pencils down randomly) there 
are 10 red pencils next to each other?

Anyone any hints?

Ask in the forums on unibet.com?

Okay, if I follow Marijn's reasoning, I get that the chance is approx. 
(N-Z) * (Y!) / ((N-Z)!), where N is the total number of objects (X above).
This would be great, except of course that N and Y are pretty big 
numbers (N: 30000 genes, Y: 10 to 10000 GO terms), and Z is pretty 
small (10 or so).
30000! is a rather large number, actually :-/
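
A note on the "30000! is too big" worry: factorial ratios like the ones discussed here are usually evaluated in log space, so the huge intermediate values never have to be written out. The sketch below is only about evaluating such expressions, not about whether the formula above is the right one. The helper names are made up for illustration; math.lgamma and math.perm (Python 3.8+) are standard library functions.

```python
import math

def log_factorial(n: int) -> float:
    """Natural log of n!, using ln(n!) = lgamma(n + 1)."""
    return math.lgamma(n + 1)

def log_falling_factorial(n: int, z: int) -> float:
    """Natural log of n! / (n - z)! = n * (n - 1) * ... * (n - z + 1)."""
    return log_factorial(n) - log_factorial(n - z)

# Numbers taken from the thread: N = 30000 genes, Z = 10.
N, Z = 30000, 10

# Log-space evaluation: no giant integers involved.
log_ratio = log_falling_factorial(N, Z)

# Python integers are arbitrary precision, so the exact value is also available.
exact_ratio = math.perm(N, Z)

print(log_ratio)
print(math.log(exact_ratio))
```

Both printed values should agree to many decimal places. For a ratio of two probabilities, subtract the logs first and only call math.exp on the final (small) result.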

Okay, Elisabeth came up with a very good remark:
the chance of one particular arrangement with 10 red pencils next to each 
other is the same as that of any other particular arrangement of red and 
blue pencils.
...

*grmbl*
And with numbers as big as these, it is indeed probably better to just 
take the background frequency Y/X.
Although it still doesn't really "feel" right to me... but that is 
always the case with statistics.

See this for example: http://en.wikipedia.org/wiki/Monty_Hall_problem
Still makes no real sense to me :-(

Btw:
(N-Z)! is not 50*49*48*...etc.; in your example it is 40*39*...*2*1.

N! / (N-Z)! is what you are looking for.

But...there is no Y in that formula!
*starts weeping silently under his desk*

Screw it, I'll create my own statistic theories, with blackjack and 
hookers. In fact, forget about the statistics!

Right, just brute force over the weekend. Generate all the possibilities and then just count them :D
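
For what it's worth, here is a minimal sketch of the brute-force idea, plus a counting shortcut that avoids enumerating anything. It assumes the question means "at least one run of Z or more adjacent red pencils" when Y red and X-Y blue pencils are laid out uniformly at random; that reading is an assumption, as are all function names. Plain enumeration is only feasible for small X (for X=50, Y=20 there are roughly 4.7 * 10^13 arrangements), so the second function counts the arrangements without such a run by distributing the reds over the X-Y+1 gaps around the blue pencils, at most Z-1 per gap, using inclusion-exclusion.

```python
from fractions import Fraction
from itertools import combinations
from math import comb

def has_red_run(red_positions, z):
    """True if the sorted positions contain at least z consecutive integers (z >= 1)."""
    run, prev = 0, None
    for p in red_positions:
        run = run + 1 if prev is not None and p == prev + 1 else 1
        if run >= z:
            return True
        prev = p
    return False

def prob_run_brute_force(x, y, z):
    """Enumerate every placement of y reds among x slots and count the hits."""
    hits = sum(has_red_run(c, z) for c in combinations(range(x), y))
    return Fraction(hits, comb(x, y))

def prob_run_exact(x, y, z):
    """Same probability without enumeration (z >= 1).

    Arrangements with no run of z reds correspond to putting y reds into
    gaps = x - y + 1 slots around the blues, with at most z - 1 reds per slot.
    The number of such distributions follows from inclusion-exclusion.
    """
    gaps = x - y + 1
    no_run = sum(
        (-1) ** k * comb(gaps, k) * comb(y - k * z + gaps - 1, gaps - 1)
        for k in range(y // z + 1)
    )
    return 1 - Fraction(no_run, comb(x, y))

# Sanity check on a case small enough to enumerate.
print(prob_run_brute_force(10, 5, 3), prob_run_exact(10, 5, 3))

# The pencil example from the thread: X=50, Y=20, Z=10.
print(float(prob_run_exact(50, 20, 10)))
```

The exact version only needs about Y/Z terms of big-integer arithmetic, so it should also cope with the gene-sized numbers mentioned earlier, no weekend required.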

Sofie


-- 
Sofie Van Landeghem
PhD Student
VIB Department of Plant Systems Biology, Ghent University
Bioinformatics and Evolutionary Genomics
Technologiepark 927, 9052 Gent, BELGIUM
Tel: +32 (0)9 331 36 95   Fax: +32 (0)9 3313809
Website: http://bioinformatics.psb.ugent.be