
Hey people, I've some problems (execution-time related) with some of my queries, involving go-labels. For those not familiar with go-labels: these labels form a directed graph (which can be cyclic), so we have child-parent relationships between the various go-labels. Now, we already have an extra table parents_go (with 2 columns: child_go, parent_go). For each parent_go, all the children (and grandchildren, and great-grandchildren,...) are included as child_go, so we do not have to travel in the graph anymore. In this case, GO:0004402 has 6 children go-labels. This query attempts to find the count of genes (with a certain go-label, in this case GO:0004402), sorted by their evidence tags. Now this is my query: SELECT COUNT(DISTINCT(`gene_go`.`gene_id`)) as 'count',`gene_go`.`evidence` FROM `gene_go` WHERE (`gene_go`.`go`='GO:0004402' OR `gene_go`.`go` IN (SELECT `child_go` FROM `go_parents` WHERE `parent_go`='GO:0004402')) GROUP BY `gene_go`.`evidence` Problem is that it takes a while friggin 27 seconds to perform (which is pretty lame for a website, even with ajax). Okay I thought, lets be smart and first get all the child_go labels from GO:0004402, then loop over them and execute a query each time, counting the genes per evidence tag only for that specific child_go label. This results in a whooping 17 ms execution time... but the results are not the same (apparently some idiots annotated some genes with both the label of a child_go and of a parent_go label). So my questions are: a) Why is the execution time for the first query so goddamn slow compared to the php-loop solution. b) Any idea how to fix the problem? I'd rather not retrieve all the genes and start manually sorting and counting them. Michiel -- ================================================================== Michiel Van Bel PhD student Tel:+32 (0)9 331 36 95 fax:+32 (0)9 3313809 VIB Department of Plant Systems Biology, Ghent University Technologiepark 927, 9052 Gent, BELGIUM mibel@psb.ugent.be http://www.psb.ugent.be ==================================================================