Is this a problem with buffering of the data stream, so that he only gets 601 kb instead of 728?

L.

-------- Original Message --------
Subject: Re: cytochrome P450s in Ectocarpus
Date: Tue, 23 Oct 2007 10:09:29 -0500
From: David Nelson <drnelson1@gmail.com>
To: lieven sterck <lieven.sterck@psb.ugent.be>
References: <d862fa170710201348oba00fb2i5367e7ce2b09e4b@mail.gmail.com> <471C6F29.9000504@psb.ugent.be> <d862fa170710221017i4ef554c7x3b4840c2cc7ae337@mail.gmail.com> <d862fa170710221211g3d88a30fna570812c65112ca4@mail.gmail.com> <13689.157.193.202.129.1193094062.squirrel@webmail.psb.ugent.be> <d862fa170710221811g6819eed7lbaa4969e338fbca8@mail.gmail.com> <d862fa170710230750l60cd4d5fj8b138c24f51a70e8@mail.gmail.com> <471E0AEC.2080305@psb.ugent.be>


Dear Lieven,

Thanks so much.  I doubt most of these P450s will have ESTs, but it would sure make my life easier.

I noticed a problem in recovering your sequences from the blast pages.  When I clicked on the F next to the sequence name I got a display of the nucleotide sequence.  There was a button that said save as FASTA.  When I did this and opened the file on my computer it was missing quite a lot of sequence.  Of the contig_63 that was 728,000 bp I only got 601,000 bp.

However, if I selected the sequence off the display page and pasted into a Word document I got all 728,000bp.  So there is some problem with the save as FASTA command.

You might want to do a couple of checks.
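
One such check could be to count the residues in the saved file and compare the total with the expected contig length. What follows is only a minimal Python sketch, assuming the saved file is plain FASTA text. The function names and the toy record are illustrative and are not part of the blast server.

```python
import io


def fasta_lengths(handle):
    """Return {record_name: sequence_length} for a FASTA text stream."""
    lengths = {}
    name = None
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Header line: the record name is the first word after '>'.
            fields = line[1:].split()
            name = fields[0] if fields else ""
            lengths[name] = 0
        elif name is not None:
            # Sequence line: count residues, ignoring any internal whitespace.
            lengths[name] += len("".join(line.split()))
    return lengths


def missing_residues(lengths, name, expected):
    """Return how many residues are missing (0 means the record is complete)."""
    return expected - lengths.get(name, 0)


# Toy record standing in for a saved FASTA file (hypothetical data).
demo = io.StringIO(">sctg_demo test record\nACGTACGT\nACGT\n")
demo_lengths = fasta_lengths(demo)
print(demo_lengths)
print(missing_residues(demo_lengths, "sctg_demo", 15))
```

For the case described above, the file saved through the save as FASTA button would be opened in place of the toy record, with roughly 728,000 as the expected length.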

David

On 10/23/07, lieven sterck <lieven.sterck@psb.ugent.be> wrote:
Hi David,

I just received the cleaned ESTs a couple of days ago.
I will make this set also available on the blast page.


regards,
lieven


David Nelson wrote:
Dear Lieven,

How can I Blast ESTs for Ectocarpus?

David

On 10/22/07, David Nelson <drnelson1@gmail.com > wrote:
Dear Lieven,

Thanks for the tip.  I had been clicking on the name next to the F.  I thought they were both the same, so I did not try the F.  It worked.


David

On 10/22/07, Lieven Sterck <lieven.sterck@psb.ugent.be> wrote:
Hi David,

I could send you the scaffolds.
But you should be able to extract them yourself: on the blast result page,
click on the 'F' to the left of the sequence name (next to the '>' sign). This
will lead you to the actual sequence of the hit.

I will get you an invitation to the annotation site first thing in the
morning.

don't hesitate to contact me if something is unclear!!
regards,
lieven


> Dear Lieven,
>
> There are only 10 contigs with P450 sequences.
> Some are pretty short weak matches and may be accidental.
>> F sctg_6 CYP51C1
>> F sctg_63 CYP97E3 and CYP97E4
>> F sctg_25 CYP97F-like seq
>> F sctg_362 most like CYP5021A1
>> F sctg_1 bacterial chromosome with 14 P450s, all assembled
>> F sctg_10 C-helix and I-helix parts
>> F sctg_471 2 sequences only partially assembled
>> F sctg_60 2 sequences only have the PERF motif (accidental?)
>> F sctg_193 only the I-helix motif
>> F sctg_81 poor match may be accidental (heme signature region)
>
> The current version of the blast server will not return any DNA sequence
> to me.  I am not given permission to see the genome yet, so could you
> please send me 9 of the contigs above so I can continue to assemble these
> P450 genes.  I do not need the sctg_1 since I have already completed those
> 14 bacterial P450s.
>
> If the files are too large,
> I only need the parts of them shown below
>
>> F sctg_6 1,170,000-1,200,000
>> F sctg_10  780,000-820,000
>> F sctg_25 950,000-970,000
>> F sctg_60 400,000-460,000
>> F sctg_63 330,000-370,000
>> F sctg_81 170,000-220,000
>> F sctg_193 150,000-200,000
>> F sctg_362  70,000-90,000
>> F sctg_471 30,000-80,000
>
> Thanks.
>
> David Nelson
>
>
> On 10/22/07, David Nelson <drnelson1@gmail.com > wrote:
>>
>> Dear All,
>>
>> Can you make the pull down choices for the expect value in the blast
>> server go up to 100 rather than stopping at 10?  This may be helpful for
>> finding distantly related exons.
>>
>> David
>>
>> On 10/22/07, lieven sterck <lieven.sterck@psb.ugent.be> wrote:
>> >
>> >  Hi David,
>> >
>> > That's correct! Scaffold sctg_1 is indeed from a bacterial symbiont
>> > found associated with (or within) ectocarpus.
>> > There are some more scaffolds from this organism, a list:
>> > sctg_1
>> > sctg_1222
>> > sctg_1301
>> > sctg_1320
>> > sctg_1403
>> > sctg_1415
>> > sctg_1432
>> > sctg_1769
>> > sctg_1876
>> > sctg_397
>> > sctg_627
>> > sctg_822
>> > sctg_853
>> > sctg_870
>> > sctg_886
>> > sctg_893
>> >
>> > these scaffolds have recently been removed from the official assembly.
>> > GenoScope (responsible for the sequencing) is finishing and analyzing
>> > this bacterial symbiont.
>> >
>> > best regards,
>> > lieven
>> >
>> >
>> > David Nelson wrote:
>> >
>> > Dear all,
>> >
>> > I have begun looking over the P450s in Ectocarpus.
>> > The biggest surprise was sctg_1 which had 14 P450s but they seem to be
>> > bacterial.
>> >
>> > Here are the 14 P450s with their best matches.  I compared them to 3305
>> > bacterial sequences from the Global Ocean Sequencing Project at the J Craig
>> > Venter Institute.  Some were a better match to those seawater sequences
>> > than to named P450s.
>> >
>> > Since the whole contig of 3.4 Mb seems to be bacterial and often like
>> > Caulobacter, I searched Genbank for a rRNA from Caulobacter.  I found one
>> > from a
>> >
>> > Caulobacter endosymbiont of Tetranychus urticae
>> > AY753176
>> > ribosomal RNA gene, partial sequence.
>> >
>> > I used this for a blastn search of Ectocarpus.
>> > I found a 90% match on the same contig as the 14 P450s (see below)
>> > There is a Caulobacter-like endosymbiont or contaminant in the algae
>> > genome data.
>> >
>> > I blasted the Ectocarpus bacterial rRNA seq against Genbank and found
>> > a Sargasso Sea bacterioplankton as the best match
>> >
>> > >F sctg_1
>> > 3017346 aaagatttatcgcccctggatgggcccgcgttggattagctagttggtggggtaatggcc 3017405
>> > 3017406 taccaaggcgacgatccatagctggtctgagaggatgatcagccacactggaactgagac 3017465
>> > 3017466 acggtccagactcctacgggaggcagcagtggggaatattggacaatgggcgcaagcctg 3017525
>> > 3017526 atccagccatgccgcgtgagtgatgaaggccctagggttgtaaaactctttcagtggtga 3017585
>> > 3017586 agataatgacggtaaccacagaagaagctccggctaactccgtgccagcagccgcggtaa 3017645
>> > 3017646 tacggagggagctagcgttgttcggaattactgggcgtaaagcgcacgtaggcggtctat 3017705
>> > 3017706 aaagttgggggtgaaatcccggagctcaactccggaactgcct 3017748
>> >
>> > only 6 nucleotide diffs to
>> > >gb|AY162106.1|  Alpha proteobacterium GMD21A06 small subunit ribosomal
>> > RNA gene, partial sequence
>> > Length=1326
>> >
>> > isolation_source="Sargasso Sea bacterioplankton
>> >
>> >  Score =  710 bits (384),  Expect = 0.0
>> >  Identities = 397/403 (98%), Gaps = 1/403 (0%)
>> >  Strand=Plus/Plus
>> >
>> > Query  1    AAAGATTTATCGCCCCTGGATGGGCCCGCGTTGGATTAGCTAGTTGGTGGGGTAATGGCC  60
>> >             ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
>> > Sbjct  174  AAAGATTTATCGCCCCTGGATGGGCCCGCGTTGGATTAGCTAGTTGGTGGGGTAATGGCC  233
>> >
>> > Query  61   TACCAAGGCGACGATCCATAGCTGGTCTGAGAGGATGATCAGCCACACTGGAACTGAGAC  120
>> >             ||||||||||||||||||||||||||||||||||||||||||||||||||| |||| |||
>> > Sbjct  234  TACCAAGGCGACGATCCATAGCTGGTCTGAGAGGATGATCAGCCACACTGGGACTGTGAC  293
>> >
>> > Query  121  ACGGTCCAGACTCCTACGGGAGGCAGCAGTGGGGAATATTGGACAATGGGCGCAAGCCTG  180
>> >             |||| ||||||||||||||||||||||||||||||||||||||||||||||||||| |||
>> > Sbjct  294  ACGGCCCAGACTCCTACGGGAGGCAGCAGTGGGGAATATTGGACAATGGGCGCAAG-CTG  352
>> >
>> > Query  181  ATCCAGCCATGCCGCGTGAGTGATGAAGGCCCTAGGGTTGTAAAACTCTTTCAGTGGTGA  240
>> >             |||||||||||||||||||||||||||||||||||||||||||| |||||||||||||||
>> > Sbjct  353  ATCCAGCCATGCCGCGTGAGTGATGAAGGCCCTAGGGTTGTAAAGCTCTTTCAGTGGTGA  412
>> >
>> > Query  241  AGATAATGACGGTAACCACAGAAGAAGCTCCGGCTAACTCCGTGCCAGCAGCCGCGGTAA  300
>> >             ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
>> > Sbjct  413  AGATAATGACGGTAACCACAGAAGAAGCTCCGGCTAACTCCGTGCCAGCAGCCGCGGTAA  472
>> >
>> > Query  301  TACGGAGGGAGCTAGCGTTGTTCGGAATTACTGGGCGTAAAGCGCACGTAGGCGGTCTAT  360
>> >             ||||||||||||||||||||||||||||||||||||||||||||| ||||||||||||||
>> > Sbjct  473  TACGGAGGGAGCTAGCGTTGTTCGGAATTACTGGGCGTAAAGCGCGCGTAGGCGGTCTAT  532
>> >
>> > Query  361  AAAGTTGGGGGTGAAATCCCGGAGCTCAACTCCGGAACTGCCT  403
>> >             |||||||||||||||||||||||||||||||||||||||||||
>> > Sbjct  533  AAAGTTGGGGGTGAAATCCCGGAGCTCAACTCCGGAACTGCCT  575
>> >
>> >
>> > > F sctg_1   (14 P450s from a bacterial genome in the Ectocarpus data)
>> >           Length = 3415905
>> >
>> > >67% to CYP108B1 Caulobacter crescentus AE005918
>> > 158936 FAKLRKDTPLGVAGPQGFEPFWVVTKHKDIVEVERNNEVFHNGDKSTTLVDANTDTA
>> 158766
>> > 158765 VREMMGGSPHLIRSLVQMDNPDHKNYRGITAANFMPQELKALEVQVRKIAKSFVDHMEEL
>> > 158586
>> > 158585 GRKNDGRCDFAKDVAFLYPLHVIMELLGVPQSDEPKMLKLTQELFGAADPELNRTGKE
>> 158412
>> >
>> > 158411 RDDPKEALAALSGTVAEFVEYFTAVTEDRRKTPRADIASVIANGKVNGEAIGIFEAMGYY
>> > 158232
>> > 158231 IIVATAGHDTTSGTTAGTMWELAKDRQKFLQVKNDPKLIPLLVEESIRWVTPVKHFMRSA
>> > 158052
>> > 158051 TQDTVLGGQEIKKGDWMMLCYQSGNRDEDVFDDPFEFKVNRQPNRHIAFGHGAHVCLGQH
>> > 157872
>> > 157871 LARMEMRALWEELLPRLDSVELDGEPTRMLANFVCGPKSVPIK 157743
>> >
>> > >59% to CYP153A2 Caulobacter crescentus CB15 GenPept AAK22050
>> > 1458133 LFERLRNEDPVHFFEHEEFGRFWSVTRHADIMSIDTNHQQFSSEPSIFLGNTNSDEDENF
>> > 1458312
>> > 1458313 NPATFIAMDPPKHDAQRNAVNPAVAPPALRDLEPLIRQRVSAVLDSLPIGETFNWVDLVS
>> > 1458492
>> > 1458493 IEITTQMLATLFDFPFEDRYMLTRWSDMTTANPETLAAMGLTIEDRRNAMYECLEIFG
>> > 1458666
>> > 1458667 GLYAERAQLPPANDFISLMAHNEDMKNLDPMNLLGNLVLLIVGGNDTTRNSMSGGVLAL
>> > 1458843
>> > 1458844 HENPAEFAKLKADPSIIPNMVSEIIRWQTPLAYMRRTANEDLEFRGKQIKQGDRIMMWYV
>> > 1459023
>> > 1459024 SGNRDERAIERPNEFLIDRENARRHLSFGFGIHRCMGNRVGEMQVRILWEEILKRFDRVE
>> > 1459203
>> > 1459204 VVGKPARTLSNFVMGFTELPVRL 1459272
>> >
>> > >50% to CYP108B1 Caulobacter crescentus AE005918
>> > 1535639 IFTTLRQDNPLACVEVPGYDPHWMVTKYSDVKEITRQDNLFHSGDRPKILHSQAGEA
>> > 1535809
>> > 1535810 LARSFTGGSPNLFQSLVQLDPPEHTAYRNVLQGEFMPGGIAKMKENVAKTAQEFVDQMAS
>> > 1535989
>> > 1535990 LAPTCDFADDVAMNYPLQVVLDIVGVPREHHPKMLQLTQWLFSYADPDLKRPGSDIT
>> > 1536160
>> > 1536161 DPEEIIKTWNVVFTQFHEFFMPLVEARRANPKEDIASLIANAKINGESMEERKMISYFG
>> > 1536337
>> > 1536338 ILATAGHDTTSATTALGMKMLAENPDMLARLKEKPDLIPSFVEECIRWGSPVQHFIRSAT
>> > 1536517
>> > 1536518 EDYVLRGQTIRKGDLLYISYLSANRDEEEFDDPFTFKMDRAPNRHVGFGFGGHVCLGQH
>> > 1536694
>> > 1536695 LARLEIRTIWQTLLPRLTEVELTGPVKFTESEFVCGPKSVPIR 1536823
>> >
>> > >41% to CYP107AA1 Bradyrhizobium japonicum USDA 110 GenPept BAC51802
>> > 56% to seawater bacterial sequence JCVI_PEP_1096696260773
>> > 2052084 PYPAMNQLREKDPVNETPVGPWRISRHADVVDVFRNAPTSQTLADGSSPNMDDQDRRGS
>> > 2052260
>> > 2052261 FRDFMLNMDGPEHARLRRLVLGAFTPKALKHIEGEIDRVVDEAMHTALKQGGMEVVEDFA
>> > 2052440
>> > 2052441 LRIPSRMICRIMGLPEEDIDQFTVWTAARTNAFFARFLPEDVVEHTRQAGEQMAD
>> 2052605
>> > 2052606 YFEAQIKLRRANPREDLLTNLIQSEEKGDRLGDVELAIQAIGLLIAGFETTIGLIGNGTK
>> > 2052785
>> > 2052786 ALIENPDQAELLKQNPDLAKNTVEECLRYDTPVLFNWRVLTEPYEVGGKTLPENAVLWMM
>> > 2052965
>> > 2052966 LGAANHDPRVHDDPDTMDITRQGISHASFGGGAHTCLGNQLARMEASRAFHAFVS
>> 2053130
>> >
>> > >JCVI_PEP_1096696260773   /source_dna_id=JCVI_ORF_1096696260772
>> > /offset=0
>> >             /translation_table=11 /length=353 /full_length=353
>> >             Length = 353
>> >
>> >  Score = 956 (336.5 bits), Expect = 3.3e-98, P = 3.3e-98
>> >  Identities = 183/326 (56%), Positives = 227/326 (69%)
>> >
>> > Query:     1
>> > PYPAMNQLREKDPVNETPVGPWRISRHADVVDVFRNAPTSQTLADGSSPNMDDQDRRGSF 60
>> >              PYP +  LRE DPVN TPVG WRISR+ DV  VF +APTS T   G SPN D  D
>> > +GSF
>> > Sbjct:    28
>> > PYPKLAHLRENDPVNLTPVGTWRISRYEDVKAVFNDAPTSMTDKLGDSPNFDPLDTKGSF 87
>> >
>> > Query:    61
>> > RDFMLNMDGPEHARLRRLVLGAFTPKALKHIEGEIDRVVDEAMHTALKQGGMEVVEDFAL 120
>> >               +F+LN DG  H RLR LV  +F  K ++ +E E+ + V  A   A   GGM+VV
>> A
>> >
>> > Sbjct:    88
>> > LEFVLNKDGDAHRRLRMLVQKSFGQKTVRLMEEEVAKTVAAAFDKAQADGGMDVVPALAH 147
>> >
>> > Query:   121
>> > RIPSRMICRIMGLPEEDIDQFTVWTAARTNAFFARFLPEDVVEHTRQAGEQMADYFEAQI 180
>> >               +PSRMIC+IMG+P +D   F  WTAARTNAFFA+FLP DV E TR AG  M DYF
>> A
>> > I
>> > Sbjct:   148
>> > EVPSRMICQIMGVPMQDRQIFNEWTAARTNAFFAKFLPPDVQERTRNAGAAMEDYFRALI 207
>> >
>> > Query:   181
>> > KLRRANPREDLLTNLIQSEEKGDRLGDVELAIQAIGLLIAGFETTIGLIGNGTKALIENP 240
>> >                R+ +  +DLL+++I + E GD+  D EL IQAIG+++AG+ETTIGL+GNGT+A
>> > +E+P
>> > Sbjct:   208
>> > AERKRDLGDDLLSSMIMASEGGDKFTDDELIIQAIGVIVAGYETTIGLLGNGTRAFVEHP 267
>> >
>> > Query:   241
>> > DQAELLKQNPDLAKNTVEECLRYDTPVLFNWRVLTEPYEVGGKTLPENAVLWMMLGAANH 300
>> >              DQ   L+ NP+L  N  +ECLRYDTP+LFNWRVL EPYE+ G TLP  AV+W
>> +LGAAN
>> >
>> > Sbjct:   268
>> > DQLAKLRNNPELVSNATDECLRYDTPILFNWRVLEEPYELSGVTLPAEAVIWQLLGAANR 327
>> >
>> > Query:   301 DPRVHDDPDTMDITRQGISHASFGGG 326
>> >              DP    DPD  DI R+ ++H SFGGG
>> > Sbjct:   328 DPARFADPDQFDIEREDVAHQSFGGG 353
>> >
>> > >37% to CYP107L2 SAV1987 AP005029 Streptomyces avermitilis
>> > 2103582 DLHSYAFESNPEPTLAWLREHDPVHWSQHGYWFVTRYEDVRAVLGDPARFSSQKAGFGA
>> > 2103406
>> > 2103405 NNPIGKDAKGPEGKSGKKASDAEKTMSKGLALSFNQQDPPDHSRVRKLVNQAFSRREISE
>> > 2103226
>> > 2103225 RADKIQAVVDALMADVKAKGEFDLITDFAFHLPIIVASDIIGIPAEDRDLFRRNFELAA
>> > 2103049
>> > 2103048 RLMAPKRSDEEWAEALTGAKWQSTYMGELIASRAREPRADLISALIQTSEDDQKLT
>> 2102881
>> >
>> > 2102880 GGEVASAIMTIFTAAGTTTERMISSGAFLLLTHPEQLAALRADHSLMDNVLEEILRFHHP
>> > 2102701
>> > 2102700 NQSTSTNRRATQDVELGGKTIRAGDTVRVSLGSANRDAAQFDEPDAFNIQRTGTKHMSFG
>> > 2102521
>> > 2102520 FGIHFCLGSALARYETKAALEALL 2102449
>> >
>> > >57% to CYP153A2 Caulobacter crescentus at C-term
>> > 70% to seawater bacterial sequence JCVI_PEP_1096681995831
>> > 2416172 MTTANQTSPNGAIDVNDIPLAELDVSQPHLFKNDTWRPWFARLRAEAPVHYLADSENG
>> > 2416345
>> > 2416346 PFWSVTSHDMTKAVDANHKVFSSEEGGIAIVDPQPLDGEQLMRDPSFISMDEPKHATQRK
>> > 2416525
>> > 2416526 AVSPAVAPKNLAELEPLIRERAADILDNLPVGETFNWVDRVSVELTARMLATLFDFPYER
>> > 2416705
>> > 2416706 RRDLIRWSDVATAVPKVTGEANDMGARRDALIECATTFYQLWQERAAQPPKFDFVSM
>> > 2416876
>> > 2416877 LAHGEATKHLSEDPLLMLGNIILLIVGGNDTTRNSISGGVVALNQYPEEYQKLRDTPAL
>> > 2417053
>> > 2417054 IPNMVAETVRWQTPVIHMRRTALEDVELGGKTIRKGDKVVMWYLSGNRDEAVFPDADRLI
>> > 2417233
>> > 2417234 IDRPNARQHVSFGFGVHRCMGNRLAEMQLRVLWEEIMKRFHTVEVVGEVERLSNNFI
>> > 2417404
>> > 2417405 RGIASVPVRL 2417434
>> >
>> > >JCVI_PEP_1096681995831   /source_dna_id=JCVI_ORF_1096681995830
>> > /offset=0
>> >             /translation_table=11 /length=418 /full_length=418
>> >             Length = 418
>> >
>> >  Score = 1567 (551.6 bits), Expect = 5.9e-163, P = 5.9e-163
>> >  Identities = 290/409 (70%), Positives = 346/409 (84%)
>> >
>> > Query:    13
>> > IDVNDIPLAELDVSQPHLFKNDTWRPWFARLRAEAPVHYLADSENGPFWSVTSHDMTKAV 72
>> >              ID +  PL ELDVS P  ++NDTWRP FARLR EAPVHYL+DS NGPFWSVTSH +
>> K
>> > V
>> > Sbjct:     7
>> > IDNSSGPLRELDVSLPEHYENDTWRPMFARLRKEAPVHYLSDSVNGPFWSVTSHALIKEV 66
>> >
>> > Query:    73
>> > DANHKVFSSEEGGIAIVDPQPLDGEQLMRDPSFISMDEPKHATQRKAVSPAVAPKNLAEL 132
>> >              DAN+ +FSSE+GGI+IVD +P++G+  ++  +FI+MDEP+H+ QR AV+P+VAPKNL
>> > EL
>> > Sbjct:    67
>> > DANNSIFSSEKGGISIVDLKPVEGQ--VQGKNFIAMDEPEHSIQRSAVAPSVAPKNLVEL 124
>> >
>> > Query:   133
>> > EPLIRERAADILDNLPVGETFNWVDRVSVELTARMLATLFDFPYERRRDLIRWSDVATAV 192
>> >              EPLIRERA DIL+NLPVGETFNWV  VS+ELTARML T+ DFPY++R
>> L++WSD+AT
>> > V
>> > Sbjct:   125
>> > EPLIRERAVDILENLPVGETFNWVQEVSIELTARMLTTILDFPYDQRHKLVQWSDLATDV 184
>> >
>> > Query:   193
>> > PKVTG-EANDMGARRDALIECATTFYQLWQERAAQPPKFDFVSMLAHGEATKHLSEDPLL 251
>> >              P+VTG E  DM AR D L+ CA  FYQLW  ++ QPP FD +SML +   T  ++ED
>> > L
>> > Sbjct:   185
>> > PQVTGKEGTDMQARYDELMNCAAAFYQLWVSKSGQPPSFDLISMLQNNPDTARMNEDMEL 244
>> >
>> > Query:   252
>> > MLGNIILLIVGGNDTTRNSISGGVVALNQYPEEYQKLRDTPALIPNMVAETVRWQTPVIH 311
>> >               LGN++LLIVGGNDTTRNSISGGV+ALNQYP+EYQKLRD PALIPNMV+E
>> > +RWQTPVIH
>> > Sbjct:   245
>> > FLGNMLLLIVGGNDTTRNSISGGVMALNQYPDEYQKLRDNPALIPNMVSEIIRWQTPVIH 304
>> >
>> > Query:   312
>> > MRRTALEDVELGGKTIRKGDKVVMWYLSGNRDEAVFPDADRLIIDRPNARQHVSFGFGVH 371
>> >              MRRTALED ELGG+ I+KG+KV+MWYLSGNRDE+VF D DRLIIDRPNAR
>> > HV+FGFGVH
>> > Sbjct:   305
>> > MRRTALEDYELGGQHIKKGEKVIMWYLSGNRDESVFEDPDRLIIDRPNARSHVAFGFGVH 364
>> >
>> > Query:   372 RCMGNRLAEMQLRVLWEEIMKRFHTVEVVGEVERLSNNFIRGIASVPVRL 421
>> >              RCMGNR+AE+QLRVLWEEIM+RFHT+EVVG++ RL NNFIRGI  VPVR+
>> > Sbjct:   365 RCMGNRMAELQLRVLWEEIMERFHTIEVVGDITRLPNNFIRGIKEVPVRV 414
>> >
>> > >58% to CYP153B4 Rhodopseudomonas palustris NZ_AAAF01000001 gene =
>> > Rpal2887
>> > 2455381 FPIFEKMRAEEPVHYCAESTYGPYWSVTRYEDIMAVDTNHQVYSSEADFGGIVID
>> 2455217
>> > 2455216 DRIAIDPETNYKSASFISMDQPKHDDQRKSVNGITNPNNLQYFGDIIRTRTVNMLDSLPV
>> > 2455037
>> > 2455036 GEEFDWVPTVSIELTTQMLATLFDFPFEDRHKLTRWSDVITAEPESDIVENQE 2454878
>> > 2454877 ARVAELNEMAEYFVELQKGRINKPDSIDLLTMMTHSPAMAKMPPEEFMGNLALLIVGGN
>> > 2454701
>> > 2454700 DTTRNSMSGSIFGMHLFPDEFKKMVDDPSLTDNAVAEIIRWQTPLSHMRRTALQDAVL
>> > 2454527
>> > 2454526 GGKQIRKGDKVVMWYASGNRDTSIFDDPDKIIIDRKNARRHLSFGFGIHRCMGNRIGELQ
>> > 2454347
>> > 2454346 LRILWEEILKRFSRVEVTGEPVLTHSNFVKGYASLPVKL 2454230
>> >
>> > >59% to CYP153A2 Caulobacter crescentus CB15 GenPept AAK22050
>> > 2456653 FPIFEQMRQEDPVHYCAESTYGPYWSVTRYEDIMAVDTNHHVYSSDAHLGGIIID
>> 2456489
>> > 2456488 DGIQNDPENDFKAVNFIAMDKPKHDEQRKSVNGITNPNNLQHFGEIIRKRTSNLLDSLPV
>> > 2456309
>> > 2456308 GEEFDWVSTVSIELTTQMLATMFDFPFEDRHKLTRWSDVSTAEPGSGIVETQQQRI
>> 2456141
>> > 2456140 DELMEMAAYFSDLQQSRKDKPDNIDLLTMMTHSPAMANMPPEEFLGNLSLLIVGGNDTT
>> > 2455964
>> > 2455963 RNSMTGGVFGFSLFPEQWDKMVADPTLIDNAVAEIIRWQTPLAHMRRTALEDAILGG
>> > 2455793
>> > 2455792 KQIRKGDKVVMWYASGNRDTSIFDDPDKIIIDRKNARRHLSFGFGIHRCMGNRIGELQLR
>> > 2455613
>> > 2455612 ILWEEVLKRFSRIEVTGEPELTNSNFVKGYTSLPVKL 2455502
>> >
>> > >68% to CYP153A2 Caulobacter crescentus CB15 GenPept AAK22050
>> > 2728456 PYFERLRKEAPVHKAYSPDFGEYWSVTRYEDIMAVDTNHHVFSSSWEHGGITLFDQISDF
>> > 2728635
>> > 2728636 QLPMFIAMDPPKHDQQRITVQPIVAPNNLKNWEGLIRERTGQILDSLPRGEVFDWVDNVS
>> > 2728815
>> > 2728816 VELTTMMLATLFDFPFEQRRKLTRWSDVATGRNNPEIVADDDQWRAELLECLEAFT
>> 2728983
>> > 2728984 DIWNERINSDTPGNDLITMLTRGESTKNMDPMEYLGNIILLIVGGNDTTRNSMTASVYA
>> > 2729160
>> > 2729161 LNKFAGEYDKLLAKPDLIPNLSSEIIRWQTPLAHMRRTALEDIVLNGAHIKKGDKVAMWY
>> > 2729340
>> > 2729341 VSGNRDESVFEDADKVIIDRPNARRQMSFGYGIHRCVGNRLGELQIKILWEEILKRFPKI
>> > 2729520
>> > 2729521 EVMEEPTRTKSVFVKGYTYMPVRI 2729592
>> >
>> > >71% to CYP153A2 Caulobacter crescentus CB15 GenPept AAK22050
>> > 2729740 MPLEDINVADGALFQDDAIWPYFERLRKEAPVHKGHSDEFGDYWSVTRYEDIMAVDTNHH
>> > 2729919
>> > 2729920 VFSSEGAITLADPLEDFRAPMFIAMDPPKHDKQRITVQPIVAPKNLQNWEGLIRERTGLI
>> > 2730099
>> > 2730100 LDQLPRNETFDWVDKVSIELTTMMLATLFDFPFEERRRLTRWSDVATGRDNPEIYK
>> 2730267
>> > 2730268 SEEQWRGELMECLEAFTGLWNDRVNSDTPGNDLISMLASGESTKNMDPMEYLGNIILLI
>> > 2730444
>> > 2730445 VGGNDTTRNSMTGSVYALNKFAGEYDKLIADPSLIPNLSSEIIRWQTPLAHMRRTALEDI
>> > 2730624
>> > 2730625 ELNGQMIKKGDKVAMWYVSGNRDTAVFENADDVIIDRPNARRQMSFGYGIHRCVGNRLGE
>> > 2730804
>> > 2730805 LQIKILWEELLKRFPKIEVMEEPTRTRSPFVKGYTYMPVRI 2730927
>> >
>> > >68% to CYP153A2 Caulobacter crescentus CB15 GenPept AAK22050
>> > 2731076 MPLNEINPARRDLFQNDVIWPYFERLRKEAPVHKAYDEDFGEYWSVTRYEDIMAVDTNHH
>> > 2731255
>> > 2731256 VFSSDWTNGGITLFDAAEDFRLPMFIAMDPPKHDQQRITVQPIVAPNNLKNWEGLIRERT
>> > 2731435
>> > 2731436 AYVLDSLPRGETFDWVDNVSIELTTMMLATLFDFPFEERRKLTFWSDMVTTDPKT
>> 2731600
>> > 2731601 LEGGVEEKRGHLLACLEYFTGLWNERINSDTPGNDLITMLTRGESTKNMDPMEYLGNI
>> > 2731774
>> > 2731775 ILLIVGGNDTTRNSMTASVYGLNKFPGEYDKLIADPSLIPNLSSEIIRWQTPLAHMRRTA
>> > 2731954
>> > 2731955 LEDIELNGTMIKKGDKVAMWYVSGNRDADVFENADDIIIDRPNARRQMSFGYGIHRCVGN
>> > 2732134
>> > 2732135 RLGELQIKILWEEILKRFPKIELMEEPTRTPGCFVKGYTYMPVRI 2732269
>> >
>> > >68% to CYP153A2 Caulobacter crescentus CB15 GenPept AAK22050
>> > 2732420 MPLNEFNPAQRDLFQNDVIWPYFERLREEAPVHKCFDEEFGEYWSVSSYEHIMAVDTNHQ
>> > 2732599
>> > 2732600 VFSSSWEHGGITLFDGPEDFQLPMFIAMDQPKHDEQRKTVQPIVAPNNLKSWEPLIRERT
>> > 2732779
>> > 2732780 GMVLDSLPRGETFDWVDNVSIELTTMMLATLFDFPFEDRRKLTFWSDMVTTDAN 2732941
>> > 2732942 TLEGGEEEWKGHLLECLAYFTELWNQRINSDKPGNDLITMLTRGEATKNMDPMEYLGN
>> > 2733115
>> > 2733116 IILLIVGGNDTTRNSMTGSVYALNKFPTEYDKLIADPGLIPNLSSEIIRWQTPLAHMRRT
>> > 2733295
>> > 2733296 ALEDFELGGKMIKKGDKVAMWYVSGNRDKTVFENADDVIIDRANARRQMSFGYGIHRCVG
>> > 2733475
>> > 2733476 NRLGELQIMILWEEILKRFPKIELMAEPTRSPGCFVKGYTYMPVRI 2733613
>> >
>> > >33% to CYP107L1 AF087022  Streptomyces venezuelae cytochrome P. =
>> pikC
>> > gene
>> > AF079139, 50% to seawater bacterial sequence JCVI_PEP_1096680833653
>> > 302400 PYPLYTRLRPHAPVQGYRDYPPGTVPGEDEPVNAWVLLDYDQVSKAARDHRTFSSRDPLQ
>> > 302221
>> > 302220 EGSSAPTLMLVNHDNPEHDRLRNIVNLAFSRKRIEELSPYVSKMVHTLLDEVESASGGKV
>> > 302041
>> > 302040 EAMSDICAALPARVMVHLLGLPNEIAAKFRHWGTAFMLSADLTPEERQTSNV 301885
>> > 301884 ELYTYFVEQVTAMDEALAAGKDVPDSLMRALLTAEADGEKLTRDEVIRFCLTLVVAGAE
>> > 301708
>> > 301707 TTTFLLGNLLHHLATMPEMTERLRANRDDIEGFMNESLRHSGPPQRLFRIAEADVEVGGQ
>> > 301528
>> > 301527 QIRKGDWVALFFAAANHDPAMFPDPEKFDIDRTNLNKQLTFGVGVHHCLGSALAKAEARE
>> > 301348
>> > 301347 LMNALL 301330
>> >
>> > >JCVI_PEP_1096680833653   /source_dna_id=JCVI_ORF_1096680833652
>> > /offset=0
>> >             /translation_table=11 /length=250 /full_length=250
>> >             Length = 250
>> >
>> >  Score = 586 ( 206.3 bits), Expect = 5.3e-59, P = 5.3e-59
>> >  Identities = 118/234 (50%), Positives = 160/234 (68%)
>> >
>> > Query:     1
>> > AFPYPLYTRLRPHAPVQGYRDYPPGTVPGEDEPVNAWVLLDYDQVSKAARDHRTFSSRDP 60
>> >              A PYP Y +LR  +PV GY D PPGTVPG+DEP  +W +L +  V +AARD +TFSS
>> > DP
>> > Sbjct:    20
>> > ANPYPYYDQLRAASPVHGYVDLPPGTVPGQDEPKISWAVLRHADVVEAARDAQTFSSADP 79
>> >
>> > Query:    61
>> > LQEGSSAPTLMLVNHDNPEHDRLRNIVNLAFSRKRIEELSPYVSKMVHTLLDEVESASGG 120
>> >              LQ  S+APTLMLVN D P H +LR I + AF+ +RI E  P+V+++   +L    +
>> > GG
>> > Sbjct:    80
>> > LQAESTAPTLMLVNDDPPRHSKLRAIAHKAFTPRRILEKGPWVAQVAAEIL----APCGG 135
>> >
>> > Query:   121
>> > KV-EAMSDICAALPARVMVHLLGLPNEIAAKFRHWGTAFMLSADLTPEERQTSNVELYTY 179
>> >              +  + M+D+   LP RVM  ++G+ +  A +FR+W TAFMLSADLTP  R+ SN E+
>> > +
>> > Sbjct:   136
>> > RCFDFMTDVAPVLPTRVMAKVIGVDDAQAPRFRNWATAFMLSADLTPAAREASNREVAAF 195
>> >
>> > Query:   180 FVEQVTAMDEALAAGKDVPDSLMRALLTAEADGEKLTRDEVIRFCLTLVVAGAET
>> 234
>> >              FV+ V      +  G D PD L+ AL+  ++DG++LTRDEV RFC+TL+VAGAET
>> > Sbjct:   196 FVDHVNRRYALIERGGDPPDDLVTALILEDSDGQRLTRDEVTRFCITLLVAGAET
>> 250
>> >
>> > >48% to CYP191A1 Caulobacter crescentus CB15 GenPept AAK22930
>> > 75% to seawater bacterial sequence JCVI_PEP_1096682145269
>> > 3343360 PHEFYKTMRESAPVMWSDIRKGGDGFWSVSRYDDLKAVELAPTVFSSERGSINLGVAPKD
>> > 3343181
>> > 3343180 KWKPEKLVSAALNALINLDAPRHMEMRIQQMDFFAPAYVATLRDKVSAKIDSLLDDMESK
>> > 3343001
>> > 3343000 GPVVDMVPVFSEQLPLFTLCEMLGVDEEDRPKIAHWMHYLELASQYLTNPWQVIIKEPLF
>> > 3342821
>> > 3342820 PFRFFKAVKDMFAYGEAIMADRRANPREDLLTAIAKTKLSDEELPQEFL 3342674
>> > 3342673 DGSWLLIIFAGNDTSRNSLSGTIRLMTEFPDQRQMVLDDPSLIPRMSQEALRMISPVRHM
>> > 3342494
>> > 3342493 RRTAVEDTEINGQRIAKDEKVVLWYGAANRDPSMFPDPDRFDMMRDSVDKHLAFGHGVHK
>> > 3342314
>> > 3342313 CLGSRIAQMQ 3342284
>> >
>> > >JCVI_PEP_1096682145269   /source_dna_id=JCVI_ORF_1096682145268
>> > /offset=0
>> >             /translation_table=11 /length=367 /full_length=367
>> >             Length = 367
>> >
>> >  Score = 1279 (450.2 bits), Expect = 1.9e-132, P = 1.9e-132
>> >  Identities = 236/312 (75%), Positives = 273/312 (87%)
>> >
>> > Query:    48
>> > ERGSINLGVAPKDKWKPEKLVSAALNALINLDAPRHMEMRIQQMDFFAPAYVATLRDKVS 107
>> >              +RGSIN+ V  +  WKPEKL  AA N+LINLDAP HMEMR+QQ +FF PAY+
>> TLRDKV
>> >
>> > Sbjct:     1
>> > QRGSINMMVGDRKLWKPEKLAPAAFNSLINLDAPAHMEMRMQQSEFFFPAYIETLRDKVE 60
>> >
>> > Query:   108
>> > AKIDSLLDDMESKGPVVDMVPVFSEQLPLFTLCEMLGVDEEDRPKIAHWMHYLELASQYL 167
>> >              AKID++LD++E +GPVVD   +FSE+LPLFTLCEMLG+DEEDRP+I  WMH+LELA
>> > Q+L
>> > Sbjct:    61
>> > AKIDAMLDELERQGPVVDFAKLFSEELPLFTLCEMLGIDEEDRPRIKLWMHHLELAGQFL 120
>> >
>> > Query:   168
>> > TNPWQVIIKEPLFPFRFFKAVKDMFAYGEAIMADRRANPREDLLTAIAKTKLSDEELPQE 227
>> >               NPWQ  + EP+FPFRF K V++MFA+GE IM DRRANPR+DLLT IA++KL  E
>> > LPQE
>> > Sbjct:   121
>> > ANPWQTFLSEPMFPFRFNKVVQEMFAFGERIMKDRRANPRDDLLTVIAQSKLEGELLPQE 180
>> >
>> > Query:   228
>> > FLDGSWLLIIFAGNDTSRNSLSGTIRLMTEFPDQRQMVLDDPSLIPRMSQEALRMISPVR 287
>> >              +LDGSWLLIIFAGNDTSRNSLSGTIRLMTEFP QR
>> +VLDDPSLIP+MS+EALRM+SPV
>> >
>> > Sbjct:   181
>> > YLDGSWLLIIFAGNDTSRNSLSGTIRLMTEFPTQRTLVLDDPSLIPQMSEEALRMVSPVI 240
>> >
>> > Query:   288
>> > HMRRTAVEDTEINGQRIAKDEKVVLWYGAANRDPSMFPDPDRFDMMRDSVDKHLAFGHGV 347
>> >              HMRRTAVEDTEINGQ IAKDEKVVLWYGAANRDP +FPDPD F++
>> > RD+V+KHLAFGHGV
>> > Sbjct:   241
>> > HMRRTAVEDTEINGQPIAKDEKVVLWYGAANRDPDIFPDPDTFNLHRDNVEKHLAFGHGV 300
>> >
>> > Query:   348 HKCLGSRIAQMQ 359
>> >              HKCLGSRIA+MQ
>> > Sbjct:   301 HKCLGSRIAKMQ 312
>> >
>> >
>> > --
>> > David R. Nelson
>> > Associate Professor
>> > Dept. of Molecular Sciences
>> > 858 Madison Ave. Suite G01
>> > University of Tennessee
>> > Memphis TN 38163
>> > (901) 448-8303 phone
>> > (901) 448-7360 fax
>> > dnelson@utmem.edu
>> >
>> >
>> > --
>> > ==============================================================
>> > Lieven Sterck                              Predoctoral fellow
>> >
>> > Tel:+32 (0)9 3313821                       Fax:+32 (0)9 3313809
>> >
>> > VIB Department of Plant Systems Biology, UGent
>> > Bioinformatics and Evolutionary Genomics Division
>> > Technologiepark 927,         B-9052 Gent,             Belgium
>> > Email: lieven.sterck@psb.ugent.be
>> > Website: http://bioinformatics.psb.ugent.be
>> >
>> > --------------------------------------------------------------
>> > Algal Genetics Group
>> > UMR 7139 CNRS-UPMC
>> > Végétaux Marins et Biomolécules (Marine Plants and Biomolecules)
>> > Station Biologique
>> > Place Georges Teissier, BP74
>> >
>> > 29682 Roscoff Cedex, France
>> > Website: http://www.sb-roscoff.fr/UMR7139/en/genetics.html
>> >
>> > ==============================================================
>> >
>> >


