Hi David,
I received the cleaned ESTs just a couple of days ago.
I will also make this set available on the BLAST page.
regards,
lieven
David Nelson wrote:
Dear Lieven,
How can I Blast ESTs for Ectocarpus?
David
On 10/22/07, David Nelson <drnelson1@gmail.com> wrote:
Dear Lieven,
Thanks for the tip. I had been clicking on the name next to the F. I
thought they were both the same, so I did not try the F. It worked.
David
On 10/22/07, Lieven Sterck <lieven.sterck@psb.ugent.be> wrote:
Hi David,
I could send you the scaffolds.
But you should be able to extract them yourself: on the blast result page,
click on the 'F' to the left of the sequence name (next to the '>' sign). This
will lead you to the actual sequence of the hit.
I will get you an invitation to the annotation site first thing in the
morning.
Don't hesitate to contact me if something is unclear!
regards,
lieven
> Dear Lieven,
>
> There are only 10 contigs with P450 sequences.
> Some are fairly short, weak matches and may be accidental.
>> F sctg_6 CYP51C1
>> F sctg_63 CYP97E3 and CYP97E4
>> F sctg_25 CYP97F-like seq
>> F sctg_362 most like CYP5021A1
>> F sctg_1 bacterial chromosome with 14 P450s, all assembled
>> F sctg_10 C-helix and I-helix parts
>> F sctg_471 2 sequences only partially assembled
>> F sctg_60 2 sequences only have the PERF motif (accidental?)
>> F sctg_193 only the I-helix motif
>> F sctg_81 poor match may be accidental (heme signature region)
>
> The current version of the blast server will not return any DNA sequence to
> me. I have not been given permission to see the genome yet, so could you please
> send me 9 of the contigs above so I can continue to assemble these P450
> genes? I do not need sctg_1, since I have already completed those 14
> bacterial P450s.
>
> If the files are too large, I only need the parts of them shown below:
>
>> F sctg_6 1,170,000-1,200,000
>> F sctg_10 780,000-820,000
>> F sctg_25 950,000-970,000
>> F sctg_60 400,000-460,000
>> F sctg_63 330,000-370,000
>> F sctg_81 170,000-220,000
>> F sctg_193 150,000-200,000
>> F sctg_362 70,000-90,000
>> F sctg_471 30,000-80,000
>
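The region list above could be cut out of a scaffold FASTA file with a short script. This is only a minimal sketch, not the blast server's own extraction method. It assumes the scaffolds are available locally in FASTA format with record IDs matching the names above, that the leading 'F' marker has been dropped from each line, and that the coordinates are 1-based and inclusive. The helper names (`read_fasta`, `parse_region`, `extract_regions`) are hypothetical.

```python
import re

def read_fasta(lines):
    """Parse FASTA text lines into a dict of {record_id: sequence}."""
    records, name, chunks = {}, None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                records[name] = "".join(chunks)
            # The record ID is the first whitespace-delimited word after '>'.
            name, chunks = line[1:].split()[0], []
        else:
            chunks.append(line)
    if name is not None:
        records[name] = "".join(chunks)
    return records

def parse_region(text):
    """Parse a line such as 'sctg_6 1,170,000-1,200,000' into (id, start, end)."""
    match = re.match(r"\s*(\S+)\s+([\d,]+)-([\d,]+)\s*$", text)
    if match is None:
        raise ValueError(f"cannot parse region: {text!r}")
    name, start, end = match.groups()
    return name, int(start.replace(",", "")), int(end.replace(",", ""))

def extract_regions(records, region_lines):
    """Return {label: subsequence}; coordinates are assumed 1-based, inclusive."""
    out = {}
    for text in region_lines:
        name, start, end = parse_region(text)
        out[f"{name}:{start}-{end}"] = records[name][start - 1:end]
    return out

if __name__ == "__main__":
    # Toy data only; the real scaffold file is not part of this thread.
    toy = read_fasta([">sctg_demo example", "ACGTACGTAC", "GGGGTTTTCC"])
    print(extract_regions(toy, ["sctg_demo 3-12"]))
```

Slicing by coordinate keeps each attachment small (20 to 60 kb per region in the list above) instead of sending whole scaffolds.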
> Thanks.
>
> David Nelson
>
>
> On 10/22/07, David Nelson <drnelson1@gmail.com> wrote:
>>
>> Dear All,
>>
>> Can you make the pull-down choices for the expect value in the blast
>> server go up to 100 rather than stopping at 10? This may be helpful for
>> finding distantly related exons.
>>
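For context on why a higher expect cutoff surfaces weaker hits: in the Karlin-Altschul statistics that BLAST uses, the expected number of chance hits with bit score S' in a search space of size m x n is roughly E = m * n * 2^(-S'). The sketch below is a simplified illustration under that formula only. The function names are hypothetical, the lengths in the example are made-up illustrative numbers, and real BLAST applies effective-length corrections that are ignored here.

```python
import math

def expect_value(bit_score, query_len, db_len):
    """Approximate E-value: E = m * n * 2**(-S'), with S' the bit score.

    Real BLAST uses edge-corrected 'effective' lengths, so its reported
    E-values differ somewhat from this simplified estimate.
    """
    return query_len * db_len * 2.0 ** (-bit_score)

def min_bit_score(e_cutoff, query_len, db_len):
    """Smallest bit score still reported at a given E-value cutoff."""
    return math.log2(query_len * db_len / e_cutoff)

if __name__ == "__main__":
    # Illustrative numbers only: a 50-residue query against 2e8 positions.
    m, n = 50, 2e8
    for cutoff in (10, 100):
        print(cutoff, round(min_bit_score(cutoff, m, n), 2))
```

Whatever the search space, moving the cutoff from 10 to 100 lowers the minimum reported bit score by log2(10), about 3.3 bits, which is the margin in which short, diverged exons may sit.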
>> David
>>
>> On 10/22/07, lieven sterck <lieven.sterck@psb.ugent.be> wrote:
>> >
>> > Hi David,
>> >
>> > That's correct! Scaffold sctg_1 is indeed from a bacterial symbiont
>> > found associated with (or within) Ectocarpus.
>> > There are some more scaffolds from this organism; here is the list:
>> > sctg_1
>> > sctg_1222
>> > sctg_1301
>> > sctg_1320
>> > sctg_1403
>> > sctg_1415
>> > sctg_1432
>> > sctg_1769
>> > sctg_1876
>> > sctg_397
>> > sctg_627
>> > sctg_822
>> > sctg_853
>> > sctg_870
>> > sctg_886
>> > sctg_893
>> >
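Anyone working from an older copy of the assembly could drop the scaffolds listed above before searching. This is a minimal sketch under the assumption that the assembly is a FASTA file whose record IDs are exactly these scaffold names. The function name `drop_scaffolds` is hypothetical and this is not how the official assembly was filtered.

```python
# Scaffold names copied from the list above.
BACTERIAL_SCAFFOLDS = {
    "sctg_1", "sctg_1222", "sctg_1301", "sctg_1320",
    "sctg_1403", "sctg_1415", "sctg_1432", "sctg_1769",
    "sctg_1876", "sctg_397", "sctg_627", "sctg_822",
    "sctg_853", "sctg_870", "sctg_886", "sctg_893",
}

def drop_scaffolds(fasta_lines, excluded=BACTERIAL_SCAFFOLDS):
    """Yield FASTA lines, skipping every record whose ID is in `excluded`."""
    keep = True
    for line in fasta_lines:
        if line.startswith(">"):
            fields = line[1:].split()
            # Compare the whole ID, so 'sctg_1' does not also remove 'sctg_10'.
            keep = not fields or fields[0] not in excluded
        if keep:
            yield line

if __name__ == "__main__":
    demo = [">sctg_1 bacterial", "AAAA", ">sctg_10", "CCCC"]
    print(list(drop_scaffolds(demo)))
```

Matching on the full ID rather than a prefix matters here because sctg_10 is one of the scaffolds carrying algal P450 fragments.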
>> > These scaffolds have recently been removed from the official assembly.
>> > GenoScope (responsible for the sequencing) is finishing and analyzing
>> > this bacterial symbiont.
>> >
>> > best regards,
>> > lieven
>> >
>> >
>> > David Nelson wrote:
>> >
>> > Dear all,
>> >
>> > I have begun looking over the P450s in Ectocarpus.
>> > The biggest surprise was sctg_1, which had 14 P450s, but they seem to be
>> > bacterial.
>> >
>> > Here are the 14 P450s with their best matches. I compared them to 3305
>> > bacterial sequences from the Global Ocean Sequencing Project at the J. Craig
>> > Venter Institute. Some were a better match to those seawater sequences than
>> > to named P450s.
>> >
>> > Since the whole contig of 3.4 Mb seems to be bacterial and often like
>> > Caulobacter, I searched GenBank for an rRNA from Caulobacter. I found one
>> > from a
>> >
>> > Caulobacter endosymbiont of Tetranychus urticae
>> > AY753176
>> > ribosomal RNA gene, partial sequence.
>> >
>> > I used this for a blastn search of Ectocarpus.
>> > I found a 90% match on the same contig as the 14 P450s (see below).
>> > There is a Caulobacter-like endosymbiont or contaminant in the algal
>> > genome data.
>> >
>> > I blasted the Ectocarpus bacterial rRNA seq against GenBank and found
>> > a Sargasso Sea bacterioplankton as the best match:
>> >
>> > >F sctg_1
>> > 3017346 aaagatttatcgcccctggatgggcccgcgttggattagctagttggtggggtaatggcc 3017405
>> > 3017406 taccaaggcgacgatccatagctggtctgagaggatgatcagccacactggaactgagac 3017465
>> > 3017466 acggtccagactcctacgggaggcagcagtggggaatattggacaatgggcgcaagcctg 3017525
>> > 3017526 atccagccatgccgcgtgagtgatgaaggccctagggttgtaaaactctttcagtggtga 3017585
>> > 3017586 agataatgacggtaaccacagaagaagctccggctaactccgtgccagcagccgcggtaa 3017645
>> > 3017646 tacggagggagctagcgttgttcggaattactgggcgtaaagcgcacgtaggcggtctat 3017705
>> > 3017706 aaagttgggggtgaaatcccggagctcaactccggaactgcct 3017748
>> >
>> > only 6 nucleotide diffs to
>> > >gb|AY162106.1| Alpha proteobacterium GMD21A06 small subunit ribosomal
>> > RNA gene, partial sequence
>> > Length=1326
>> >
>> > isolation_source="Sargasso Sea bacterioplankton"
>> >
>> > Score = 710 bits (384), Expect = 0.0
>> > Identities = 397/403 (98%), Gaps = 1/403 (0%)
>> > Strand=Plus/Plus
>> >
>> > Query  1    AAAGATTTATCGCCCCTGGATGGGCCCGCGTTGGATTAGCTAGTTGGTGGGGTAATGGCC  60
>> >             ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
>> > Sbjct  174  AAAGATTTATCGCCCCTGGATGGGCCCGCGTTGGATTAGCTAGTTGGTGGGGTAATGGCC  233
>> >
>> > Query  61   TACCAAGGCGACGATCCATAGCTGGTCTGAGAGGATGATCAGCCACACTGGAACTGAGAC  120
>> >             ||||||||||||||||||||||||||||||||||||||||||||||||||| |||| |||
>> > Sbjct  234  TACCAAGGCGACGATCCATAGCTGGTCTGAGAGGATGATCAGCCACACTGGGACTGTGAC  293
>> >
>> > Query  121  ACGGTCCAGACTCCTACGGGAGGCAGCAGTGGGGAATATTGGACAATGGGCGCAAGCCTG  180
>> >             |||| ||||||||||||||||||||||||||||||||||||||||||||||||||| |||
>> > Sbjct  294  ACGGCCCAGACTCCTACGGGAGGCAGCAGTGGGGAATATTGGACAATGGGCGCAAG-CTG  352
>> >
>> > Query  181  ATCCAGCCATGCCGCGTGAGTGATGAAGGCCCTAGGGTTGTAAAACTCTTTCAGTGGTGA  240
>> >             |||||||||||||||||||||||||||||||||||||||||||| |||||||||||||||
>> > Sbjct  353  ATCCAGCCATGCCGCGTGAGTGATGAAGGCCCTAGGGTTGTAAAGCTCTTTCAGTGGTGA  412
>> >
>> > Query  241  AGATAATGACGGTAACCACAGAAGAAGCTCCGGCTAACTCCGTGCCAGCAGCCGCGGTAA  300
>> >             ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
>> > Sbjct  413  AGATAATGACGGTAACCACAGAAGAAGCTCCGGCTAACTCCGTGCCAGCAGCCGCGGTAA  472
>> >
>> > Query  301  TACGGAGGGAGCTAGCGTTGTTCGGAATTACTGGGCGTAAAGCGCACGTAGGCGGTCTAT  360
>> >             ||||||||||||||||||||||||||||||||||||||||||||| ||||||||||||||
>> > Sbjct  473  TACGGAGGGAGCTAGCGTTGTTCGGAATTACTGGGCGTAAAGCGCGCGTAGGCGGTCTAT  532
>> >
>> > Query  361  AAAGTTGGGGGTGAAATCCCGGAGCTCAACTCCGGAACTGCCT  403
>> >             |||||||||||||||||||||||||||||||||||||||||||
>> > Sbjct  533  AAAGTTGGGGGTGAAATCCCGGAGCTCAACTCCGGAACTGCCT  575
>> >
>> >
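The "Identities" and "Gaps" figures in the BLAST header above can be recomputed directly from the aligned rows. The following is a small sketch of that bookkeeping with a hypothetical function name (`alignment_stats`). It counts columns the way the header does: a column with a gap character counts as a gap, identical letters count as an identity, and everything else is a mismatch. The example uses only the Query 121-180 row of the alignment.

```python
def alignment_stats(query, subject, gap="-"):
    """Count identities, mismatches and gaps for two equal-length aligned rows."""
    if len(query) != len(subject):
        raise ValueError("aligned rows must have the same length")
    identities = gaps = 0
    for q, s in zip(query.upper(), subject.upper()):
        if q == gap or s == gap:
            gaps += 1
        elif q == s:
            identities += 1
    length = len(query)
    return {
        "length": length,
        "identities": identities,
        "gaps": gaps,
        "mismatches": length - identities - gaps,
        "percent_identity": 100.0 * identities / length,
    }

if __name__ == "__main__":
    # Query 121-180 and Sbjct 294-352 from the alignment above.
    q = "ACGGTCCAGACTCCTACGGGAGGCAGCAGTGGGGAATATTGGACAATGGGCGCAAGCCTG"
    s = "ACGGCCCAGACTCCTACGGGAGGCAGCAGTGGGGAATATTGGACAATGGGCGCAAG-CTG"
    print(alignment_stats(q, s))
```

This row contributes one mismatch and the single gap. Across all seven rows there are five mismatches plus that gap, which gives the 6 differences and the 397/403 identities reported in the header.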
>> > > F sctg_1 (14 P450s from a bacterial genome in the Ectocarpus data)
>> > Length = 3415905
>> >
>> > >67% to CYP108B1 Caulobacter crescentus AE005918
>> > 158936 FAKLRKDTPLGVAGPQGFEPFWVVTKHKDIVEVERNNEVFHNGDKSTTLVDANTDTA 158766
>> > 158765 VREMMGGSPHLIRSLVQMDNPDHKNYRGITAANFMPQELKALEVQVRKIAKSFVDHMEEL 158586
>> > 158585 GRKNDGRCDFAKDVAFLYPLHVIMELLGVPQSDEPKMLKLTQELFGAADPELNRTGKE 158412
>> > 158411 RDDPKEALAALSGTVAEFVEYFTAVTEDRRKTPRADIASVIANGKVNGEAIGIFEAMGYY 158232
>> > 158231 IIVATAGHDTTSGTTAGTMWELAKDRQKFLQVKNDPKLIPLLVEESIRWVTPVKHFMRSA 158052
>> > 158051 TQDTVLGGQEIKKGDWMMLCYQSGNRDEDVFDDPFEFKVNRQPNRHIAFGHGAHVCLGQH 157872
>> > 157871 LARMEMRALWEELLPRLDSVELDGEPTRMLANFVCGPKSVPIK 157743
>> >
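A note on reading the coordinates in the block above: they run downward (158936 to 157743), which is how a translated search reports a hit on the reverse strand, and each residue spans three nucleotides (158936 - 158766 + 1 = 171 nt for the 57 residues on the first line). The sketch below shows one way to re-translate such a region from a scaffold sequence. It is an illustration only: `translate_region` is a hypothetical helper, coordinates are assumed to be 1-based and inclusive, and the standard codon assignments are used (the JCVI headers elsewhere in this message list translation_table=11, which assigns the same amino acids to codons and differs only in permitted start codons).

```python
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

def reverse_complement(dna):
    """Return the reverse complement of an A/C/G/T string."""
    return dna.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]

def translate_region(scaffold, start, end):
    """Translate scaffold[start..end] (1-based, inclusive).

    If start > end the region is read from the reverse strand, matching
    the descending coordinates in the listing. Codons containing anything
    other than A/C/G/T are shown as 'X'; a trailing partial codon is dropped.
    """
    if start <= end:
        dna = scaffold[start - 1:end].upper()
    else:
        dna = reverse_complement(scaffold[end - 1:start])
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, usable, 3))

if __name__ == "__main__":
    toy = "TTAGGCCAT"  # reverse complement of ATGGCCTAA
    print(translate_region(toy, 9, 1))
```

Passing the coordinates in the same order as they appear in the listing (larger number first for reverse-strand hits) avoids having to track strand separately.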
>> > >59% to CYP153A2 Caulobacter crescentus CB15 GenPept AAK22050
>> > 1458133 LFERLRNEDPVHFFEHEEFGRFWSVTRHADIMSIDTNHQQFSSEPSIFLGNTNSDEDENF 1458312
>> > 1458313 NPATFIAMDPPKHDAQRNAVNPAVAPPALRDLEPLIRQRVSAVLDSLPIGETFNWVDLVS 1458492
>> > 1458493 IEITTQMLATLFDFPFEDRYMLTRWSDMTTANPETLAAMGLTIEDRRNAMYECLEIFG 1458666
>> > 1458667 GLYAERAQLPPANDFISLMAHNEDMKNLDPMNLLGNLVLLIVGGNDTTRNSMSGGVLAL 1458843
>> > 1458844 HENPAEFAKLKADPSIIPNMVSEIIRWQTPLAYMRRTANEDLEFRGKQIKQGDRIMMWYV 1459023
>> > 1459024 SGNRDERAIERPNEFLIDRENARRHLSFGFGIHRCMGNRVGEMQVRILWEEILKRFDRVE 1459203
>> > 1459204 VVGKPARTLSNFVMGFTELPVRL 1459272
>> >
>> > >50% to CYP108B1 Caulobacter crescentus AE005918
>> > 1535639 IFTTLRQDNPLACVEVPGYDPHWMVTKYSDVKEITRQDNLFHSGDRPKILHSQAGEA 1535809
>> > 1535810 LARSFTGGSPNLFQSLVQLDPPEHTAYRNVLQGEFMPGGIAKMKENVAKTAQEFVDQMAS 1535989
>> > 1535990 LAPTCDFADDVAMNYPLQVVLDIVGVPREHHPKMLQLTQWLFSYADPDLKRPGSDIT 1536160
>> > 1536161 DPEEIIKTWNVVFTQFHEFFMPLVEARRANPKEDIASLIANAKINGESMEERKMISYFG 1536337
>> > 1536338 ILATAGHDTTSATTALGMKMLAENPDMLARLKEKPDLIPSFVEECIRWGSPVQHFIRSAT 1536517
>> > 1536518 EDYVLRGQTIRKGDLLYISYLSANRDEEEFDDPFTFKMDRAPNRHVGFGFGGHVCLGQH 1536694
>> > 1536695 LARLEIRTIWQTLLPRLTEVELTGPVKFTESEFVCGPKSVPIR 1536823
>> >
>> > >41% to CYP107AA1 Bradyrhizobium japonicum USDA 110 GenPept BAC51802
>> > 56% to seawater bacterial sequence JCVI_PEP_1096696260773
>> > 2052084 PYPAMNQLREKDPVNETPVGPWRISRHADVVDVFRNAPTSQTLADGSSPNMDDQDRRGS 2052260
>> > 2052261 FRDFMLNMDGPEHARLRRLVLGAFTPKALKHIEGEIDRVVDEAMHTALKQGGMEVVEDFA 2052440
>> > 2052441 LRIPSRMICRIMGLPEEDIDQFTVWTAARTNAFFARFLPEDVVEHTRQAGEQMAD 2052605
>> > 2052606 YFEAQIKLRRANPREDLLTNLIQSEEKGDRLGDVELAIQAIGLLIAGFETTIGLIGNGTK 2052785
>> > 2052786 ALIENPDQAELLKQNPDLAKNTVEECLRYDTPVLFNWRVLTEPYEVGGKTLPENAVLWMM 2052965
>> > 2052966 LGAANHDPRVHDDPDTMDITRQGISHASFGGGAHTCLGNQLARMEASRAFHAFVS 2053130
>> >
>> > >JCVI_PEP_1096696260773 /source_dna_id=JCVI_ORF_1096696260772 /offset=0
>> > /translation_table=11 /length=353 /full_length=353
>> > Length = 353
>> >
>> > Score = 956 (336.5 bits), Expect = 3.3e-98, P = 3.3e-98
>> > Identities = 183/326 (56%), Positives = 227/326 (69%)
>> >
>> > Query: 1
>> >
PYPAMNQLREKDPVNETPVGPWRISRHADVVDVFRNAPTSQTLADGSSPNMDDQDRRGSF 60
>> > PYP + LRE DPVN TPVG WRISR+ DV VF +APTS T
G SPN D D
>> > +GSF
>> > Sbjct: 28
>> >
PYPKLAHLRENDPVNLTPVGTWRISRYEDVKAVFNDAPTSMTDKLGDSPNFDPLDTKGSF 87
>> >
>> > Query: 61
>> >
RDFMLNMDGPEHARLRRLVLGAFTPKALKHIEGEIDRVVDEAMHTALKQGGMEVVEDFAL 120
>> > +F+LN DG H RLR LV +F K ++ +E E+ + V A
A GGM+VV
>> A
>> >
>> > Sbjct: 88
>> >
LEFVLNKDGDAHRRLRMLVQKSFGQKTVRLMEEEVAKTVAAAFDKAQADGGMDVVPALAH 147
>> >
>> > Query: 121
>> >
RIPSRMICRIMGLPEEDIDQFTVWTAARTNAFFARFLPEDVVEHTRQAGEQMADYFEAQI 180
>> > +PSRMIC+IMG+P +D F WTAARTNAFFA+FLP DV E
TR AG M DYF
>> A
>> > I
>> > Sbjct: 148
>> >
EVPSRMICQIMGVPMQDRQIFNEWTAARTNAFFAKFLPPDVQERTRNAGAAMEDYFRALI 207
>> >
>> > Query: 181
>> >
KLRRANPREDLLTNLIQSEEKGDRLGDVELAIQAIGLLIAGFETTIGLIGNGTKALIENP 240
>> > R+ + +DLL+++I + E GD+ D EL
IQAIG+++AG+ETTIGL+GNGT+A
>> > +E+P
>> > Sbjct: 208
>> >
AERKRDLGDDLLSSMIMASEGGDKFTDDELIIQAIGVIVAGYETTIGLLGNGTRAFVEHP 267
>> >
>> > Query: 241
>> >
DQAELLKQNPDLAKNTVEECLRYDTPVLFNWRVLTEPYEVGGKTLPENAVLWMMLGAANH 300
>> > DQ L+ NP+L N +ECLRYDTP+LFNWRVL EPYE+ G
TLP AV+W
>> +LGAAN
>> >
>> > Sbjct: 268
>> >
DQLAKLRNNPELVSNATDECLRYDTPILFNWRVLEEPYELSGVTLPAEAVIWQLLGAANR 327
>> >
>> > Query: 301 DPRVHDDPDTMDITRQGISHASFGGG 326
>> > DP DPD DI R+ ++H SFGGG
>> > Sbjct: 328 DPARFADPDQFDIEREDVAHQSFGGG 353
>> >
>> > >37% to CYP107L2 SAV1987 AP005029 Streptomyces avermitilis
>> > 2103582 DLHSYAFESNPEPTLAWLREHDPVHWSQHGYWFVTRYEDVRAVLGDPARFSSQKAGFGA 2103406
>> > 2103405 NNPIGKDAKGPEGKSGKKASDAEKTMSKGLALSFNQQDPPDHSRVRKLVNQAFSRREISE 2103226
>> > 2103225 RADKIQAVVDALMADVKAKGEFDLITDFAFHLPIIVASDIIGIPAEDRDLFRRNFELAA 2103049
>> > 2103048 RLMAPKRSDEEWAEALTGAKWQSTYMGELIASRAREPRADLISALIQTSEDDQKLT 2102881
>> > 2102880 GGEVASAIMTIFTAAGTTTERMISSGAFLLLTHPEQLAALRADHSLMDNVLEEILRFHHP 2102701
>> > 2102700 NQSTSTNRRATQDVELGGKTIRAGDTVRVSLGSANRDAAQFDEPDAFNIQRTGTKHMSFG 2102521
>> > 2102520 FGIHFCLGSALARYETKAALEALL 2102449
>> >
>> > >57% to CYP153A2 Caulobacter crescentus at C-term
>> > 70% to seawater bacterial sequence JCVI_PEP_1096681995831
>> > 2416172 MTTANQTSPNGAIDVNDIPLAELDVSQPHLFKNDTWRPWFARLRAEAPVHYLADSENG 2416345
>> > 2416346 PFWSVTSHDMTKAVDANHKVFSSEEGGIAIVDPQPLDGEQLMRDPSFISMDEPKHATQRK 2416525
>> > 2416526 AVSPAVAPKNLAELEPLIRERAADILDNLPVGETFNWVDRVSVELTARMLATLFDFPYER 2416705
>> > 2416706 RRDLIRWSDVATAVPKVTGEANDMGARRDALIECATTFYQLWQERAAQPPKFDFVSM 2416876
>> > 2416877 LAHGEATKHLSEDPLLMLGNIILLIVGGNDTTRNSISGGVVALNQYPEEYQKLRDTPAL 2417053
>> > 2417054 IPNMVAETVRWQTPVIHMRRTALEDVELGGKTIRKGDKVVMWYLSGNRDEAVFPDADRLI 2417233
>> > 2417234 IDRPNARQHVSFGFGVHRCMGNRLAEMQLRVLWEEIMKRFHTVEVVGEVERLSNNFI 2417404
>> > 2417405 RGIASVPVRL 2417434
>> >
>> > >JCVI_PEP_1096681995831 /source_dna_id=JCVI_ORF_1096681995830 /offset=0
>> > /translation_table=11 /length=418 /full_length=418
>> > Length = 418
>> >
>> > Score = 1567 (551.6 bits), Expect = 5.9e-163, P = 5.9e-163
>> > Identities = 290/409 (70%), Positives = 346/409 (84%)
>> >
>> > Query: 13
>> >
IDVNDIPLAELDVSQPHLFKNDTWRPWFARLRAEAPVHYLADSENGPFWSVTSHDMTKAV 72
>> > ID + PL ELDVS P ++NDTWRP FARLR EAPVHYL+DS
NGPFWSVTSH +
>> K
>> > V
>> > Sbjct: 7
>> >
IDNSSGPLRELDVSLPEHYENDTWRPMFARLRKEAPVHYLSDSVNGPFWSVTSHALIKEV 66
>> >
>> > Query: 73
>> >
DANHKVFSSEEGGIAIVDPQPLDGEQLMRDPSFISMDEPKHATQRKAVSPAVAPKNLAEL 132
>> > DAN+ +FSSE+GGI+IVD +P++G+ ++ +FI+MDEP+H+
QR AV+P+VAPKNL
>> > EL
>> > Sbjct: 67
>> >
DANNSIFSSEKGGISIVDLKPVEGQ--VQGKNFIAMDEPEHSIQRSAVAPSVAPKNLVEL 124
>> >
>> > Query: 133
>> >
EPLIRERAADILDNLPVGETFNWVDRVSVELTARMLATLFDFPYERRRDLIRWSDVATAV 192
>> > EPLIRERA DIL+NLPVGETFNWV VS+ELTARML T+
DFPY++R
>> L++WSD+AT
>> > V
>> > Sbjct: 125
>> >
EPLIRERAVDILENLPVGETFNWVQEVSIELTARMLTTILDFPYDQRHKLVQWSDLATDV 184
>> >
>> > Query: 193
>> >
PKVTG-EANDMGARRDALIECATTFYQLWQERAAQPPKFDFVSMLAHGEATKHLSEDPLL 251
>> > P+VTG E DM AR D L+ CA FYQLW ++ QPP FD
+SML + T ++ED
>> > L
>> > Sbjct: 185
>> >
PQVTGKEGTDMQARYDELMNCAAAFYQLWVSKSGQPPSFDLISMLQNNPDTARMNEDMEL 244
>> >
>> > Query: 252
>> >
MLGNIILLIVGGNDTTRNSISGGVVALNQYPEEYQKLRDTPALIPNMVAETVRWQTPVIH 311
>> > LGN++LLIVGGNDTTRNSISGGV+ALNQYP+EYQKLRD
PALIPNMV+E
>> > +RWQTPVIH
>> > Sbjct: 245
>> >
FLGNMLLLIVGGNDTTRNSISGGVMALNQYPDEYQKLRDNPALIPNMVSEIIRWQTPVIH 304
>> >
>> > Query: 312
>> >
MRRTALEDVELGGKTIRKGDKVVMWYLSGNRDEAVFPDADRLIIDRPNARQHVSFGFGVH 371
>> > MRRTALED ELGG+ I+KG+KV+MWYLSGNRDE+VF D
DRLIIDRPNAR
>> > HV+FGFGVH
>> > Sbjct: 305
>> >
MRRTALEDYELGGQHIKKGEKVIMWYLSGNRDESVFEDPDRLIIDRPNARSHVAFGFGVH 364
>> >
>> > Query: 372
RCMGNRLAEMQLRVLWEEIMKRFHTVEVVGEVERLSNNFIRGIASVPVRL 421
>> > RCMGNR+AE+QLRVLWEEIM+RFHT+EVVG++ RL
NNFIRGI VPVR+
>> > Sbjct: 365
RCMGNRMAELQLRVLWEEIMERFHTIEVVGDITRLPNNFIRGIKEVPVRV 414
>> >
>> > >58% to CYP153B4 Rhodopseudomonas palustris NZ_AAAF01000001 gene = Rpal2887
>> > 2455381 FPIFEKMRAEEPVHYCAESTYGPYWSVTRYEDIMAVDTNHQVYSSEADFGGIVID 2455217
>> > 2455216 DRIAIDPETNYKSASFISMDQPKHDDQRKSVNGITNPNNLQYFGDIIRTRTVNMLDSLPV 2455037
>> > 2455036 GEEFDWVPTVSIELTTQMLATLFDFPFEDRHKLTRWSDVITAEPESDIVENQE 2454878
>> > 2454877 ARVAELNEMAEYFVELQKGRINKPDSIDLLTMMTHSPAMAKMPPEEFMGNLALLIVGGN 2454701
>> > 2454700 DTTRNSMSGSIFGMHLFPDEFKKMVDDPSLTDNAVAEIIRWQTPLSHMRRTALQDAVL 2454527
>> > 2454526 GGKQIRKGDKVVMWYASGNRDTSIFDDPDKIIIDRKNARRHLSFGFGIHRCMGNRIGELQ 2454347
>> > 2454346 LRILWEEILKRFSRVEVTGEPVLTHSNFVKGYASLPVKL 2454230
>> >
>> > >59% to CYP153A2 Caulobacter crescentus CB15 GenPept AAK22050
>> > 2456653 FPIFEQMRQEDPVHYCAESTYGPYWSVTRYEDIMAVDTNHHVYSSDAHLGGIIID 2456489
>> > 2456488 DGIQNDPENDFKAVNFIAMDKPKHDEQRKSVNGITNPNNLQHFGEIIRKRTSNLLDSLPV 2456309
>> > 2456308 GEEFDWVSTVSIELTTQMLATMFDFPFEDRHKLTRWSDVSTAEPGSGIVETQQQRI 2456141
>> > 2456140 DELMEMAAYFSDLQQSRKDKPDNIDLLTMMTHSPAMANMPPEEFLGNLSLLIVGGNDTT 2455964
>> > 2455963 RNSMTGGVFGFSLFPEQWDKMVADPTLIDNAVAEIIRWQTPLAHMRRTALEDAILGG 2455793
>> > 2455792 KQIRKGDKVVMWYASGNRDTSIFDDPDKIIIDRKNARRHLSFGFGIHRCMGNRIGELQLR 2455613
>> > 2455612 ILWEEVLKRFSRIEVTGEPELTNSNFVKGYTSLPVKL 2455502
>> >
>> > >68% to CYP153A2 Caulobacter crescentus CB15 GenPept AAK22050
>> > 2728456 PYFERLRKEAPVHKAYSPDFGEYWSVTRYEDIMAVDTNHHVFSSSWEHGGITLFDQISDF 2728635
>> > 2728636 QLPMFIAMDPPKHDQQRITVQPIVAPNNLKNWEGLIRERTGQILDSLPRGEVFDWVDNVS 2728815
>> > 2728816 VELTTMMLATLFDFPFEQRRKLTRWSDVATGRNNPEIVADDDQWRAELLECLEAFT 2728983
>> > 2728984 DIWNERINSDTPGNDLITMLTRGESTKNMDPMEYLGNIILLIVGGNDTTRNSMTASVYA 2729160
>> > 2729161 LNKFAGEYDKLLAKPDLIPNLSSEIIRWQTPLAHMRRTALEDIVLNGAHIKKGDKVAMWY 2729340
>> > 2729341 VSGNRDESVFEDADKVIIDRPNARRQMSFGYGIHRCVGNRLGELQIKILWEEILKRFPKI 2729520
>> > 2729521 EVMEEPTRTKSVFVKGYTYMPVRI 2729592
>> >
>> > >71% to CYP153A2 Caulobacter crescentus CB15 GenPept AAK22050
>> > 2729740 MPLEDINVADGALFQDDAIWPYFERLRKEAPVHKGHSDEFGDYWSVTRYEDIMAVDTNHH 2729919
>> > 2729920 VFSSEGAITLADPLEDFRAPMFIAMDPPKHDKQRITVQPIVAPKNLQNWEGLIRERTGLI 2730099
>> > 2730100 LDQLPRNETFDWVDKVSIELTTMMLATLFDFPFEERRRLTRWSDVATGRDNPEIYK 2730267
>> > 2730268 SEEQWRGELMECLEAFTGLWNDRVNSDTPGNDLISMLASGESTKNMDPMEYLGNIILLI 2730444
>> > 2730445 VGGNDTTRNSMTGSVYALNKFAGEYDKLIADPSLIPNLSSEIIRWQTPLAHMRRTALEDI 2730624
>> > 2730625 ELNGQMIKKGDKVAMWYVSGNRDTAVFENADDVIIDRPNARRQMSFGYGIHRCVGNRLGE 2730804
>> > 2730805 LQIKILWEELLKRFPKIEVMEEPTRTRSPFVKGYTYMPVRI 2730927
>> >
>> > >68% to CYP153A2 Caulobacter crescentus CB15 GenPept AAK22050
>> > 2731076 MPLNEINPARRDLFQNDVIWPYFERLRKEAPVHKAYDEDFGEYWSVTRYEDIMAVDTNHH 2731255
>> > 2731256 VFSSDWTNGGITLFDAAEDFRLPMFIAMDPPKHDQQRITVQPIVAPNNLKNWEGLIRERT 2731435
>> > 2731436 AYVLDSLPRGETFDWVDNVSIELTTMMLATLFDFPFEERRKLTFWSDMVTTDPKT 2731600
>> > 2731601 LEGGVEEKRGHLLACLEYFTGLWNERINSDTPGNDLITMLTRGESTKNMDPMEYLGNI 2731774
>> > 2731775 ILLIVGGNDTTRNSMTASVYGLNKFPGEYDKLIADPSLIPNLSSEIIRWQTPLAHMRRTA 2731954
>> > 2731955 LEDIELNGTMIKKGDKVAMWYVSGNRDADVFENADDIIIDRPNARRQMSFGYGIHRCVGN 2732134
>> > 2732135 RLGELQIKILWEEILKRFPKIELMEEPTRTPGCFVKGYTYMPVRI 2732269
>> >
>> > >68% to CYP153A2 Caulobacter crescentus CB15 GenPept AAK22050
>> > 2732420 MPLNEFNPAQRDLFQNDVIWPYFERLREEAPVHKCFDEEFGEYWSVSSYEHIMAVDTNHQ 2732599
>> > 2732600 VFSSSWEHGGITLFDGPEDFQLPMFIAMDQPKHDEQRKTVQPIVAPNNLKSWEPLIRERT 2732779
>> > 2732780 GMVLDSLPRGETFDWVDNVSIELTTMMLATLFDFPFEDRRKLTFWSDMVTTDAN 2732941
>> > 2732942 TLEGGEEEWKGHLLECLAYFTELWNQRINSDKPGNDLITMLTRGEATKNMDPMEYLGN 2733115
>> > 2733116 IILLIVGGNDTTRNSMTGSVYALNKFPTEYDKLIADPGLIPNLSSEIIRWQTPLAHMRRT 2733295
>> > 2733296 ALEDFELGGKMIKKGDKVAMWYVSGNRDKTVFENADDVIIDRANARRQMSFGYGIHRCVG 2733475
>> > 2733476 NRLGELQIMILWEEILKRFPKIELMAEPTRSPGCFVKGYTYMPVRI 2733613
>> >
>> > >33% to CYP107L1 AF087022 Streptomyces venezuelae cytochrome P. = pikC gene
>> > AF079139, 50% to seawater bacterial sequence JCVI_PEP_1096680833653
>> > 302400 PYPLYTRLRPHAPVQGYRDYPPGTVPGEDEPVNAWVLLDYDQVSKAARDHRTFSSRDPLQ 302221
>> > 302220 EGSSAPTLMLVNHDNPEHDRLRNIVNLAFSRKRIEELSPYVSKMVHTLLDEVESASGGKV 302041
>> > 302040 EAMSDICAALPARVMVHLLGLPNEIAAKFRHWGTAFMLSADLTPEERQTSNV 301885
>> > 301884 ELYTYFVEQVTAMDEALAAGKDVPDSLMRALLTAEADGEKLTRDEVIRFCLTLVVAGAE 301708
>> > 301707 TTTFLLGNLLHHLATMPEMTERLRANRDDIEGFMNESLRHSGPPQRLFRIAEADVEVGGQ 301528
>> > 301527 QIRKGDWVALFFAAANHDPAMFPDPEKFDIDRTNLNKQLTFGVGVHHCLGSALAKAEARE 301348
>> > 301347 LMNALL 301330
>> >
>> > >JCVI_PEP_1096680833653 /source_dna_id=JCVI_ORF_1096680833652 /offset=0
>> > /translation_table=11 /length=250 /full_length=250
>> > Length = 250
>> >
>> > Score = 586 ( 206.3 bits), Expect = 5.3e-59, P = 5.3e-59
>> > Identities = 118/234 (50%), Positives = 160/234 (68%)
>> >
>> > Query: 1
>> >
AFPYPLYTRLRPHAPVQGYRDYPPGTVPGEDEPVNAWVLLDYDQVSKAARDHRTFSSRDP 60
>> > A PYP Y +LR +PV GY D PPGTVPG+DEP +W +L
+ V +AARD +TFSS
>> > DP
>> > Sbjct: 20
>> >
ANPYPYYDQLRAASPVHGYVDLPPGTVPGQDEPKISWAVLRHADVVEAARDAQTFSSADP 79
>> >
>> > Query: 61
>> >
LQEGSSAPTLMLVNHDNPEHDRLRNIVNLAFSRKRIEELSPYVSKMVHTLLDEVESASGG 120
>> > LQ S+APTLMLVN D P H +LR I + AF+ +RI
E P+V+++ +L +
>> > GG
>> > Sbjct: 80
>> >
LQAESTAPTLMLVNDDPPRHSKLRAIAHKAFTPRRILEKGPWVAQVAAEIL----APCGG 135
>> >
>> > Query: 121
>> >
KV-EAMSDICAALPARVMVHLLGLPNEIAAKFRHWGTAFMLSADLTPEERQTSNVELYTY 179
>> > + + M+D+ LP RVM ++G+ + A +FR+W
TAFMLSADLTP R+ SN E+
>> > +
>> > Sbjct: 136
>> >
RCFDFMTDVAPVLPTRVMAKVIGVDDAQAPRFRNWATAFMLSADLTPAAREASNREVAAF 195
>> >
>> > Query: 180
FVEQVTAMDEALAAGKDVPDSLMRALLTAEADGEKLTRDEVIRFCLTLVVAGAET
>> 234
>> > FV+ V + G D PD L+ AL+ ++DG++LTRDEV
RFC+TL+VAGAET
>> > Sbjct: 196
FVDHVNRRYALIERGGDPPDDLVTALILEDSDGQRLTRDEVTRFCITLLVAGAET
>> 250
>> >
>> > >48% to CYP191A1 Caulobacter crescentus CB15 GenPept AAK22930
>> > 75% to seawater bacterial sequence JCVI_PEP_1096682145269
>> > 3343360 PHEFYKTMRESAPVMWSDIRKGGDGFWSVSRYDDLKAVELAPTVFSSERGSINLGVAPKD 3343181
>> > 3343180 KWKPEKLVSAALNALINLDAPRHMEMRIQQMDFFAPAYVATLRDKVSAKIDSLLDDMESK 3343001
>> > 3343000 GPVVDMVPVFSEQLPLFTLCEMLGVDEEDRPKIAHWMHYLELASQYLTNPWQVIIKEPLF 3342821
>> > 3342820 PFRFFKAVKDMFAYGEAIMADRRANPREDLLTAIAKTKLSDEELPQEFL 3342674
>> > 3342673 DGSWLLIIFAGNDTSRNSLSGTIRLMTEFPDQRQMVLDDPSLIPRMSQEALRMISPVRHM 3342494
>> > 3342493 RRTAVEDTEINGQRIAKDEKVVLWYGAANRDPSMFPDPDRFDMMRDSVDKHLAFGHGVHK 3342314
>> > 3342313 CLGSRIAQMQ 3342284
>> >
>> > >JCVI_PEP_1096682145269 /source_dna_id=JCVI_ORF_1096682145268 /offset=0
>> > /translation_table=11 /length=367 /full_length=367
>> > Length = 367
>> >
>> > Score = 1279 (450.2 bits), Expect = 1.9e-132, P = 1.9e-132
>> > Identities = 236/312 (75%), Positives = 273/312 (87%)
>> >
>> > Query: 48
>> >
ERGSINLGVAPKDKWKPEKLVSAALNALINLDAPRHMEMRIQQMDFFAPAYVATLRDKVS 107
>> > +RGSIN+ V + WKPEKL AA N+LINLDAP HMEMR+QQ
+FF PAY+
>> TLRDKV
>> >
>> > Sbjct: 1
>> >
QRGSINMMVGDRKLWKPEKLAPAAFNSLINLDAPAHMEMRMQQSEFFFPAYIETLRDKVE 60
>> >
>> > Query: 108
>> >
AKIDSLLDDMESKGPVVDMVPVFSEQLPLFTLCEMLGVDEEDRPKIAHWMHYLELASQYL 167
>> > AKID++LD++E +GPVVD
+FSE+LPLFTLCEMLG+DEEDRP+I WMH+LELA
>> > Q+L
>> > Sbjct: 61
>> >
AKIDAMLDELERQGPVVDFAKLFSEELPLFTLCEMLGIDEEDRPRIKLWMHHLELAGQFL 120
>> >
>> > Query: 168
>> >
TNPWQVIIKEPLFPFRFFKAVKDMFAYGEAIMADRRANPREDLLTAIAKTKLSDEELPQE 227
>> > NPWQ + EP+FPFRF K V++MFA+GE IM
DRRANPR+DLLT IA++KL E
>> > LPQE
>> > Sbjct: 121
>> >
ANPWQTFLSEPMFPFRFNKVVQEMFAFGERIMKDRRANPRDDLLTVIAQSKLEGELLPQE 180
>> >
>> > Query: 228
>> >
FLDGSWLLIIFAGNDTSRNSLSGTIRLMTEFPDQRQMVLDDPSLIPRMSQEALRMISPVR 287
>> > +LDGSWLLIIFAGNDTSRNSLSGTIRLMTEFP QR
>> +VLDDPSLIP+MS+EALRM+SPV
>> >
>> > Sbjct: 181
>> >
YLDGSWLLIIFAGNDTSRNSLSGTIRLMTEFPTQRTLVLDDPSLIPQMSEEALRMVSPVI 240
>> >
>> > Query: 288
>> >
HMRRTAVEDTEINGQRIAKDEKVVLWYGAANRDPSMFPDPDRFDMMRDSVDKHLAFGHGV 347
>> > HMRRTAVEDTEINGQ IAKDEKVVLWYGAANRDP +FPDPD F++
>> > RD+V+KHLAFGHGV
>> > Sbjct: 241
>> >
HMRRTAVEDTEINGQPIAKDEKVVLWYGAANRDPDIFPDPDTFNLHRDNVEKHLAFGHGV 300
>> >
>> > Query: 348 HKCLGSRIAQMQ 359
>> > HKCLGSRIA+MQ
>> > Sbjct: 301 HKCLGSRIAKMQ 312
>> >
>> >
>> > --
>> > David R. Nelson
>> > Associate Professor
>> > Dept. of Molecular Sciences
>> > 858 Madison Ave. Suite G01
>> > University of Tennessee
>> > Memphis TN 38163
>> > (901) 448-8303 phone
>> > (901) 448-7360 fax
>> > dnelson@utmem.edu
>> >
>> >
>> > --
>> > ==============================================================
>> > Lieven Sterck Predoctoral fellow
>> > Tel:+32 (0)9 3313821 Fax:+32 (0)9 3313809
>> > VIB Department of Plant Systems Biology, UGent
>> > Bioinformatics and Evolutionary Genomics Division
>> > Technologiepark 927, B-9052 Gent, Belgium
>> > Email: lieven.sterck@psb.ugent.be
>> > Website: http://bioinformatics.psb.ugent.be
>> > --------------------------------------------------------------
>> > Algal Genetics Group
>> > UMR 7139 CNRS-UPMC
>> > Végétaux Marins et Biomolécules (Marine Plants and Biomolecules)
>> > Station Biologique
>> > Place Georges Teissier, BP74
>> > 29682 Roscoff Cedex, France
>> > Website: http://www.sb-roscoff.fr/UMR7139/en/genetics.html
>> > ==============================================================
>> >
>> >