Bioinformatics Analysis of a New Gene Sequence (4)

2019-04-15 19:50

SfcI     C'TryA_G             4   380, 388, 424, 1389
SfoI     GGC'GCC              2   350, 1181
SmlI     C'TyrA_G             1   584
TatI     w'GTAC_w             2   42, 507
TspDTI   ATGAAnnnnnnnnn_nn'   5   411, 732, 802, 934, 949
TspGWI   ACGGAnnnnnnnnn_nn'   1   1288
TspRI    _nnCAsTGnn'          3   839, 1064, 1432
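The recognition patterns in this table use IUPAC degenerate codes (for example `w` = A or T, `r` = A or G, `y` = C or T, `s` = C or G, `n` = any base), with `'` and `_` marking cut positions. The following is a minimal sketch of how such a pattern could be located in a sequence. The function name `find_sites` and the short test sequence are illustrative assumptions, not part of the original analysis. Only the top strand is scanned, and the function reports the start of the recognition site, which may differ from the coordinates the mapping tool prints.

```python
import re

# IUPAC nucleotide codes mapped to regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def find_sites(sequence, pattern):
    """Return 1-based start positions of a recognition pattern on the top strand."""
    # Drop the cut markers (' and _) and normalize case before building the regex.
    core = pattern.replace("'", "").replace("_", "").upper()
    regex = "".join(IUPAC[base] for base in core)
    # A lookahead lets overlapping occurrences be reported as well.
    return [m.start() + 1 for m in re.finditer(f"(?={regex})", sequence.upper())]


# Hypothetical test sequence (not DQ286392).
demo = "AAGGCGCCTTTGTACATCTATAGAA"
print(find_sites(demo, "GGC'GCC"))   # SfoI pattern
print(find_sites(demo, "w'GTAC_w"))  # TatI pattern
print(find_sites(demo, "C'TryA_G"))  # SfcI pattern
```

Non-palindromic patterns such as the TspDTI and TspGWI sites would also need a scan of the reverse complement, which this sketch leaves out.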

Enzymes that do not cut:

_________________________________________________________

AarI, AatII, Acc65I, AclI, AfeI, AflII, AflIII, AgeI, AhdI, AleI, AlwNI, ApaI,
ApaLI, AscI, AseI, AsiSI, AvaI, AvrII, BaeI, BaeI, BamHI, BanII, BbvCI, BciVI,
BglII, BlpI, Bme1580I, BmgBI, BmtI, BplI, BpmI, Bpu10I, BsaI, BsaAI, BsaBI, BsiWI,
BsmI, BspHI, BspMI, BsrFI, BsrGI, BssHII, BssSI, BstAPI, BstBI, BstEII, BstXI,
BstYI, DraI, DraIII, DrdI, Eco57I, EcoICRI, Eco57MI, EcoNI, EcoO109I, EcoRI, EcoRV,
FalI, FseI, FspAI, HgaI, HindIII, KpnI, MfeI, MluI, NaeI, NcoI, NdeI, NgoMIV, NheI,
NotI, NruI, NspI, PacI, PciI, PflMI, PmeI, PmlI, PpiI, PpiI, PpuMI, PsiI, PspOMI,
PsrI, PsrI, PstI, RsrII, SacI, SanDI, SapI, SbfI, ScaI, SexAI, SfiI, SgrAI, SmaI,
SnaBI, SpeI, SphI, SrfI, SspI, StuI, StyI, SwaI, TaqII, TaqII, Tth111I, XbaI, XcmI,
XhoI, XmaI, XmnI, ZraI

Nucleotide Sequence Homology Analysis

BLASTX results for sequence DQ286392 (see Figure 1):

Figure 1. BLASTX results for sequence DQ286392

                                                                   Score    E
Sequences producing significant alignments:                        (Bits)  Value

gi|82659769|gb|ABB88954.1| mannanase [Armillariella tabescens]     768   0.0
gi|7208638|emb|CAB76904.1| CEL4a mannanase [Agaricus bisporus]     532   2e-149
gi|1679597|emb|CAA90423.1| CEL4b mannanase [Agaricus bisporus]     528   3e-148
gi|110627661|gb|ABG79370.1| Man5D [Phanerochaete chrysosporium]    513   1e-143
gi|116508737|gb|EAU91632.1| hypothetical protein CC1G_09314 [...   473   2e-131
gi|110627663|gb|ABG79371.1| Man5C [Phanerochaete chrysosporium]    467   6e-130
gi|119485791|ref|XP_001262238.1| endo-1,4-beta-mannosidase, p...   278   6e-73
gi|121715087|ref|XP_001275153.1| endo-1,4-beta-mannosidase, p...   277   9e-73
gi|70983951|ref|XP_747501.1| endo-1,4-beta-mannosidase [Asper...   272   4e-71
gi|70982592|ref|XP_746824.1| endo-1,4-beta-mannosidase [Asper...   261   7e-68
gi|84621433|gb|ABC59553.1| beta-mannanase [Aspergillus sulphureu   260   2e-67
gi|83775912|dbj|BAE66031.1| unnamed protein product [Aspergillus   258   8e-67
gi|558311|gb|AAA67426.1| mannanase                                 254   7e-66
gi|119488588|ref|XP_001262744.1| endo-1,4-beta-mannosidase [N...   252   3e-65
gi|115402327|ref|XP_001217240.1| hypothetical protein ATEG_08...   250   2e-64

... (remaining hits omitted)
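Each line of the hit list ends with the bit score and the E-value, so the fields can be separated from the right. The following is a minimal sketch of such a parser; the function name `parse_hit` and the dictionary keys are illustrative assumptions, and the sketch assumes the E-value is written in a form Python's `float` accepts, as it is in the lines shown here.

```python
def parse_hit(line):
    """Split one BLAST summary line into identifier, description, bit score and E-value."""
    # The last two whitespace-separated fields are the score and the E-value.
    left, score, evalue = line.strip().rsplit(None, 2)
    # The identifier is everything up to the first space; the rest is the description.
    identifier, _, description = left.partition(" ")
    return {
        "id": identifier,
        "description": description.strip(),
        "bits": float(score),
        "evalue": float(evalue),
    }


hit = parse_hit(
    "gi|7208638|emb|CAB76904.1| CEL4a mannanase [Agaricus bisporus] 532 2e-149"
)
print(hit["id"], hit["bits"], hit["evalue"])
```

Truncated descriptions such as `[Asper...` are passed through unchanged, since the summary line alone does not say what was cut off.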

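The results show that DQ286392 is most similar to β-mannanases from other species. In particular, it shares 64% and 63% identity with the CEL4a and CEL4b β-mannanases of Agaricus bisporus, respectively, and 76% similarity with each. The alignments of DQ286392 against CEL4a and CEL4b are given below: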

gi|7208638|emb|CAB76904.1| CEL4a mannanase [Agaricus bisporus] Length=439

Score = 532 bits (1371), Expect = 2e-149

Identities = 284/442 (64%), Positives = 339/442 (76%), Gaps = 7/442 (1%)
Frame = +2

Query  23    LAFLSLSTFLCSAFAAVPEWGQCGGIGWTGQTTCVSGTVCAALNDYYSQCVPGtatttaa  202
             + F+ L+  +  A A VP WGQCGG GWTG+T C SG+ C   N++YSQC+PG+ T T
Sbjct  5     IRFIILAISISLATADVPVWGQCGGRGWTGETACASGSSCVVQNEWYSQCLPGSTTPTNP  64

Query  203   pttatsttisstsrttatsttasapsstGFVTTSGTEFRLNGAKFTIFGANSYWVGLMGY  382
             P T T++  ++   T+   +T       GFV  SGT F LNG K+T+ G NSYWVGL G
Sbjct  65    PPTTTTSQTTAPPTTSHPVST-------GFVKASGTRFTLNGQKYTVVGGNSYWVGLTGL  117

Query  383   STTDMNKAFADIAATGATVVRTWGFNEVTSPNGIYYQSWSGSTPTINTGSTGLQNFDavv  562
             ST+ MN+AF+DIA  G T VRTWGFNEVTSPNG YYQSWSG+ PTINTG++GL NFD V+
Sbjct  118   STSAMNQAFSDIANAGGTTVRTWGFNEVTSPNGNYYQSWSGARPTINTGASGLLNFDNVI  177

Query  563   aaaaaHGLRLIVAITNNWSDYGGMDVYVNQIVGSGSAHDLFYTDCEVISTYMNYVKTFVS  742
             AAA A+G+RLIVA+TNNW+DYGGMDVYVNQ+VG+G  HDLFYT+  +   + +YV+TFVS
Sbjct  178   AAAKANGIRLIVALTNNWADYGGMDVYVNQMVGNGQPHDLFYTNTAIKDAFKSYVRTFVS  237

Query  743   RYVNEPTILGWELANEPRCKgstgttsgsctattitkwaaaisaYIKSIDPNHLVGIGDE  922
             RY NEPT++ WELANEPRCKGSTGTTSG+CT TT+T WA  +SA+IK+ID NHLV IGDE
Sbjct  238   RYANEPTVMAWELANEPRCKGSTGTTSGTCTTTTVTNWAKEMSAFIKTIDSNHLVAIGDE  297

Query  923   GFYNEPSAPTYPYQGSEGIDFDANLAISSIDFGTFHSYPISWGQTTDPQGWGTQWIADHA  1102
             GFYN+P APTYPYQGSEG+DF+ANLAISS+DF TFHSYP  WGQ  D + WGTQWI DHA
Sbjct  298   GFYNQPGAPTYPYQGSEGVDFEANLAISSVDFATFHSYPEPWGQGADAKAWGTQWITDHA  357

Query  1103  TSMTAAGKPVILEEFGVTTNQATVYGAWYQEVVSSGLTGALIWQAGSYLSSGATPDDGYA  1282
              SM    KPVILEEFGVTTNQ   Y  W+ EV SSGLTG LIWQAGS+LS+G T +DGYA
Sbjct  358   ASMKRVNKPVILEEFGVTTNQPDTYAEWFNEVESSGLTGDLIWQAGSHLSTGDTHNDGYA  417

Query  1283  IYPDDPVYSLETSYAVTLKARA  1348
             +YPD PVY L  S+A  +K RA
Sbjct  418   VYPDGPVYPLMKSHASAMKNRA  439

gi|1679597|emb|CAA90423.1| CEL4b mannanase [Agaricus bisporus] Length=439

Score = 528 bits (1360), Expect = 3e-148

Identities = 280/442 (63%), Positives = 336/442 (76%), Gaps = 7/442 (1%)
Frame = +2

Query  23    LAFLSLSTFLCSAFAAVPEWGQCGGIGWTGQTTCVSGTVCAALNDYYSQCVPGtatttaa  202
             + F+ L+  +  A A VP WGQCGG  WTG+T C SG+ C   N++YSQC+PG+ T T
Sbjct  5     IRFIILAISISLATADVPVWGQCGGRDWTGETACASGSSCVVQNEWYSQCLPGSTTPTNP  64

Query  203   pttatsttisstsrttatsttasapsstGFVTTSGTEFRLNGAKFTIFGANSYWVGLMGY  382
             P   T++  ++   T+   +T       GFV  SGT F LNG K+T+ G NSYWVGL G
Sbjct  65    PPATTTSQTTAPPTTSHPVST-------GFVKASGTRFTLNGQKYTVVGGNSYWVGLTGL  117

Query  383   STTDMNKAFADIAATGATVVRTWGFNEVTSPNGIYYQSWSGSTPTINTGSTGLQNFDavv  562
             ST+ MN+AF+DIA  G T VRTWGFNEVTSPNG YYQSWSG+ PTINTG++GL NFD V+
Sbjct  118   STSAMNQAFSDIANAGGTTVRTWGFNEVTSPNGNYYQSWSGARPTINTGASGLLNFDNVI  177

Query  563   aaaaaHGLRLIVAITNNWSDYGGMDVYVNQIVGSGSAHDLFYTDCEVISTYMNYVKTFVS  742
             AAA A+G+RLIVA+TNNW+DYGGMDVYVNQ+VG+G  HDLFYT+  +   + +Y + FVS
Sbjct  178   AAAKANGIRLIVALTNNWADYGGMDVYVNQMVGNGQPHDLFYTNTAIKDAFKSYGRAFVS  237

Query  743   RYVNEPTILGWELANEPRCKgstgttsgsctattitkwaaaisaYIKSIDPNHLVGIGDE  922
             RY NEPT++ WELANEPRCKGSTGTTSG+CT TT+T WA  +SA+IK+ID NHLV IGDE
Sbjct  238   RYANEPTVMAWELANEPRCKGSTGTTSGTCTTTTVTNWAKEMSAFIKTIDSNHLVAIGDE  297

Query  923   GFYNEPSAPTYPYQGSEGIDFDANLAISSIDFGTFHSYPISWGQTTDPQGWGTQWIADHA  1102
             GFYN+P APTYPYQGSEG+DF+ANLAISS+DF TFHSYP  WGQ  D + WGTQWI DHA
Sbjct  298   GFYNQPGAPTYPYQGSEGVDFEANLAISSVDFATFHSYPEPWGQGADAKAWGTQWITDHA  357

Query  1103  TSMTAAGKPVILEEFGVTTNQATVYGAWYQEVVSSGLTGALIWQAGSYLSSGATPDDGYA  1282
              SM    KPVILEEFGVTTNQ   Y  W+ E+ SSGLTG LIWQAGS+LS+G TP+DGYA
Sbjct  358   ASMKRVNKPVILEEFGVTTNQPDTYAEWFNEIESSGLTGDLIWQAGSHLSTGDTPNDGYA  417

Query  1283  IYPDDPVYSLETSYAVTLKARA  1348
             +YPD PVY L  S+A  +K RA
Sbjct  418   VYPDGPVYPLVKSHASAMKNRA  439
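The identity and gap figures in the alignment headers can be checked directly from the aligned Query and Sbjct strings. The sketch below counts identical columns and gap columns for one pair of aligned rows and converts counts to whole percentages. The function names are illustrative assumptions. Integer truncation is used because the reported 76% for 339/442 (about 76.7%) suggests the percentages are truncated rather than rounded. Positives are not computed, since that would require a substitution matrix.

```python
def alignment_stats(query, subject):
    """Count identities and gap columns for two equal-length aligned strings."""
    if len(query) != len(subject):
        raise ValueError("aligned strings must have the same length")
    # Lowercase (low-complexity masked) query residues still count as identities.
    identities = sum(
        q != "-" and q.upper() == s.upper() for q, s in zip(query, subject)
    )
    gaps = sum(q == "-" or s == "-" for q, s in zip(query, subject))
    return identities, gaps, len(query)


def percent(count, length):
    """Whole-number percentage, truncated toward zero."""
    return count * 100 // length


# Second row of the CEL4a alignment (query positions 203-382).
q = "pttatsttisstsrttatsttasapsstGFVTTSGTEFRLNGAKFTIFGANSYWVGLMGY"
s = "PPTTTTSQTTAPPTTSHPVST-------GFVKASGTRFTLNGQKYTVVGGNSYWVGLTGL"
print(alignment_stats(q, s))
print(percent(284, 442), percent(339, 442), percent(7, 442))
```

Summing the per-row counts over all eight rows of an alignment gives the totals shown in its header line.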

Open Reading Frame (ORF) Analysis

The open reading frames of sequence DQ286392 were analyzed with NCBI ORF Finder; the results are shown in Figure 2:

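Sequence DQ286392 contains a 1338 bp open reading frame at positions 14-1351 that encodes 445 amino acids, with ATG as the start codon and TAG as the stop codon. The coding region is flanked by a 13 bp 5' untranslated region (positions 1-13) and a 100 bp 3' untranslated region (positions 1352-1451). Two polyadenylation signals (AATAAA) lie 88 bp and 38 bp upstream of the poly(A) tail at the 3' end, which further indicates that the obtained fragment includes the full-length 3' untranslated region of the mRNA. The protein sequence was named "MAN".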
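The figures in this paragraph follow from the ORF coordinates by simple arithmetic, which the sketch below reproduces. The function name `orf_summary` and the dictionary keys are illustrative assumptions; the total length of 1451 bp is taken from the end of the 3' untranslated region given above.

```python
def orf_summary(start, end, total_length):
    """Derive ORF length, protein length, frame and UTR sizes from 1-based coordinates."""
    orf_length = end - start + 1
    if orf_length % 3 != 0:
        raise ValueError("ORF length is not a multiple of 3")
    return {
        "orf_bp": orf_length,
        "amino_acids": orf_length // 3 - 1,  # the stop codon encodes no residue
        "frame": (start - 1) % 3 + 1,        # forward reading frame, 1 to 3
        "utr5_bp": start - 1,
        "utr3_bp": total_length - end,
    }


print(orf_summary(14, 1351, 1451))
```

The computed frame of 2 agrees with the `Frame = +2` reported in the BLASTX alignments.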

The open reading frame of "MAN" and the amino acid sequence it encodes are as follows:

14 atgcatctgctcgcttttctgtctctgagtacattcctgtgctct
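As a check on the start of the reading frame, the 45 nucleotides shown here can be translated with the standard genetic code. The sketch below is a minimal translation routine; the names `CODON_TABLE` and `translate` are illustrative assumptions, and unknown codons are rendered as `X`.

```python
# Standard genetic code, built from the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(dna):
    """Translate a DNA string codon by codon, ignoring a trailing partial codon."""
    dna = dna.upper().replace("U", "T")
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, usable, 3))


# First 45 nucleotides of the ORF, starting at position 14.
fragment = "atgcatctgctcgcttttctgtctctgagtacattcctgtgctct"
protein = translate(fragment)
print(protein)
# Position 23 is the first base of the fourth codon, where the BLASTX query rows begin.
print(protein[(23 - 14) // 3:])
```

The residues from the fourth codon onward match the beginning of the Query row that starts at nucleotide 23 in the BLASTX alignments.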

