SfcI     C'TryA_G             4   380, 388, 424, 1389
SfoI     GGC'GCC              2   350, 1181
SmlI     C'TyrA_G             1   584
TatI     w'GTAC_w             2   42, 507
TspDTI   ATGAAnnnnnnnnn_nn'   5   411, 732, 802, 934, 949
TspGWI   ACGGAnnnnnnnnn_nn'   1   1288
TspRI    _nnCAsTGnn'          3   839, 1064, 1432
Enzymes that do not cut:
_________________________________________________________
AarI, AatII, Acc65I, AclI, AfeI, AflII, AflIII, AgeI, AhdI, AleI, AlwNI, ApaI,
ApaLI, AscI, AseI, AsiSI, AvaI, AvrII, BaeI, BaeI, BamHI, BanII, BbvCI, BciVI,
BglII, BlpI, Bme1580I, BmgBI, BmtI, BplI, BpmI, Bpu10I, BsaI, BsaAI, BsaBI, BsiWI,
BsmI, BspHI, BspMI, BsrFI, BsrGI, BssHII, BssSI, BstAPI, BstBI, BstEII, BstXI,
BstYI, DraI, DraIII, DrdI, Eco57I, EcoICRI, Eco57MI, EcoNI, EcoO109I, EcoRI, EcoRV,
FalI, FseI, FspAI, HgaI, HindIII, KpnI, MfeI, MluI, NaeI, NcoI, NdeI, NgoMIV, NheI,
NotI, NruI, NspI, PacI, PciI, PflMI, PmeI, PmlI, PpiI, PpiI, PpuMI, PsiI, PspOMI,
PsrI, PsrI, PstI, RsrII, SacI, SanDI, SapI, SbfI, ScaI, SexAI, SfiI, SgrAI, SmaI,
SnaBI, SpeI, SphI, SrfI, SspI, StuI, StyI, SwaI, TaqII, TaqII, Tth111I, XbaI, XcmI,
XhoI, XmaI, XmnI, ZraI
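The recognition sites in the listing use lowercase IUPAC ambiguity codes (for example r = A/G, y = C/T, n = any base). As a minimal sketch of how such a site can be matched against a sequence, the Python below converts a site string to a regular expression and reports where the top strand would be cut. It assumes that `'` marks the top-strand cut and `_` the bottom-strand cut, searches the forward strand only, and reports the 1-based position of the last base before the cut. These conventions and the names `site_to_regex` and `find_cuts` are illustrative assumptions, not the program that produced the table, and the test sequences are made up.

```python
import re

# IUPAC nucleotide ambiguity codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}

def site_to_regex(site):
    """Convert a site such as "C'TryA_G" to (regex, top-strand cut offset)."""
    bases = site.replace("'", "").replace("_", "").upper()
    # Assumption: ' marks the top-strand cut; count the bases to its left.
    cut_offset = len(site.split("'")[0].replace("_", ""))
    pattern = "".join(IUPAC[b] for b in bases)
    return pattern, cut_offset

def find_cuts(seq, site):
    """Return 1-based positions of the last base before each top-strand cut.

    Only the forward strand is searched, so for non-palindromic sites the
    reverse complement would have to be searched separately.
    """
    pattern, offset = site_to_regex(site)
    # A lookahead lets overlapping occurrences of the site all be reported.
    hits = re.finditer("(?=%s)" % pattern, seq.upper())
    return [m.start() + offset for m in hits]

# Made-up sequences, used only to exercise the functions.
print(find_cuts("AAGGCGCCTT", "GGC'GCC"))   # SfoI-style site
print(find_cuts("TTCTGTAGAA", "C'TryA_G"))  # SfcI-style site (CTRYAG)
print(find_cuts("TTCTGTAGAA", "C'TyrA_G"))  # SmlI-style site (CTYRAG)
```

The last two calls show why the order of the ambiguity codes matters: CTGTAG fits CTRYAG but not CTYRAG, so the same stretch is a candidate SfcI site and not a SmlI site.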
Nucleotide homology analysis
BLASTX results for the DQ286392 sequence (see Figure 1):
Figure 1. BLASTX results for the DQ286392 sequence
                                                                 Score    E
Sequences producing significant alignments:                     (Bits)  Value

gi|82659769|gb|ABB88954.1| mannanase [Armillariella tabescens]    768   0.0
gi|7208638|emb|CAB76904.1| CEL4a mannanase [Agaricus bisporus]    532   2e-149
gi|1679597|emb|CAA90423.1| CEL4b mannanase [Agaricus bisporus]    528   3e-148
gi|110627661|gb|ABG79370.1| Man5D [Phanerochaete chrysosporium]   513   1e-143
gi|116508737|gb|EAU91632.1| hypothetical protein CC1G_09314 [...  473   2e-131
gi|110627663|gb|ABG79371.1| Man5C [Phanerochaete chrysosporium]   467   6e-130
gi|119485791|ref|XP_001262238.1| endo-1,4-beta-mannosidase, p...  278   6e-73
gi|121715087|ref|XP_001275153.1| endo-1,4-beta-mannosidase, p...  277   9e-73
gi|70983951|ref|XP_747501.1| endo-1,4-beta-mannosidase [Asper...  272   4e-71
gi|70982592|ref|XP_746824.1| endo-1,4-beta-mannosidase [Asper...  261   7e-68
gi|84621433|gb|ABC59553.1| beta-mannanase [Aspergillus sulphureu  260   2e-67
gi|83775912|dbj|BAE66031.1| unnamed protein product [Aspergillus  258   8e-67
gi|558311|gb|AAA67426.1| mannanase                                254   7e-66
gi|119488588|ref|XP_001262744.1| endo-1,4-beta-mannosidase [N...  252   3e-65
gi|115402327|ref|XP_001217240.1| hypothetical protein ATEG_08...  250   2e-64
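... (remainder omitted)
The results show that DQ286392 is most similar to β-mannanases from other species. In particular, it shares 64% and 63% identity with the CEL4a and CEL4b β-mannanases of Agaricus bisporus, respectively, and 76% similarity (positives) with both. The alignments of DQ286392 with CEL4a and CEL4b are shown below: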
gi|7208638|emb|CAB76904.1| CEL4a mannanase [Agaricus bisporus]
Length=439

Score = 532 bits (1371), Expect = 2e-149
Identities = 284/442 (64%), Positives = 339/442 (76%), Gaps = 7/442 (1%)
Frame = +2

Query  23    LAFLSLSTFLCSAFAAVPEWGQCGGIGWTGQTTCVSGTVCAALNDYYSQCVPGtatttaa  202
             + F+ L+  +  A A VP WGQCGG GWTG+T C SG+ C   N++YSQC+PG+ T T
Sbjct  5     IRFIILAISISLATADVPVWGQCGGRGWTGETACASGSSCVVQNEWYSQCLPGSTTPTNP  64

Query  203   pttatsttisstsrttatsttasapsstGFVTTSGTEFRLNGAKFTIFGANSYWVGLMGY  382
             P T T++  ++   T+   +T       GFV  SGT F LNG K+T+ G NSYWVGL G
Sbjct  65    PPTTTTSQTTAPPTTSHPVST-------GFVKASGTRFTLNGQKYTVVGGNSYWVGLTGL  117

Query  383   STTDMNKAFADIAATGATVVRTWGFNEVTSPNGIYYQSWSGSTPTINTGSTGLQNFDavv  562
             ST+ MN+AF+DIA  G T VRTWGFNEVTSPNG YYQSWSG+ PTINTG++GL NFD V+
Sbjct  118   STSAMNQAFSDIANAGGTTVRTWGFNEVTSPNGNYYQSWSGARPTINTGASGLLNFDNVI  177

Query  563   aaaaaHGLRLIVAITNNWSDYGGMDVYVNQIVGSGSAHDLFYTDCEVISTYMNYVKTFVS  742
             AAA A+G+RLIVA+TNNW+DYGGMDVYVNQ+VG+G  HDLFYT+  +   + +YV+TFVS
Sbjct  178   AAAKANGIRLIVALTNNWADYGGMDVYVNQMVGNGQPHDLFYTNTAIKDAFKSYVRTFVS  237

Query  743   RYVNEPTILGWELANEPRCKgstgttsgsctattitkwaaaisaYIKSIDPNHLVGIGDE  922
             RY NEPT++ WELANEPRCKGSTGTTSG+CT TT+T WA  +SA+IK+ID NHLV IGDE
Sbjct  238   RYANEPTVMAWELANEPRCKGSTGTTSGTCTTTTVTNWAKEMSAFIKTIDSNHLVAIGDE  297

Query  923   GFYNEPSAPTYPYQGSEGIDFDANLAISSIDFGTFHSYPISWGQTTDPQGWGTQWIADHA  1102
             GFYN+P APTYPYQGSEG+DF+ANLAISS+DF TFHSYP  WGQ  D + WGTQWI DHA
Sbjct  298   GFYNQPGAPTYPYQGSEGVDFEANLAISSVDFATFHSYPEPWGQGADAKAWGTQWITDHA  357

Query  1103  TSMTAAGKPVILEEFGVTTNQATVYGAWYQEVVSSGLTGALIWQAGSYLSSGATPDDGYA  1282
              SM    KPVILEEFGVTTNQ   Y  W+ EV SSGLTG LIWQAGS+LS+G T +DGYA
Sbjct  358   ASMKRVNKPVILEEFGVTTNQPDTYAEWFNEVESSGLTGDLIWQAGSHLSTGDTHNDGYA  417

Query  1283  IYPDDPVYSLETSYAVTLKARA  1348
             +YPD PVY L  S+A  +K RA
Sbjct  418   VYPDGPVYPLMKSHASAMKNRA  439
gi|1679597|emb|CAA90423.1| CEL4b mannanase [Agaricus bisporus]
Length=439

Score = 528 bits (1360), Expect = 3e-148
Identities = 280/442 (63%), Positives = 336/442 (76%), Gaps = 7/442 (1%)
Frame = +2

Query  23    LAFLSLSTFLCSAFAAVPEWGQCGGIGWTGQTTCVSGTVCAALNDYYSQCVPGtatttaa  202
             + F+ L+  +  A A VP WGQCGG  WTG+T C SG+ C   N++YSQC+PG+ T T
Sbjct  5     IRFIILAISISLATADVPVWGQCGGRDWTGETACASGSSCVVQNEWYSQCLPGSTTPTNP  64

Query  203   pttatsttisstsrttatsttasapsstGFVTTSGTEFRLNGAKFTIFGANSYWVGLMGY  382
             P   T++  ++   T+   +T       GFV  SGT F LNG K+T+ G NSYWVGL G
Sbjct  65    PPATTTSQTTAPPTTSHPVST-------GFVKASGTRFTLNGQKYTVVGGNSYWVGLTGL  117

Query  383   STTDMNKAFADIAATGATVVRTWGFNEVTSPNGIYYQSWSGSTPTINTGSTGLQNFDavv  562
             ST+ MN+AF+DIA  G T VRTWGFNEVTSPNG YYQSWSG+ PTINTG++GL NFD V+
Sbjct  118   STSAMNQAFSDIANAGGTTVRTWGFNEVTSPNGNYYQSWSGARPTINTGASGLLNFDNVI  177

Query  563   aaaaaHGLRLIVAITNNWSDYGGMDVYVNQIVGSGSAHDLFYTDCEVISTYMNYVKTFVS  742
             AAA A+G+RLIVA+TNNW+DYGGMDVYVNQ+VG+G  HDLFYT+  +   + +Y + FVS
Sbjct  178   AAAKANGIRLIVALTNNWADYGGMDVYVNQMVGNGQPHDLFYTNTAIKDAFKSYGRAFVS  237

Query  743   RYVNEPTILGWELANEPRCKgstgttsgsctattitkwaaaisaYIKSIDPNHLVGIGDE  922
             RY NEPT++ WELANEPRCKGSTGTTSG+CT TT+T WA  +SA+IK+ID NHLV IGDE
Sbjct  238   RYANEPTVMAWELANEPRCKGSTGTTSGTCTTTTVTNWAKEMSAFIKTIDSNHLVAIGDE  297

Query  923   GFYNEPSAPTYPYQGSEGIDFDANLAISSIDFGTFHSYPISWGQTTDPQGWGTQWIADHA  1102
             GFYN+P APTYPYQGSEG+DF+ANLAISS+DF TFHSYP  WGQ  D + WGTQWI DHA
Sbjct  298   GFYNQPGAPTYPYQGSEGVDFEANLAISSVDFATFHSYPEPWGQGADAKAWGTQWITDHA  357

Query  1103  TSMTAAGKPVILEEFGVTTNQATVYGAWYQEVVSSGLTGALIWQAGSYLSSGATPDDGYA  1282
              SM    KPVILEEFGVTTNQ   Y  W+ E+ SSGLTG LIWQAGS+LS+G TP+DGYA
Sbjct  358   ASMKRVNKPVILEEFGVTTNQPDTYAEWFNEIESSGLTGDLIWQAGSHLSTGDTPNDGYA  417

Query  1283  IYPDDPVYSLETSYAVTLKARA  1348
             +YPD PVY L  S+A  +K RA
Sbjct  418   VYPDGPVYPLVKSHASAMKNRA  439
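The percentages in the two alignment headers can be checked from the reported counts. The sketch below assumes the header values are truncated rather than rounded, which is what the numbers suggest (339/442 is about 76.7% but is printed as 76%). It also counts the identical columns in the final alignment block against CEL4a (query positions 1283 to 1348). The helper names `percent` and `count_identities` are hypothetical, and positives are not recomputed because that would require the substitution matrix used by BLASTX.

```python
def percent(count, length):
    """Integer (truncated) percentage, assumed to match the BLAST headers."""
    return 100 * count // length

def count_identities(query, sbjct):
    """Count aligned columns where both residues are identical (gaps excluded)."""
    pairs = zip(query.upper(), sbjct.upper())
    return sum(1 for q, s in pairs if q == s and q != "-")

# Counts taken from the two alignment headers (alignment length 442).
for name, identities, positives in [("CEL4a", 284, 339), ("CEL4b", 280, 336)]:
    print(name, percent(identities, 442), percent(positives, 442))

# Final alignment block against CEL4a, copied from the alignment.
query_tail = "IYPDDPVYSLETSYAVTLKARA"
cel4a_tail = "VYPDGPVYPLMKSHASAMKNRA"
print(count_identities(query_tail, cel4a_tail), "of", len(query_tail))
```

Comparing with `.upper()` treats the lowercase (low-complexity) query residues like any other residue when counting identities.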
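Open reading frame (ORF) analysis
The DQ286392 sequence was analyzed for open reading frames with NCBI ORF Finder; the result is shown in Figure 2:
In DQ286392, positions 14-1351 contain a 1338 bp open reading frame that encodes 445 amino acids, with ATG as the start codon and TAG as the stop codon. The coding region is flanked by a 13 bp 5' untranslated region (positions 1-13) and a 100 bp 3' untranslated region (positions 1352-1451). Two AATAAA polyadenylation signals lie 88 bp and 38 bp upstream of the poly(A) tail at the 3' end, which further indicates that the fragment obtained includes the full-length 3' untranslated region of the mRNA. The protein sequence was named "MAN".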
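The coordinates stated above can be cross-checked with a few lines of Python. The sketch below recomputes the ORF length, the number of encoded residues and the UTR lengths from positions 14, 1351 and 1451, derives the reading frame of position 14, and translates the first 45 bases of the ORF as listed in this report using the standard genetic code. Variable and function names are illustrative assumptions, and this is not a reimplementation of ORF Finder.

```python
ORF_START, ORF_END, SEQ_LEN = 14, 1351, 1451   # 1-based positions from the report

orf_len = ORF_END - ORF_START + 1     # length of the ORF in bp
n_codons = orf_len // 3               # codons, including the stop codon
n_aa = n_codons - 1                   # encoded residues (stop codon excluded)
utr5 = ORF_START - 1                  # bases upstream of the start codon
utr3 = SEQ_LEN - ORF_END              # bases downstream of the stop codon
frame = (ORF_START - 1) % 3 + 1       # forward reading frame (1, 2 or 3)

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

def translate(dna):
    """Translate a DNA string codon by codon, ignoring trailing partial codons."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))

orf_head = "atgcatctgctcgcttttctgtctctgagtacattcctgtgctct"
print(orf_len, n_aa, utr5, utr3, frame)
print(translate(orf_head))
```

Position 14 falls in frame 2, consistent with the Frame = +2 reported by BLASTX. The translated opening residues, MHLLAFLSLSTFLCS, contain the LAFLSLSTFLCS stretch at which the BLASTX alignments begin (query position 23).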
The open reading frame of "MAN" and its encoded amino acid sequence are as follows:
14 atgcatctgctcgcttttctgtctctgagtacattcctgtgctct