GENSCAN 1.0 Date run: 1-Aug-100 Time: 16:43:38

Sequence HSBA536C5 : 168628 bp : 49.21% C+G : Isochore 2 (43 - 51 C+G%)

Parameter matrix: HumanIso.smat
Gn.Ex Type S .Begin ...End .Len Fr Ph I/Ac Do/T CodRg P.... Tscr..
----- ---- - ------ ------ ---- -- -- ---- ---- ----- ----- ------

 2.04 PlyA -   7901   7896    6                               1.05
 2.03 Term -  10642  10463  180  1  0   28   43   120 0.957  -0.89
 2.02 Intr -  11044  10815  230  2  2   84   44   310 0.981  23.79
 2.01 Init -  14499  13650  850  0  1  126   53  2079 0.818 202.23
 2.00 Prom -  16112  16073   40                              -5.56

 3.00 Prom +  18327  18366   40                              -5.06
 3.01 Init +  18680  18726   47  1  2   84  105    30 0.585   4.46
 3.02 Intr +  23250  23284   35  0  2  151   69    35 0.533   5.77
 3.03 Term +  26615  26664   50  0  2  108   43    36 0.267  -1.43
 3.04 PlyA +  27305  27310    6                               1.05

 8.32 PlyA - 114694 114689    6                               1.05
 8.31 Term - 117609 117581   29  1  2  139   37    35 0.986   1.74
 8.30 Intr - 118004 117913   92  1  2  126   77   101 0.988  12.44
 8.29 Intr - 121211 121110  102  1  0   85   89    95 0.997   8.59
 8.28 Intr - 121457 121327  131  2  2  130   51   125 0.999  12.49
 8.27 Intr - 125623 125478  146  2  2  108   92   121 0.958  14.50
 8.26 Intr - 126663 126540  124  0  1  113   58   151 0.981  14.76
 8.25 Intr - 127050 126896  155  1  2   72   91   196 0.685  18.09
 8.24 Intr - 128563 128395  169  1  1   91   72   343 0.999  32.52
 8.23 Intr - 129031 128881  151  0  1   68   95   202 0.996  19.06
 8.22 Intr - 129561 129425  137  0  2  113   94   171 0.999  19.57
 8.21 Intr - 131557 131385  173  2  2  121   94    69 0.957  10.46
 8.20 Intr - 131891 131702  190  2  1  126   66   153 0.780  16.06
 8.19 Intr - 135872 135738  135  2  0   37   92   171 0.802  13.16
 8.18 Intr - 136182 136073  110  1  2  139   33   122 0.867  11.80
 8.17 Intr - 136622 136424  199  2  1   96   22   400 0.999  33.12
 8.16 Intr - 138994 138726  269  2  2   89   74   152 0.257  11.15
 8.15 Intr - 143743 143626  118  1  1  100   63   113 0.289  10.04
 8.14 Intr - 144150 144016  135  0  0   43  100   129 0.999  10.36
 8.13 Intr - 147107 146994  114  2  0  102   91   154 0.995  17.74
 8.12 Intr - 148107 147904  204  0  0  104   92    97 0.839  11.10
 8.11 Intr - 149987 149928   60  2  0  114  113    90 0.999  13.03
 8.10 Intr - 151157 150965  193  1  1   75   77   125 0.355   9.59
 8.09 Intr - 161359 161278   82  2  1  105   95    51 0.520   6.20
 8.08 Intr - 163259 163168   92  1  2  117   91   174 0.980  20.24
 8.07 Intr - 163512 163411  102  2  0  141   89    85 0.999  13.19
 8.06 Intr - 166251 166121  131  0  2  113   81   212 0.999  22.49
 8.05 Intr - 166582 166437  146  2  2  111   92   215 0.999  24.20
 8.04 Intr - 166905 166782  124  0  1  107   70   221 0.999  22.36
 8.03 Intr - 167313 167159  155  1  2  116   89   268 0.999  29.49
 8.02 Intr - 167718 167550  169  0  1   96   72   360 0.999  34.72
 8.01 Intr - 168007 167857  151  0  1   75   99   227 0.984  22.66
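
Rows in the exon table come in two shapes: the exon rows seen here (Init, Intr, Term) carry all 13 columns, while the signal rows (Prom, PlyA) carry only Gn.Ex, Type, S, Begin, End, Len and Tscr. The following is a minimal Python sketch of a whitespace-based reader for such rows. The `Feature` record and the `parse_row` function are hypothetical names introduced for illustration; they are not part of GENSCAN.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Feature:
    gene: int
    exon: int
    type: str
    strand: str
    begin: int
    end: int
    length: int
    prob: Optional[float]  # None for Prom/PlyA rows, which have no P column
    tscr: float


def parse_row(line: str) -> Feature:
    """Parse one whitespace-separated row of the exon table."""
    fields = line.split()
    if len(fields) not in (7, 13):
        raise ValueError(f"unexpected column count: {len(fields)}")
    gene, exon = fields[0].split(".")
    # Exon rows have 13 columns; signal rows skip Fr..P and have 7.
    prob = float(fields[11]) if len(fields) == 13 else None
    return Feature(
        gene=int(gene), exon=int(exon), type=fields[1], strand=fields[2],
        begin=int(fields[3]), end=int(fields[4]), length=int(fields[5]),
        prob=prob, tscr=float(fields[-1]),
    )


# Three rows of gene 3, copied from the table.
rows = [
    "3.00 Prom + 18327 18366 40 -5.06",
    "3.01 Init + 18680 18726 47 1 2 84 105 30 0.585 4.46",
    "3.04 PlyA + 27305 27310 6 1.05",
]
features = [parse_row(r) for r in rows]
for f in features:
    print(f.gene, f.exon, f.type, f.begin, f.end, f.prob, f.tscr)
```

Splitting on whitespace works here because the blank Fr..P cells of signal rows simply vanish, so the column count alone distinguishes the two row shapes.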
Predicted peptide sequence(s):

Predicted coding sequence(s):
>HSBA536C5|GENSCAN_predicted_peptide_2|419_aa
MAQENAAFSPGQEEPPRRRGRQRYVEKDGRCNVQQGNVRETYRYLTDLFTTLVDLQWRLS
LLFFVLAYALTWLFFGAIWWLIAYGRGDLEHLEDTAWTPCVNNLNGFVAAFLFSIETETT
IGYGHRVITDQCPEGIVLLLLQAILGSMVNAFMVGCMFVKISQPNKRAATLVFSSHAVVS
LRDGRLCLMFRVGDLRSSHIVEASIRAKLIRSRQTLEGEFIPLHQTDLSVGFDTGDDRLF
LVSPLVISHEIDAASPFWEASRRALERDDFEIVVILEGMVEATGMTCQARSSYLVDEGLW
GHRFTSVLTLEDGFYEVDYASFHETFEVPTPSCSARELAEAAARLDAHLYWSIPSRLDEK
RVSPRCDQLPPDPCGRPGARHRYMGNCISEVVEEEEEEEGKAPGNVLKLESPRPPEPQV

>HSBA536C5|GENSCAN_predicted_CDS_2|1260_bp
atggcgcaggagaacgcggccttctcgcccgggcaggaggagccgccgcggcgccgcggc
cgccagcgctacgtggagaaggatggccggtgcaacgtgcagcagggcaacgtgcgcgag
acataccgctacctgacggacctgttcaccacgctggtggacctgcagtggcgcctcagc
ctgttgttcttcgtcctggcctacgcgctcacctggctcttcttcggcgccatctggtgg
ctgatcgcctacggccgcggcgacctggagcacctggaggacaccgcgtggacgccgtgc
gtcaacaacctcaacggcttcgtggccgccttcctcttctccatcgagaccgagaccacc
atcggctacgggcaccgcgtcatcaccgaccagtgccccgagggcatcgtgctgctgctg
ctgcaggccatcctgggctccatggtgaacgccttcatggtgggctgcatgttcgtcaag
atctcgcagcccaacaagcgcgcagccacgctcgtcttctcctcgcacgccgtggtgtcg
ctgcgcgacgggcgcctctgcctcatgttccgcgtgggcgacttgcgctcctcacacata
gtggaggcctccatccgcgccaagctcatccgctcgcgccagacgctggagggcgagttc
atcccgctgcaccagaccgacctcagcgtgggcttcgacacgggagacgaccgcctcttc
ctcgtctcgccgctggttatcagccacgagatcgacgccgccagccccttctgggaggcg
tcgcgccgtgccctcgagagggacgacttcgagatcgtcgttatcctcgagggcatggtg
gaagccacgggaatgacatgccaagctcggagctcctacctggtagacgaggggctgtgg
ggccaccgcttcacgtcagtgctgactctggaggacggcttctacgaagtggactatgcc
agctttcacgagacttttgaggtgcccacaccttcgtgcagtgctcgagagctggcagag
gctgccgcccgccttgatgcccatctctactggtccatccccagccggctggatgagaag
agagtgagtccaaggtgtgaccagcttcctccagacccctgtggcagaccgggggccaga
cacagatacatggggaactgcatatcggaggtggtggaggaggaggaggaggaggaaggc
aaagcccctggaaatgtgctaaagttggaaagtccccgtcccccagaacctcaagtctag
>HSBA536C5|GENSCAN_predicted_peptide_3|43_aa
MNTAAINIHRQIFMWTSSVVKTSFTVTFSSPGVIPPRLPYARE

>HSBA536C5|GENSCAN_predicted_CDS_3|132_bp
atgaatacagctgctataaacatccatcggcagattttcatgtggacgtcttctgtggtg
aagacctccttcactgtgaccttctcctcaccaggtgtgatcccccccaggctcccctat
>HSBA536C5|GENSCAN_predicted_peptide_8|1429_aa
XEAKACVVHGSDLKDMTSEQLDEILKNHTEIVFARTSPQQKLIIVEGCQRQGAIVAVTGD
GVNDSPALKKADIGIAMGISGSDVSKQAADMILLDDNFASIVTGVEEGRLIFDNLKKSIA
YTLTSNIPEITPFLLFIIANIPLPLGTVTILCIDLGTDMVPAISLAYEAAESDIMKRQPR
NSQTDKLVNERLISMAYGQIGMIQALGGFFTYFVILAENGFLPSRLLGIRLDWDDRTMND
LEDSYGQEWTYEQRKVVEFTCHTAFFASIVVVQWADLIICKTRRNSVFQQGMKNKILIFG
LLEETALAAFLSYCPGMGVALRMYPLKVTWWFCAFPYSLLIFIYDEVRKLILRRYPGDLA
ITKGSSGECKSLRLEKVDLSPSRGCFLPTVELGQLFLGIAMGLWGKKGTVAPHDQSPRRR
PKKGLIKKKMVKREKQKRNMEELKKEVVMDDHKLTLEELSTKYSVDLTKGHSHQRAKEIL
TRGGPNTVTPPPTTPEWVKFCKQLFGGFSLLLWTGAILCFVAYSIQIYFNEEPTKDNLYL
SIVLSVVVIVTGCFSYYQEAKSSKIMESFKNMVPQQALVIRGGEKMQINVQEVVLGDLVE
IKGGDRVPADLRLISAQGCKVDNSSLTGESEPQSRSPDFTHENPLETRNICFFSTNCVEG
TARGIVIATGDSTVMGRIASLTSGLAVGQTPIAAEIEHFIHLITVVAVFLGVTFFALSLL
LGYGWLEAIIFLIGIIVANVPEGLLATVTVCLTLTAKRMARKNCLVKNLEAVETLGSTST
ICSDKTGTLTQNRMTVAHMWFDMTVYEADTTEEQTGKTFTKSSDTWFMLARIAGLCNRAD
FKANQEILPIAKRATTGDASESALLKFIEQSYSSVAEMREKNPKVAEIPFNSTNKYQMSI
HLREDSSQTHVLMMKGAPERILEFCSTFLLNGQEYSMNDEMKEAFQNAYLELGGLGERVL
GFCFLNLPSSFSKGFPFNTDEINFPMDNLCFVGLISMIDPPRAAVPDAVSKCRSAGIKVI
MVTGDHPITAKAIAKGVGIISEGTETAEEVAARLKIPISKVDASAAKAIVVHGAELKDIQ
SKQLDQILQNHPEIVFARTSPQQKLIIVEGCQRLGAVVAVTGDGVNDSPALKKADIGIAM
GISGSDVSKQAADMILLDDNFASIVTGVEEGRLIFDNLKKSIMYTLTSNIPEITPFLMFI
ILGIPLPLGTITILCIDLGTDMVPAISLAYESAESDIMKRLPRNPKTDNLVNHRLIGMAY
GQIGMIQALAGFFTYFVILAENGFRPVDLLGIRLHWEDKYLNDLEDSYGQQWTYEQRKVV
EFTCQTAFFVTIVVVQWADLIISKTRRNSLFQQGMRNKVLIFGILEETLLAAFLSYTPGM
DVALRMYPLKITWWLCAIPYSILIFVYDEIRKLLIRQHPDGWVERETYY

>HSBA536C5|GENSCAN_predicted_CDS_8|4290_bp
nnagaagccaaggcatgcgtggtgcacggctctgacctgaaggacatgacatcggagcag
ctcgatgagatcctcaagaaccacacagagatcgtctttgctcgaacgtctccccagcag
aagctcatcattgtggagggatgtcagaggcagggagccattgtggccgtgacgggtgac
ggggtgaacgactcccctgcattgaagaaggctgacattggcattgccatgggcatctct
ggctctgacgtctctaagcaggcagccgacatgatcctgctggatgacaactttgcctcc
atcgtcacgggggtggaggagggccgcctgatctttgacaacttgaagaaatccatcgcc
tacaccctgaccagcaacatccccgagatcacccccttcctgctgttcatcattgccaac
atccccctacctctgggcactgtgaccatcctttgcattgacctgggcacagatatggtc
cctgccatctccttggcctatgaggcagctgagagtgatatcatgaagcggcagccacga
aactcccagacggacaagctggtgaatgagaggctcatcagcatggcctacggacagatc
gggatgatccaggcactgggtggcttcttcacctactttgtgatcctggcagagaacggt
ttcctgccatcacggctactgggaatccgcctcgactgggatgaccggaccatgaatgat
ctggaggacagctatggacaggagtggacctatgagcagcggaaggtggtggagttcacg
tgccacacggcattctttgccagcatcgtggtggtgcagtgggctgacctcatcatctgc
aagacccgccgcaactcagtcttccagcagggcatgaagaacaagatcctgatttttggg
ctcctggaggagacggcgttggctgcctttctctcttactgcccaggcatgggtgtagcc
ctccgcatgtacccgctcaaagtcacctggtggttctgcgccttcccctacagcctcctc
atcttcatctatgatgaggtccgaaagctcatcctgcggcggtatcctggtgaccttgca
atcacaaaaggttcttctggtgagtgcaagagcctgagactggaaaaggtggacttgtct
cccagtcgaggctgctttcttcccacagttgagctcgggcagctctttctggggatagct
atggggctttgggggaagaaagggacagtggctccccatgaccagagtccaagacgaaga
cctaaaaaagggcttatcaagaaaaaaatggtgaagagggaaaaacagaagcgcaatatg
gaggaactgaagaaggaagtggtcatggatgatcacaaattaaccttggaagagctgagc
accaagtactccgtggacctgacaaagggccatagccaccaaagggcaaaggaaatcctg
actcgaggtggacccaatactgttaccccaccccccaccactccagaatgggtcaaattc
tgtaagcaactgttcggaggcttctccctcctactatggactggggccattctctgcttt
gtggcctacagcatccagatatatttcaatgaggagcctaccaaagacaacctctacctg
agcatcgtactgtccgtcgtggtcatcgtcactggctgcttctcctattatcaggaggcc
aagagctccaagatcatggagtcttttaagaacatggtgcctcagcaagctctggtaatt
cgaggaggagagaagatgcaaattaatgtacaagaggtggtgttgggagacctggtggaa
atcaagggtggagaccgagtccctgctgacctccggcttatctctgcacaaggatgtaag
gtggacaactcatccttgactggggagtcagaaccccagagccgctcccctgacttcacc
catgagaaccctctggagacccgaaacatctgcttcttttccaccaactgtgtggaagga
accgcccggggtattgtgattgctacgggagactccacagtgatgggcagaattgcctcc
ctgacgtcaggcctggcggttggccagacacctatcgctgctgagatcgaacacttcatc
catctgatcactgtggtggccgtcttccttggtgtcactttttttgcgctctcacttctc
ttgggctatggttggctggaggctatcatttttctcattggcatcattgtggccaatgtg
cctgaggggctgttggccacagtcactgtgtgcctgaccctcacagccaagcgcatggcg
cggaagaactgcctggtgaagaacctggaggcggtggagacgctgggctccacgtccacc
atctgctcagacaagacgggcaccctcacccagaaccgcatgaccgtcgcccacatgtgg
tttgatatgaccgtgtatgaggccgacaccactgaagaacagactggaaaaacatttacc
aagagctctgatacctggtttatgctggcccgaatcgctggcctctgcaaccgggctgac
tttaaggctaatcaggagatcctgcccattgctaagagggccacaacaggtgatgcttcc
gagtcagccctcctcaagttcatcgagcagtcttacagctctgtggcggagatgagagag
aaaaaccccaaggtggcagagattccctttaattctaccaacaagtaccagatgtccatc
caccttcgggaggacagctcccagacccacgtactgatgatgaagggtgctccggagagg
atcttggagttttgttctacctttcttctgaatgggcaggagtactcaatgaacgatgaa
atgaaggaagccttccaaaatgcctacttagaactgggaggtctgggggaacgtgtgcta
ggcttctgcttcttgaatctgcctagcagcttctccaagggattcccatttaatacagat
gaaataaatttccccatggacaacctttgttttgtgggcctcatatccatgattgaccct
ccccgagctgcagtgcctgatgctgtgagcaagtgtcgcagtgcaggaattaaggtgatc
atggtaacaggagatcatcccattacagctaaggccattgccaagggtgtgggcatcatc
tcagaaggcactgagacggcagaggaagtcgctgcccggcttaagatccctatcagcaag
gtcgatgccagtgctgccaaagccattgtggtgcatggtgcagaactgaaggacatacag
tccaagcagcttgatcagatcctccagaaccaccctgagatcgtgtttgctcggacctcc
cctcagcagaagctcatcattgtcgagggatgtcagaggctgggagccgttgtggccgtg
acaggtgacggggtgaacgactcccctgcgctgaagaaggctgacattggcattgccatg
ggcatctctggctctgacgtctctaagcaggcagccgacatgatcctgctggatgacaac
tttgcctccatcgtcacgggggtggaggagggccgcctgatctttgacaacctgaagaaa
tccatcatgtacaccctgaccagcaacatccccgagatcacgcccttcctgatgttcatc
atcctcggtatacccctgcctctgggaaccataaccatcctctgcattgatctcggcact
gacatggtccctgccatctccttggcttatgagtcagctgaaagcgacatcatgaagagg
cttccaaggaacccaaagacggataatctggtgaaccaccgtctcattggcatggcctat
ggacagattgggatgatccaggctctggctggattctttacctactttgtaatcctggct
gagaatggttttaggcctgttgatctgctgggcatccgcctccactgggaagataaatac
ttgaatgacctggaggacagctacggacagcagtggacctatgagcaacgaaaagttgtg
gagttcacatgccaaacggccttttttgtcaccatcgtggttgtgcagtgggcggatctc
atcatctccaagactcgccgcaactcacttttccagcagggcatgagaaacaaagtctta
atatttgggatcctggaggagacactcttggctgcatttctgtcctacactccaggcatg
gacgtggccctgcgaatgtacccactcaagataacctggtggctctgtgccattccctac
agtattctcatcttcgtctatgatgaaatcagaaaactcctcatccgtcagcacccggat
ggctgggtggaaagggagacgtactactaa
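
Each FASTA header in this listing declares the length of its record. For the three predictions shown, the declared CDS length equals 3 x (peptide length + 1), which is consistent with each coding sequence including its stop codon (for gene 2, 1260 = 3 x 420). The Python sketch below checks that relation from the headers alone. The regular expression and the `declared_lengths` name are illustrative assumptions based on the header layout seen here, not a documented GENSCAN interface.

```python
import re

# Header layout assumed from this listing:
#   >SEQNAME|GENSCAN_predicted_{peptide|CDS}_GENE|LENGTH_{aa|bp}
HEADER = re.compile(
    r">(?P<seq>[^|]+)\|GENSCAN_predicted_(?P<kind>peptide|CDS)_"
    r"(?P<gene>\d+)\|(?P<n>\d+)_(?:aa|bp)"
)

headers = [
    ">HSBA536C5|GENSCAN_predicted_peptide_2|419_aa",
    ">HSBA536C5|GENSCAN_predicted_CDS_2|1260_bp",
    ">HSBA536C5|GENSCAN_predicted_peptide_3|43_aa",
    ">HSBA536C5|GENSCAN_predicted_CDS_3|132_bp",
    ">HSBA536C5|GENSCAN_predicted_peptide_8|1429_aa",
    ">HSBA536C5|GENSCAN_predicted_CDS_8|4290_bp",
]


def declared_lengths(lines):
    """Map gene number -> {'peptide': aa count, 'CDS': bp count}."""
    out = {}
    for line in lines:
        m = HEADER.match(line)
        if m:
            out.setdefault(int(m["gene"]), {})[m["kind"]] = int(m["n"])
    return out


lengths = declared_lengths(headers)
for gene, d in sorted(lengths.items()):
    # The "+ 1" accounts for the stop codon, which has no residue.
    consistent = d["CDS"] == 3 * (d["peptide"] + 1)
    print(gene, d["peptide"], d["CDS"], consistent)
```

The check uses only the declared lengths, so it says nothing about whether the sequence lines under a header are complete.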
Gn.Ex : gene number, exon number (for reference)
Type  : Init = Initial exon
        Intr = Internal exon
        Term = Terminal exon
        Sngl = Single-exon gene
        Prom = Promoter
        PlyA = poly-A signal
S     : DNA strand (+ = input strand; - = opposite strand)
Begin : beginning of exon or signal (numbered on input strand)
End   : end point of exon or signal (numbered on input strand)
Len   : length of exon or signal (bp)
Fr    : reading frame (a codon ending at x is in frame f = x mod 3)
Ph    : net phase of exon (length mod 3)
I/Ac  : initiation signal or acceptor splice site score (x 10)
Do/T  : donor splice site or termination signal score (x 10)
CodRg : coding region score (x 10)
P     : probability of exon (sum over all parses containing exon)
Tscr  : exon score (depends on length, I/Ac, Do/T and CodRg scores)
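
The Len, Ph and Fr definitions above can be checked against the table by hand. The Python sketch below does so for the three coding exons of gene 3 on the + strand. It assumes the exons are given in 5' to 3' order, that the first exon starts at the first base of the start codon, and that Fr reports the frame of the codons that end inside the exon. Minus-strand exons are left out because the legend does not spell out the convention for them. The function names are hypothetical.

```python
def exon_length(begin: int, end: int) -> int:
    """Len: number of bases from Begin to End, inclusive."""
    return abs(end - begin) + 1


def net_phase(length: int) -> int:
    """Ph: exon length mod 3."""
    return length % 3


def forward_frames(exons):
    """Fr for + strand exons, using f = x mod 3 for a codon ending at x.

    `carried` counts the bases of an unfinished codon inherited from the
    preceding exons (0, 1 or 2).
    """
    frames = []
    carried = 0
    for begin, end in exons:
        # Position of the last base of the first codon completed here.
        first_codon_end = begin + (2 - carried)
        frames.append(first_codon_end % 3)
        carried = (carried + exon_length(begin, end)) % 3
    return frames


# Gene 3 coding exons (3.01 Init, 3.02 Intr, 3.03 Term) from the table.
gene3 = [(18680, 18726), (23250, 23284), (26615, 26664)]

lengths = [exon_length(b, e) for b, e in gene3]
phases = [net_phase(n) for n in lengths]
frames = forward_frames(gene3)

print(lengths)  # [47, 35, 50], matching the Len column
print(phases)   # [2, 2, 2], matching the Ph column
print(frames)   # [1, 0, 0], matching the Fr column
```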
The SCORE of a predicted feature (e.g., exon or splice site) is a
log-odds measure of the quality of the feature based on local sequence
properties. Thus, for example, a predicted donor splice site with
score > 100 is excellent; 50-100 is acceptable; 0-50 is weak; and
below 0 is poor (probably not a real donor site).
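
The donor-site bands in the paragraph above can be written as a small lookup. This Python sketch applies them to the Do/T values exactly as printed in the table, which is an assumption: the text does not say whether the thresholds refer to the printed (x 10) values. The handling of the boundary values 0, 50 and 100 is also a choice made here, and `classify_donor` is a hypothetical name.

```python
def classify_donor(score: float) -> str:
    """Band a donor splice site score using the ranges quoted in the text."""
    if score > 100:
        return "excellent"
    if score >= 50:
        return "acceptable"
    if score >= 0:
        return "weak"
    return "poor"


# (Gn.Ex, Type, Do/T) copied from the exon table. Term rows are left out
# because their Do/T value is a termination signal score, not a donor score.
donors = [
    ("3.01", "Init", 105),
    ("3.02", "Intr", 69),
    ("8.17", "Intr", 22),
    ("8.11", "Intr", 113),
]

labels = {name: classify_donor(score) for name, _type, score in donors}
for name, label in labels.items():
    print(name, label)
```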
The PROBABILITY of a predicted exon is the estimated probability under
GENSCAN's model of genomic sequence structure that the exon is correct.
This probability depends in general on global as well as local sequence
properties. This information can be used to assess the reliability of the
predicted exon, e.g., it would be better to design PCR primers based on
a predicted exon with probability > 0.95 than one with lower probability.
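
As an illustration of the primer-design advice, the Python sketch below keeps only exons whose P value exceeds 0.95. The sample pairs are (Gn.Ex, P) values copied from the exon table; the `reliable` function and the `PRIMER_THRESHOLD` constant are hypothetical names, and the 0.95 cutoff is simply the figure quoted in the text.

```python
PRIMER_THRESHOLD = 0.95  # cutoff quoted in the text above

# (Gn.Ex, P) pairs copied from the exon table.
exons = [
    ("8.17", 0.999),
    ("8.16", 0.257),
    ("8.15", 0.289),
    ("8.14", 0.999),
    ("3.01", 0.585),
    ("2.03", 0.957),
]


def reliable(exons, threshold=PRIMER_THRESHOLD):
    """Return the Gn.Ex labels of exons with probability above threshold."""
    return [name for name, p in exons if p > threshold]


selected = reliable(exons)
print(selected)  # ['8.17', '8.14', '2.03']
```

Note that adjacent exons of the same gene can differ sharply: 8.16 and 8.15 fall well below the cutoff even though their neighbours 8.17 and 8.14 are at 0.999.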