my $page=CXGN::Page->new('Labate_tomato_tech.html','html2pl converter');
$page->header('SNP Consortium Pre-proposal');
<h2>SNP Consortium Pre-proposal</h2>
<p>Joanne Labate and Angela Baldo, USDA Geneva</p>
<p>Cultivated tomato (<em>Lycopersicon esculentum</em> var. <em>esculentum</em>)
is known to have relatively low molecular genetic diversity.
This limited genetic variation has restricted the use of molecular
markers as tools for genetic studies or crop improvement. The
USDA, ARS Plant Genetic Resources Unit (PGRU) in Geneva, NY
currently conserves more than 5,000 accessions of cultivated
tomato. We are developing DNA sequence-based molecular markers
(single nucleotide polymorphisms, or SNPs) to
characterize our collection. Polymorphic markers that we
develop will also be useful to our stakeholders (e.g. tomato
breeders) for saturating genetic maps and marker-assisted
<p>Tomato is ideal for pioneering SNP prediction and confirmation because of the
wealth of publicly available DNA sequence data. Baldo has
designed and implemented a high-throughput Expressed Sequence
Tag (EST) analysis system that takes an NCBI Unigene set as
input and produces high-quality annotation and consensus
sequences of subclusters, virtual mapping of consensus
sequences by matching to known markers, SNP predictions, and
SSR discovery. An additional module designs and selects optimal
PCR primers flanking the regions of interest. Using the more
than 150,000 EST and cDNA sequences from over 15 cultivars
comprising the 3,000+ member Unigene set for <em>L.
esculentum</em> in GenBank, we predicted 2,527 SNPs in 764 Unigenes. In 2004 we tested 85
independent amplicons from the 764 Unigenes for predicted SNPs
by sequencing two or three cultivars per amplicon. We
discovered 62 SNPs and 13 small insertion/deletion
polymorphisms in 21 amplicons. Of the 64 remaining amplicons,
1 primer pair did not amplify, 32 showed no evidence
of the predicted SNP, 20 appeared heterozygous or gave multiple
PCR bands, and 11 gave insufficient data (poor-quality
sequence, fragment too large to sequence the predicted SNP site, or
data from only one cultivar). Based on the 53 amplicons that
gave clear results thus far, this method discovered cultivated
tomato SNPs approximately 21-fold more efficiently than
sequencing random genomic DNA (1 SNP per 300 nucleotides
versus 1 SNP per 7 kb). We have submitted a manuscript for
publication of the 21 polymorphic markers and are continuing development
and testing of the remaining set of 64 of the 85 originally
develop and test the remaining 679 Unigenes with predicted SNPs
in a collaborative effort with private companies such as
Campbell's, Western Seed, Syngenta/Roger's Seed,
Rijk Zwaan, Zeraim Gedera, and other interested parties. Based
on our results we anticipate successfully developing at least
25\% of the 764 Unigenes (191 loci) into polymorphic markers by
sequencing two to three tomato cultivars with predicted SNPs
among them. This will entail obtaining the minimal set of
cultivars needed based on NCBI's original EST data,
growing the cultivars in PGRU's greenhouse and isolating
DNA from leaf tissue, designing PCR primers and amplifying
the remaining 679 Unigenes from two or three cultivars each,
sequencing the DNA, and analyzing the sequence data. The
anticipated cost of carrying out this project in-house at PGRU
is detailed in the attached budget pages. Products will be
robust PCR and sequencing primers for genomic DNA sequences,
identification of polymorphic sites among the assayed cultivars
for those sequences, and annotation of sequences including
predicted proteins and matches to known tomato markers. All
results will be published in peer-reviewed journals and all
collaborators will have access to preliminary and pre-published
research is part of PGRU's "Conservation of
Vegetable Crops" CRIS, Objective 2: Enhance the
effectiveness of germplasm maintenance through the application
of genomic sequencing and molecular marker techniques.
It will serve as a model for discovery of enhanced value within crop
germplasm in the face of limited molecular genetic diversity,
ensuring the future viability of U.S. farms and a nutritious
<table summary="" border="0" cellspacing="0" cellpadding="0" width="40\%">
<td width="220"><strong><u>Item</u></strong></td>
<td width="75"><strong><u>Year 1</u></strong></td>
<td width="75"><strong><u>Year 2</u></strong></td>
<td width="75"><strong><u>Total</u></strong></td>
<tr><td><p> </p></td></tr>
<td><strong>Materials</strong></td>
<tr><td><p> </p></td></tr>
<td>PCR and DNA sequencing<br /> (\$4.95 per sample)</td>
<tr><td><p> </p></td></tr>
<tr><td><p> </p></td></tr>
<td>Instrument maintenance</td>
<tr><td><p> </p></td></tr>
<td><strong>Labor</strong></td>
<tr><td><p> </p></td></tr>
<td>Technician (GS-4)</td>
<tr><td><p> </p></td></tr>
<td> salary</td>
<tr><td><p> </p></td></tr>
<td> benefits</td>
<tr><td><p> </p></td></tr>
<td>3 Supervisory scientists<br /> (5\% each)</td>
<tr><td><p> </p></td></tr>
<td><strong>Total materials and labor</strong></td>
<tr><td><p> </p></td></tr>
<td><strong>Other Direct Costs</strong></td>
<tr><td><p> </p></td></tr>
<td>PGRU overhead IRC (0.1905)</td>
<tr><td><p> </p></td></tr>
<td><strong>Total Direct Costs</strong></td>
<tr><td><p> </p></td></tr>
<td><strong>Indirect Costs</strong></td>
<tr><td><p> </p></td></tr>
<td>ARS overhead (0.1111)</td>
<tr><td><p> </p></td></tr>
<td><strong>Total Direct and<br /> Indirect Costs</strong></td>
<td><strong>\$99,662</strong></td>
<td><strong>\$90,832</strong></td>
<td><strong>\$190,495</strong></td>
<br /><br /><br /><br />
<h4>Breakdown of Sequencing Costs</h4>
<table summary="" border="0" cellspacing="0" cellpadding="0" width="50\%">
<td width="89\%">number of genes</td>
<td align="right">700</td>
<tr><td><p> </p></td></tr>
<td>number of cultivars</td>
<td align="right">3</td>
<tr><td><p> </p></td></tr>
<td>sequencing reactions (forward and reverse)</td>
<td align="right">2</td>
<tr><td><p> </p></td></tr>
<td>cost per sequence</td>
<tr><td><p> </p></td></tr>
<tr><td><p> </p></td></tr>
<td>number of additional reactions (25\%) for redos</td>
<td align="right">1050</td>
<tr><td><p> </p></td></tr>
<td>cost for redos</td>
<tr><td><p> </p></td></tr>
<td>total sequencing cost</td>
<tr><td><p> </p></td></tr>
<td>total number of sequencing reactions</td>
<td align="right">5250</td>
<h4>Breakdown of Primer Costs</h4>
<table summary="" border="0" cellspacing="0" cellpadding="0" width="50\%">
<td width="89\%">number of primer pairs</td>
<td align="right">700</td>
<tr><td><p> </p></td></tr>
<td>forward and reverse</td>
<td align="right">2</td>
<tr><td><p> </p></td></tr>
<td>size of primer</td>
<td align="right">23</td>
<tr><td><p> </p></td></tr>
<td>cost per base</td>
<tr><td><p> </p></td></tr>
<tr><td><p> </p></td></tr>
<td>number of additional primers (15\%) added for redesigns</td>
<td align="right">210</td>
<tr><td><p> </p></td></tr>
<td>cost of redesigned primers</td>
<tr><td><p> </p></td></tr>
<td>total primer cost</td>
<tr><td><p> </p></td></tr>
<td>total number of primers</td>
<td align="right">1610</td>
<h4>Estimated costs of PCR and sequencing per sample</h4>
<table summary="" border="0" cellspacing="0" cellpadding="0" width="70\%">
<td width="25\%"></td>
<td width="10\%"></td>
<td width="10\%"><strong>cost per unit</strong></td>
<td width="10\%"><strong>number used<br />per sample</strong></td>
<td width="10\%"><strong>cost per sample</strong></td>
<tr><td><p> </p></td></tr>
<td>10ul unplugged pipette tips</td>
<tr><td><p> </p></td></tr>
<td>200ul unplugged pipette tips</td>
<tr><td><p> </p></td></tr>
<td>10ul plugged pipette tips</td>
<tr><td><p> </p></td></tr>
<td>200ul plugged pipette tips</td>
<tr><td><p> </p></td></tr>
<td>large orifice 200ul unplugged pipette tips</td>
<tr><td><p> </p></td></tr>
<td>large orifice 200ul plugged pipette tips</td>
<tr><td><p> </p></td></tr>
<td>50ul plugged pipette tips</td>
<tr><td><p> </p></td></tr>
<td>black 96-well clini plates</td>
<tr><td><p> </p></td></tr>
<tr><td><p> </p></td></tr>
<tr><td><p> </p></td></tr>
<td>Molecular Probes picogreen reagent<br />(1 includes forward and reverse)</td>
<tr><td><p> </p></td></tr>
<tr><td><p> </p></td></tr>
<td>Promega GoTaq PCR enzyme</td>
<tr><td><p> </p></td></tr>
<td>EdgeBio PCR clean-up<br />(1 includes forward and reverse)
<tr><td><p> </p></td></tr>
<td>ABI BDT cycle sequencing enzyme</td>
<tr><td><p> </p></td></tr>
<td>EdgeBio sequencing rxn. clean-up</td>
<tr><td><p> </p></td></tr>
<td>ABI sequencing polymer POP6</td>
<tr><td><p> </p></td></tr>
<td>ABI 3100 capillary array</td>
<tr><td><p> </p></td></tr>
<td align="right"><strong>total:</strong></td>
<td><strong>\$4.95</strong></td>
<h4>ABI 3100 DNA sequencer maintenance</h4>
<table summary="" border="0" cellspacing="0" cellpadding="0" width="50\%">
<td width="16\%">lower polymer block</td>
<td width="1\%"></td>
<td width="1\%">\$2,200</td>
<tr><td><p> </p></td></tr>
<td>reserve polymer syringe</td>
<tr><td><p> </p></td></tr>
<td>array-fill syringe</td>
<tr><td><p> </p></td></tr>
<td>maintenance contract (10\% over 2 years)</td>
<tr><td><p> </p></td></tr>
<td align="right"><strong>total: </strong></td>
<td><strong>\$4,383</strong></td>
<br /><br /><br /><br />
<h3>Budget narrative</h3>
<p>Custom-synthesized oligos are purchased from
Integrated DNA Technologies, Inc. PCR/sequencing oligos cost
\$0.35/base at 25 nmole scale. Generally 4 pmol of primer is used
per PCR or sequencing reaction. We estimate that 1,610 primers will
be sufficient to amplify and sequence 700 loci, allowing for
occasional primer redesign and replenishment of
primer stocks (15\% added). 23-mer x \$0.35 x 1,610 =
<h4>Laboratory costs</h4>
<p>We routinely test all newly synthesized PCR primers across the 2
or 3 genomic DNAs to be sequenced in small-volume PCR
reactions. After this initial optimization, a larger-volume (50
ul) PCR reaction is performed and the PCR products are cleaned using
an EdgeBioSystems Quickstep kit. DNA is quantified in a
picogreen assay, and ABI BDT ver. 3.1 cycle sequencing reactions
are performed. Samples are then cleaned on an EdgeBioSystems DTR
plate, dried down, resuspended in formamide, and run on the ABI
3100 capillary sequencer.</p>
<p>Itemized costs per sample are in the budget
table. Plastic disposables include pipette tips, 96-well
plates, and Eppendorf tubes. Disposable capillary arrays for
the ABI 3100 cost \$695 each and last approximately 100 runs (1
run = 16 samples). Costs of reagents including dNTPs, GoTaq and
BDT enzymes, picogreen, POP6 polymer, and EdgeBio kits are
estimated on a per-sample basis. Total sequencing cost = \$4.9513 per
sequence x 5,250 sequences = \$25,994.</p>
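<p>As a rough cross-check of these figures, the Python sketch below recomputes the reaction count and total sequencing cost from the values stated in the sequencing breakdown table (700 genes, 3 cultivars, forward and reverse reactions, 25\% redos) and the per-sample cost of \$4.9513. It also derives the capillary array cost per sample implied by the \$695 price and the stated array lifetime. The function and variable names are illustrative assumptions, not part of the proposal.</p>

```python
def sequencing_budget(genes=700, cultivars=3, reads_per_amplicon=2,
                      redo_fraction=0.25, cost_per_sample=4.9513):
    """Recompute reaction counts and total cost from the budget tables.

    Defaults are the values stated in the proposal; the function itself
    is only an illustrative sketch.
    """
    base = genes * cultivars * reads_per_amplicon  # first-pass reactions
    redos = round(base * redo_fraction)            # allowance for repeats
    total = base + redos
    return {
        "base_reactions": base,
        "redo_reactions": redos,
        "total_reactions": total,
        "total_cost": round(total * cost_per_sample),  # whole dollars
    }


def array_cost_per_sample(array_price=695.0, runs_per_array=100,
                          samples_per_run=16):
    """Capillary array cost spread over every sample it can process."""
    return array_price / (runs_per_array * samples_per_run)


budget = sequencing_budget()
array_share = array_cost_per_sample()
print(budget)
print(round(array_share, 2))
```

<p>Under these inputs the sketch gives 4,200 first-pass reactions, 1,050 redos, and 5,250 reactions in total, matching the table, and a total of \$25,994.</p>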
<h4>Instrument maintenance</h4>
<p>The throughput of our ABI 3100 DNA sequencer is two 96-well plates
in 30 hours. For 5,250 reactions this is equivalent to 8 months
of operation at 4 hours per day. We replace the lower polymer
block, array-fill syringe, and reserve polymer syringe
approximately every 6 months as routine maintenance. In 2004
the ABI service maintenance contract cost \$8,750. We request
\$2,192 per year to defray instrument maintenance
<p>One full-time ARS Biological Science Technician (GS-4) is requested
to carry out the majority of the labor for this project. Salary
was estimated at entry level with 30\% added for benefits. There
will be three supervisory scientists whose costs are estimated
at 5\% of their salary and benefits each. Their respective roles
are as follows. Vegetable Crops Curator Dr. Larry
Robertson will be responsible for obtaining seed of required
tomato cultivars, overseeing its planting and harvesting for
tissue collection, and subsequent sample tracking and data
curation. Dr. Robertson has developed a database for PGRU for efficient
tracking of samples from planting, through all laboratory
assays, to final storage of molecular data. Bioinformaticist
Dr. Angela Baldo will be responsible for all
computational marker discovery: identification and downloading
of public EST, cDNA, and genomic sequences, high-throughput
in-house clustering, annotation, <em>in silico</em> mining for markers, and high-throughput
primer design. Molecular Biologist Dr. Joanne Labate's
primary responsibilities on this project will be to train the
technician in laboratory techniques, oversee the
collection and analysis of the molecular data, ensure quality
control of laboratory-generated data, and disseminate
high-quality data to collaborators.</p>
<p>Local overhead at PGRU is estimated as 16\% of total direct costs.
National overhead for ARS is estimated as 10\% of total direct
and indirect costs.</p>
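<p>The multipliers shown in the budget table (0.1905 for PGRU and 0.1111 for ARS) differ from the 16\% and 10\% rates quoted here. One reading that reconciles them, offered as an assumption rather than a statement of ARS policy, is that each rate is a share of the total after overhead has been added, so the multiplier applied to the pre-overhead base is rate / (1 - rate). The Python sketch below checks that arithmetic; the function name is hypothetical.</p>

```python
def gross_up(rate):
    """Multiplier on the pre-overhead base such that overhead equals
    `rate` of the post-overhead total (an assumed interpretation)."""
    if not 0 <= rate < 1:
        raise ValueError("rate must be in [0, 1)")
    return rate / (1 - rate)


pgru_multiplier = round(gross_up(0.16), 4)  # local (PGRU) overhead
ars_multiplier = round(gross_up(0.10), 4)   # national (ARS) overhead
print(pgru_multiplier, ars_multiplier)      # prints: 0.1905 0.1111
```

<p>Both values agree with the multipliers listed in the table, which is consistent with this reading but does not confirm it.</p>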
<h4>Facilities and equipment</h4>
<p>Physical facilities include three greenhouses with a total of 7500
ft<sup>2</sup> for growing plants. The 1400 ft<sup>2</sup>
laboratory is well-equipped to efficiently generate DNA
sequence data. Major equipment includes an MJ Research Tetrad2 and two BioRad
iCycler thermocyclers, one ABI 3100 and one ABI 310 Genetic
Analyzer, a Tecan Genesis RSP 150 robotic liquid handling
system, a plant tissue grinder (GenoGrinder), a
refrigerated centrifuge and a speed-vac that can hold 96-well
plates, an Alpha Innotech FluorChem 8900 imaging and analysis
system, and a 96-well plate reader for fluorescence and
absorbance assays. We also have all necessary small equipment, such
as microcentrifuges, incubators, shakers, an autoclave, agarose
gel rigs and power supplies, electronic multi-channel pipettes,
-20°C and -80°C freezers, a freeze-dryer,
microcomputers, etc.</p>
disposal a Dell PowerEdge 6600 server with four 2.4GHz Intel
side-bus processors, 16GB of RAM, and 876GB of storage in a
RAID configuration, with an additional hot spare. This server
currently runs Linux kernel 2.4 and shoulders the majority of
the Unit's computational analyses. We have two secondary servers,
a Dell 530 Workstation with a single 1.7GHz i686 processor,
512MB of memory, and 80GB of storage, and a Dell PowerEdge 4100 with
a 200MHz processor, 256MB of memory, and 22GB of storage, as well as
a desktop workstation, all running Linux. Finally, we have a
Sony PCG-GRX700P notebook with i686 architecture, 2.20GHz,
512MB RAM, and a 60GB hard drive, running a notebook-optimized
version of the same Linux distribution. All five Linux machines are
configured such that they can read and write to the shared
Novell fileserver (Dell PowerEdge 2650, single processor
1.8GHz, 1MB RAM, with 76GB of storage) and each other's hard
of the computer network available for this project consists of
approximately 14 Dell desktop PCs running Microsoft Windows
connected by the Novell print and fileserver mentioned