.TH PEAR 1 "15 Jan 2015" "PEAR 0.9.6" "PEAR manual"
PEAR - Paired-end reads merger
\fBPEAR\fR is a paired-end read merger for the Illumina platform.
\fBPEAR\fR evaluates all possible paired-end read overlaps and does
not require the target fragment size as input. It also implements
a statistical test for minimizing false-positive results. The highly optimized
and parallelized implementation allows for merging millions of paired-end reads
within a few minutes on a standard desktop computer.
Invoke \fBPEAR\fR from the prompt of your command interpreter as follows:
shell> \fBpear \-f\fR \fIforward-fastq\fR \fB\-r\fR \fIreverse-fastq\fR \fB\-o\fR \fIoutput\fR
\fB\-f\fR, \fB\-\-forward\-fastq=\fIFILENAME\fR\fR
Forward paired-end FASTQ file
\fB\-r\fR, \fB\-\-reverse\-fastq=\fIFILENAME\fR\fR
Reverse paired-end FASTQ file
\fB\-o\fR, \fB\-\-output=\fIFILENAME\fR\fR
\fB\-p\fR, \fB\-\-p\-value=\fIPVALUE\fR\fR
Specify the value \fIPVALUE\fR as the p-value cut-off for the statistical test. If the computed
p-value of a possible merging exceeds the specified p-value, the paired-end read will
not be merged. Valid values are: \fB0.0001\fR, \fB0.001\fR, \fB0.01\fR, \fB0.05\fR and
\fB1.0\fR. Setting \fB1.0\fR disables the test. (default: \fB0.01\fR)
\fB\-v\fR, \fB\-\-min\-overlap=\fIVALUE\fR\fR
Set \fIVALUE\fR as the minimum overlap size. The minimum overlap may be set to \fB1\fR when
the statistical test is used. However, further restricting the minimum overlap size to a
proper value may reduce false-positive assemblies. (default: \fB10\fR)
\fB\-m\fR, \fB\-\-max\-assembly\-length=\fIVALUE\fR\fR
Set \fIVALUE\fR as the maximum possible length of the assembled sequences. Setting this
value to \fB0\fR disables the restriction and assembled sequences may be arbitrarily long. (default: \fB0\fR)
\fB\-n\fR, \fB\-\-min\-assembly\-length=\fIVALUE\fR\fR
Set \fIVALUE\fR as the minimum possible length of the assembled sequences. Setting this
value to \fB0\fR disables the restriction and assembled sequences may be arbitrarily short. (default: \fB0\fR)
\fB\-t\fR, \fB\-\-min\-trim\-length=\fIVALUE\fR\fR
Sets the minimum length of reads after trimming the low quality part (see option \fB\-q\fR) to \fIVALUE\fR.
\fB\-q\fR, \fB\-\-quality\-threshold=\fIVALUE\fR\fR
Sets the quality score threshold for trimming the low quality part of a read to \fIVALUE\fR. If the quality scores
of two consecutive bases are strictly less than the specified threshold, the rest of the read will
be trimmed. (default: \fB0\fR)
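.PP
The trimming rule described for \fB\-q\fR can be illustrated with a minimal Python sketch. It is not PEAR's implementation: the function name trim_low_quality is hypothetical, and the sketch assumes that the cut is made at the first base of the first low-quality pair and that quality characters are decoded with the PHRED base given by \fB\-b\fR.
.PP
.nf
```python
def trim_low_quality(seq, qual, threshold, phred_base=33):
    """Trim a read at the first pair of consecutive low-quality bases.

    seq and qual are the sequence and quality strings of one FASTQ record.
    """
    scores = [ord(ch) - phred_base for ch in qual]
    for i in range(len(scores) - 1):
        # Both bases must be strictly below the threshold.
        if scores[i] < threshold and scores[i + 1] < threshold:
            # Assumption: the cut starts at the first base of the pair.
            return seq[:i], qual[:i]
    return seq, qual


# "#" decodes to 2 and "I" to 40 with a PHRED base of 33.
print(trim_low_quality("ACGTACGT", "IIII##II", 10))  # ('ACGT', 'IIII')
```
.fi
.PP
With the default threshold of \fB0\fR no score is strictly below the threshold, so the sketch leaves every read untouched.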
\fB\-u\fR, \fB\-\-max\-uncalled\-base=\fIVALUE\fR\fR
Sets the maximal proportion of uncalled bases in a read to \fIVALUE\fR. Setting this value to
\fB0\fR will cause \fBPEAR\fR to discard all reads that contain uncalled bases. The other extreme
setting is \fB1\fR, which causes \fBPEAR\fR to process all reads regardless of the number of
uncalled bases. (default: \fB1\fR)
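.PP
The effect of \fB\-u\fR can be sketched in Python as follows. This is an illustration only: the function name passes_uncalled_filter is hypothetical, and the sketch assumes that uncalled bases are written as N and that a read whose proportion equals the limit is still kept.
.PP
.nf
```python
def passes_uncalled_filter(seq, max_uncalled=1.0):
    """Return True if the read may be processed under the given limit."""
    if not seq:
        return True
    # Assumption: an uncalled base is encoded as the letter N.
    proportion = seq.upper().count("N") / len(seq)
    return proportion <= max_uncalled


print(passes_uncalled_filter("ACGNNACG", 0.2))   # False (2 of 8 bases are N)
print(passes_uncalled_filter("ACGNNACG", 1.0))   # True
print(passes_uncalled_filter("ACGNNACG", 0.0))   # False
```
.fi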
\fB\-g\fR, \fB\-\-test\-method=\fITYPE\fR\fR
Specifies the type of statistical test. Two options are available, \fB1\fR and \fB2\fR. (default: \fB1\fR)
\fB1\fR: Given the minimum allowed overlap, test using the highest OES. Note that due to its
discrete nature, this test usually yields a lower p-value for the assembled read than the cut-off (specified by \fB\-p\fR).
For example, when the cut-off is set to \fB0.05\fR with this test, the assembled reads might have an actual p-value
\fB2\fR: Use the maximal acceptance probability (m.a.p.). This test method computes the same probability as test method \fB1\fR. However,
it assumes that the minimal overlap is the observed overlap with the highest OES, instead of the one
specified by \fB\-v\fR. Therefore, this is not a valid statistical test and the 'p\-value' is in fact the
maximal probability for accepting the assembly. Nevertheless, in practice, test \fB2\fR can correctly assemble
more reads with only a slightly higher false-positive rate when the actual overlap sizes are relatively small.
\fB\-e\fR, \fB\-\-empirical\-freqs\fR
Disable empirical base frequencies. (default: use empirical base frequencies)
\fB\-s\fR, \fB\-\-score\-method=\fIMETHOD\fR\fR
Specify the scoring method. Three options are available, \fB1\fR, \fB2\fR and \fB3\fR. (default: \fB2\fR)
\fB1\fR: OES with +1 for a match and \-1 for a mismatch
\fB2\fR: Assembly score (AS). Use +1 for a match and \-1 for a mismatch, multiplied by base quality scores
\fB3\fR: Ignore quality scores and use +1 for a match and \-1 for a mismatch
\fB\-b\fR, \fB\-\-phred\-base=\fIVALUE\fR\fR
Sets the base PHRED quality score to \fIVALUE\fR. (default: \fB33\fR)
\fB\-y\fR, \fB\-\-memory=\fISIZE\fR\fR
Specifies the amount of memory to be used. The number may be followed by one of the letters \fBK\fR, \fBM\fR, or \fBG\fR
denoting Kilobytes, Megabytes and Gigabytes, respectively. Bytes are assumed in case no letter is specified. (default: \fB200M\fR)
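.PP
The \fISIZE\fR syntax accepted by \fB\-y\fR can be illustrated with a short Python sketch. The function name parse_memory_size is hypothetical, and the sketch assumes binary multiples (1K = 1024 bytes); this page does not state whether \fBPEAR\fR uses 1000 or 1024.
.PP
.nf
```python
# Assumption: suffixes denote binary multiples of a byte.
UNITS = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}


def parse_memory_size(text):
    """Convert a SIZE string such as "200M" into a number of bytes."""
    text = text.strip()
    if text and text[-1] in UNITS:
        return int(text[:-1]) * UNITS[text[-1]]
    # No suffix letter: the value is taken as bytes.
    return int(text)


print(parse_memory_size("200M"))  # 209715200
print(parse_memory_size("4096"))  # 4096
```
.fi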
\fB\-j\fR, \fB\-\-threads=\fITHREADS\fR\fR
Use \fITHREADS\fR threads.
\fB\-c\fR, \fB\-\-cap=\fIVALUE\fR\fR
Specify the upper bound for the resulting quality score. If set to zero, capping is disabled. (default: \fB40\fR)
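.PP
The capping behaviour of \fB\-c\fR amounts to the following Python sketch. The function name cap_quality is hypothetical and the sketch only restates the rule above; how \fBPEAR\fR computes the uncapped score is not covered here.
.PP
.nf
```python
def cap_quality(score, cap=40):
    """Limit a merged quality score to the upper bound; 0 disables capping."""
    if cap == 0:
        return score
    return min(score, cap)


print(cap_quality(55))         # 40
print(cap_quality(55, cap=0))  # 55
```
.fi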
\fB\-z\fR, \fB\-\-nbase\fR
When merging a base pair that consists of two unequal bases, neither of which is degenerate, set the merged base to \fBN\fR and use the higher quality score of the two bases.
\fB\-h\fR, \fB\-\-help\fR
\fBTomas Flouri\fR <Tomas.Flouri@h\-its.org>
\fBJiajie Zhang\fR <Jiajie.Zhang@h\-its.org>
\fBKassian Kobert\fR <Kassian.Kobert@h\-its.org>
\fBAlexandros Stamatakis\fR <Alexandros.Stamatakis@h\-its.org>
Report \fBPEAR\fR bugs to \fBpear-users@googlegroups.com\fR
For more information about \fBPEAR\fR, see \fBhttp://www.exelixis-lab.org/web/software/pear\fR