4 The `bdgpeakcall` command is part of the MACS3 suite of tools and is
5 used to call peaks from a single bedGraph track for scores.
7 ## Detailed Description
8 The `bdgpeakcall` command takes an input bedGraph file, a cutoff
9 value, and the options to control peak width, then produces an output
10 file with peaks called. This tool can be used as a generic peak caller
11 that can be applied to any scoring system in a BedGraph file, no
12 matter the score is the pileup, the p-value, or the fold change -- or
13 any other measurement to decide the 'significancy' of the genomic
16 Note: All regions on the same chromosome in the bedGraph file should
19 ## Command Line Options
21 Here is a brief overview of the command line options for `bdgpeakcall`:
23 - `-i` or `--ifile`: MACS score, or any numerical measurement score in
24 bedGraph. The only assumption on the score is that higher the score
25 is, more significant the genomic loci is. REQUIRED
26 - `-c` or `--cutoff`: Cutoff depends on which method you used for the
27 score track. If the file contains -log10(p-value) scores from
28 MACS3, score 5 means pvalue 1e-5. Regions with signals lower than
29 the cutoff will not be considered as enriched regions. DEFAULT: 5
30 - `-l` or `--min-length`: Minimum length of peak, better to set it as
31 d value if we will deal with MACS3 generated scores. DEFAULT: 200
32 - `-g` or `--max-gap`: Maximum gap between significant points in a
33 peak, better to set it as the tag size if we will deal with MACS3
34 generated scores. DEFAULT: 30
35 - `--cutoff-analysis`: While set, bdgpeakcall will analyze the number
36 or total length of peaks that can be called by different cutoff
37 then output a summary table to help the user decide a better
38 cutoff. Note, minlen and maxgap may affect the results. Also, if
39 this option is on, `bdgpeakcall` will analyze the outcome of
40 different cutoff values and then quit without calling any peaks.
42 - `--cutoff-analysis-steps`: Steps for performing cutoff analysis. It
43 will be used to decide which cutoff value should be included in the
44 final report. Larger the value, higher resolution the cutoff
45 analysis can be. The cutoff analysis function will first find the
46 smallest (at least 0) and the largest (at most 1,000) score in the
47 bedGraph, then break the range of score into
48 `CUTOFF_ANALYSIS_STEPS` intervals. It will then use each score as
49 cutoff to call peaks and calculate the total number of candidate
50 peaks, the total basepairs of peaks, and the average length of peak
51 in basepair. Please note that the final report ideally should
52 include `CUTOFF_ANALYSIS_STEPS` rows, but in practice, if the
53 cutoff yield zero peak, the row for that value won't be included.
54 DEFAULT: 100", default = 100 )
55 - `--no-trackline`: Tells MACS not to include a trackline with
56 bedGraph files. The trackline is used by UCSC for the options for
58 - `--verbose`: Set the verbose level of runtime messages. 0: only show
59 critical messages, 1: show additional warning messages, 2: show
60 process information, 3: show debug messages. DEFAULT: 2
61 - `--outdir`: If specified, all output files will be written to that
62 directory. Default: the current working directory
63 - `-o` or `--ofile`: Output file name. Mutually exclusive with
65 - `--o-prefix`: Output file prefix. Mutually exclusive with
71 Here is an example of how to use the `bdgpeakcall` command:
74 macs3 bdgpeakcall -i input.bedGraph -o output.narrowPeak --cutoff 1.0 --minlen 500 --maxgap 1000
77 In this example, the program will call peaks from the `input.bedGraph`
78 file and write the result to `output.narrowPeak`. The cutoff for
79 calling peaks is set to 1.0, the minimum length of peaks is set to
80 500, and the maximum gap between peaks is set to 1000.
84 The cutoff analysis function is provided by `--cutoff-analysis` option
85 in `callpeak`, `bdgpeakcall`, and `hmmratac`. However, the function is
86 `bdgpeakcall` is more flexible and can be applied on any scoring
87 scheme. We will separate this function into a dedicated subcommand in
90 Please note that if this `--cutoff-anlaysis` option is on, the
91 `bdgpeakcall` won't write any results of the peaks into narrowPeak
92 format file, ignoring `-c` you specified. Instead, it will write a
93 cutoff analysis report (`-o`) and quit.
95 When the option is on, we will first calculate the window of step for
96 increasing the cutoff from the lowest (`min_v`) to the highest
97 (`max_v`), using the `--cutoff-analysis-steps`:
99 `win_step = (max_v-min_v)/steps`.
101 Then for each cutoff we plan to investigate, we will check the number
102 of peaks that can be called, their average peak length, and their
103 total length. The smallest cutoff (close to `min_v`) and any cutoff
104 that can't lead to any peak will be excluded in the final report.
106 The report consists of four columns:
108 1. score: the possible fold change cutoff value.
109 2. npeaks: the number of peaks under this cutoff.
110 3. lpeaks: the total length of all peaks.
111 4. avelpeak: the average length of peaks.
113 While there's no universal rule to suggest the best cutoff, here are a
116 - You can use elbow analysis to find the cutoff that dramatically
117 change the trend of npeaks, lpeaks, or avelpeak. But you need to
118 think about how to define 'dramatical change'.
119 - You can use some common expectation to decide the cutoff. For
120 example, the number of peaks should be thousands/ or the avelpeak
121 should be around 500bps. Of course, it's arbitrary but the table
122 will give you some insight.