4 The `bdgpeakcall` command is part of the MACS3 suite of tools and is
5 used to call peaks from a single bedGraph track for scores.
7 ## Detailed Description
8 The `bdgpeakcall` command takes an input bedGraph file, a cutoff
9 value, and the options to control peak width, then produces an output
10 file with peaks called. This tool can be used as a generic peak caller
11 that can be applied to any scoring system in a BedGraph file, no
12 matter the score is the pileup, the p-value, or the fold change -- or
13 any other measurement to decide the 'significancy' of the genomic
16 Note: All regions on the same chromosome in the bedGraph file should
19 ## Command Line Options
21 Here is a brief overview of the command line options for `bdgpeakcall`:
23 - `-i` or `--ifile`: MACS score, or any numerical measurement score in
24 bedGraph. The only assumption on the score is that higher the score
25 is, more significant the genomic loci is. REQUIRED
26 - `-c` or `--cutoff`: Cutoff depends on which method you used for the
27 score track. If the file contains -log10(p-value) scores from
28 MACS3, score 5 means pvalue 1e-5. Regions with signals lower than
29 the cutoff will not be considered as enriched regions. DEFAULT: 5
30 - `-l` or `--min-length`: Minimum length of peak, better to set it as
31 d value if we will deal with MACS3 generated scores. DEFAULT: 200
32 - `-g` or `--max-gap`: Maximum gap between significant points in a
33 peak, better to set it as the tag size if we will deal with MACS3
34 generated scores. DEFAULT: 30
35 - `--cutoff-analysis`: While set, bdgpeakcall will analyze the number
36 or total length of peaks that can be called by different cutoff
37 then output a summary table to help the user decide a better
38 cutoff. Note, minlen and maxgap may affect the results. Also, if
39 this option is on, `bdgpeakcall` will analyze the outcome of
40 different cutoff values and then quit without calling any peaks.
42 - `--no-trackline`: Tells MACS not to include a trackline with
43 bedGraph files. The trackline is used by UCSC for the options for
45 - `--verbose`: Set the verbose level of runtime messages. 0: only show
46 critical messages, 1: show additional warning messages, 2: show
47 process information, 3: show debug messages. DEFAULT: 2
48 - `--outdir`: If specified, all output files will be written to that
49 directory. Default: the current working directory
50 - `-o` or `--ofile`: Output file name. Mutually exclusive with
52 - `--o-prefix`: Output file prefix. Mutually exclusive with
58 Here is an example of how to use the `bdgpeakcall` command:
61 macs3 bdgpeakcall -i input.bedGraph -o output.narrowPeak --cutoff 1.0 --minlen 500 --maxgap 1000
64 In this example, the program will call peaks from the `input.bedGraph`
65 file and write the result to `output.narrowPeak`. The cutoff for
66 calling peaks is set to 1.0, the minimum length of peaks is set to
67 500, and the maximum gap between peaks is set to 1000.
71 The cutoff analysis function is provided by `--cutoff-analysis` option
72 in `callpeak`, `bdgpeakcall`, and `hmmratac`. However, the function is
73 `bdgpeakcall` is more flexible and can be applied on any scoring
74 scheme. We will separate this function into a dedicated subcommand in
77 Please note that if this `--cutoff-anlaysis` option is on, the
78 `bdgpeakcall` won't write any results of the peaks into narrowPeak
79 format file, ignoring `-c` you specified. Instead, it will write a
80 cutoff analysis report (`-o`) and quit.
82 When the option is on, we will first calculate the window of step for
83 increasing the cutoff from the lowest (`min_v`) to the highest
84 (`max_v`), using the `--cutoff-analysis-steps`:
86 `win_step = (max_v-min_v)/steps`.
88 Then for each cutoff we plan to investigate, we will check the number
89 of peaks that can be called, their average peak length, and their
90 total length. The smallest cutoff (close to `min_v`) and any cutoff
91 that can't lead to any peak will be excluded in the final report.
93 The report consists of four columns:
95 1. score: the possible fold change cutoff value.
96 2. npeaks: the number of peaks under this cutoff.
97 3. lpeaks: the total length of all peaks.
98 4. avelpeak: the average length of peaks.
100 While there's no universal rule to suggest the best cutoff, here are a
103 - You can use elbow analysis to find the cutoff that dramatically
104 change the trend of npeaks, lpeaks, or avelpeak. But you need to
105 think about how to define 'dramatical change'.
106 - You can use some common expectation to decide the cutoff. For
107 example, the number of peaks should be thousands/ or the avelpeak
108 should be around 500bps. Of course, it's arbitrary but the table
109 will give you some insight.