<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<meta http-equiv="Content-Style-Type" content="text/css">
<title></title>
<meta name="Generator" content="Cocoa HTML Writer">
<meta name="CocoaVersion" content="949.43">
<style type="text/css">
p.p1 {margin: 0.0px 0.0px 0.0px 0.0px; font: 18.0px Helvetica}
p.p2 {margin: 0.0px 0.0px 0.0px 0.0px; font: 12.0px Helvetica; min-height: 14.0px}
p.p3 {margin: 0.0px 0.0px 0.0px 0.0px; font: 12.0px Helvetica}
p.p4 {margin: 0.0px 0.0px 0.0px 0.0px; font: 9.0px Monaco}
p.p5 {margin: 0.0px 0.0px 0.0px 0.0px; font: 9.0px Monaco; color: #ad140d}
p.p6 {margin: 0.0px 0.0px 0.0px 0.0px; font: 9.0px Monaco; color: #606060}
p.p7 {margin: 0.0px 0.0px 0.0px 0.0px; font: 9.0px Monaco; min-height: 12.0px}
p.p8 {margin: 0.0px 0.0px 0.0px 0.0px; font: 9.0px Monaco; color: #ad140d; min-height: 12.0px}
p.p9 {margin: 0.0px 0.0px 0.0px 33.0px; font: 12.0px Arial}
span.s1 {color: #001bb9}
span.s2 {color: #ad140d}
span.s3 {color: #000000}
span.s4 {color: #2c7014}
span.s5 {text-decoration: underline ; color: #2c7014}
span.Apple-tab-span {white-space:pre}
ul.ul1 {list-style-type: disc}
ul.ul2 {list-style-type: circle}
</style>
</head>
<body>
<p class="p1"><b>Onsets<span class="Apple-tab-span"> </span><span class="Apple-tab-span"> </span>Onset detector</b></p>
<p class="p2"><br></p>
<p class="p3"><span class="Apple-tab-span"> </span><b>Onsets.kr(chain, threshold, odftype)</b></p>
<p class="p2"><br></p>
<p class="p3">An onset detector for musical audio signals: it detects the beginnings of notes, drum beats, etc. It outputs a control-rate trigger signal which is 1 when an onset is detected, and 0 otherwise.</p>
<p class="p2"><br></p>
<p class="p3"><b>chain</b> - an <a href="SC://FFT"><span class="s1">FFT</span></a> chain</p>
<p class="p3"><b>threshold</b> - the detection threshold, typically between 0 and 1, although in rare cases you may find values outside this range useful</p>
<p class="p3"><b>odftype</b> - the onset detection function used to analyse the signal (options described below; OK to leave this at its default value)</p>
<p class="p2"><br></p>
<p class="p3">For the FFT chain, you should typically use a frame size of 512 or 1024 (at 44.1 kHz sampling rate) and 50% hop size (which is the default setting in SC). For different sampling rates, choose an FFT size that covers a similar time-span (around 10 to 20 ms).</p>
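<p class="p2"><br></p>
<p class="p3">As a rough check, the time-span of one frame is the frame size divided by the sampling rate, so 512 samples at 44.1 kHz last about 11.6 ms and 1024 samples about 23.2 ms. The following sketch prints the span for a few candidate sizes. The 96 kHz rate is only an assumed example (there, 1024 or 2048 samples come closest to the suggested range), so substitute your own:</p>
<p class="p2"><br></p>
<p class="p4">(</p>
<p class="p5">// Assumed sampling rate, for illustration only</p>
<p class="p4">var sr = 96000;</p>
<p class="p4">[512, 1024, 2048, 4096].do { |frameSize|</p>
<p class="p5"><span class="Apple-tab-span"> </span>// Frame duration in milliseconds, rounded to one decimal place</p>
<p class="p4"><span class="Apple-tab-span"> </span>"% samples at % Hz = % ms".format(frameSize, sr, (frameSize / sr * 1000).round(0.1)).postln;</p>
<p class="p4">};</p>
<p class="p4">)</p>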
<p class="p2"><br></p>
<p class="p3">The onset detection should work well for a general range of monophonic and polyphonic audio signals. The onset detection is purely based on signal analysis and does not make use of any "top-down" inferences such as tempo.</p>
<p class="p2"><br></p>
<p class="p2"><br></p>
<p class="p3"><b>Example</b></p>
<p class="p2"><br></p>
<p class="p4">(</p>
<p class="p4">s.boot.doWhenBooted {</p>
<p class="p5"><span class="Apple-tab-span"> </span>// Prepare the buffers</p>
<p class="p4"><span class="s2"><span class="Apple-tab-span"> </span></span>b = <span class="s1">Buffer</span>.alloc(s, 512);</p>
<p class="p5"><span class="s3"><span class="Apple-tab-span"> </span></span>// Feel free to load a more interesting clip!</p>
<p class="p5"><span class="Apple-tab-span"> </span>// a11wlk01 is not an ideal example of musical onsets.</p>
<p class="p6"><span class="s2"><span class="Apple-tab-span"> </span></span><span class="s3">d = </span><span class="s1">Buffer</span><span class="s3">.read(s, </span>"sounds/a11wlk01.wav"<span class="s3">);</span></p>
<p class="p4">};</p>
<p class="p4">)</p>
<p class="p7"><br></p>
<p class="p5">////////////////////////////////////////////////////////////////////////////////////////////////</p>
<p class="p5">// Move the mouse to vary the threshold</p>
<p class="p4">(</p>
<p class="p4">x = {</p>
<p class="p4"><span class="Apple-tab-span"> </span><span class="s1">var</span> sig, chain, onsets, pips;</p>
<p class="p7"><span class="Apple-tab-span"> </span></p>
<p class="p5"><span class="s3"><span class="Apple-tab-span"> </span></span>// A simple generative signal</p>
<p class="p4"><span class="Apple-tab-span"> </span>sig = <span class="s1">LPF</span>.ar(<span class="s1">Pulse</span>.ar(<span class="s1">TIRand</span>.kr(63,75,<span class="s1">Impulse</span>.kr(2)).midicps), <span class="s1">LFNoise2</span>.kr(0.5).exprange(100, 10000)) * <span class="s1">Saw</span>.ar(2).range(0, 1);</p>
<p class="p5"><span class="s3"><span class="Apple-tab-span"> </span></span>// or, uncomment this line if you want to play the buffer in</p>
<p class="p5"><span class="s3"><span class="Apple-tab-span"> </span></span>//sig = PlayBuf.ar(1, d, BufRateScale.kr(d), loop: 1);</p>
<p class="p7"><span class="Apple-tab-span"> </span></p>
<p class="p4"><span class="Apple-tab-span"> </span>chain = <span class="s1">FFT</span>(b, sig);</p>
<p class="p7"><span class="Apple-tab-span"> </span></p>
<p class="p4"><span class="Apple-tab-span"> </span>onsets = <span class="s1">Onsets</span>.kr(chain, <span class="s1">MouseX</span>.kr(0,1), <span class="s4">\rcomplex</span>);</p>
<p class="p7"><span class="Apple-tab-span"> </span></p>
<p class="p5"><span class="s3"><span class="Apple-tab-span"> </span></span>// You'll hear percussive "ticks" whenever an onset is detected</p>
<p class="p4"><span class="Apple-tab-span"> </span>pips = <span class="s1">WhiteNoise</span>.ar(<span class="s1">EnvGen</span>.kr(<span class="s1">Env</span>.perc(0.001, 0.1, 0.2), onsets));</p>
<p class="p4"><span class="Apple-tab-span"> </span><span class="s1">Out</span>.ar(0, <span class="s1">Pan2</span>.ar(sig, -0.75, 0.2) + <span class="s1">Pan2</span>.ar(pips, 0.75, 1));</p>
<p class="p4">}.play;</p>
<p class="p4">)</p>
<p class="p5"><span class="s3">x.free; </span>// Free the synth</p>
<p class="p7"><br></p>
<p class="p7"><br></p>
<p class="p7"><br></p>
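<p class="p5">////////////////////////////////////////////////////////////////////////////////////////////////</p>
<p class="p5">// The trigger can also be reported to the language rather than sonified.</p>
<p class="p5">// A sketch only: it assumes your SuperCollider version provides SendReply and OSCdef,</p>
<p class="p5">// and that b is still allocated from the first block. The '/onset' path and the</p>
<p class="p5">// \onsetPrinter key are arbitrary names chosen for this example.</p>
<p class="p4">(</p>
<p class="p5">// Post a line for each onset; msg[3] carries the value sent by SendReply below</p>
<p class="p4">OSCdef(\onsetPrinter, { |msg|</p>
<p class="p4"><span class="Apple-tab-span"> </span>"onset detected, threshold was %".format(msg[3].round(0.01)).postln;</p>
<p class="p4">}, '/onset');</p>
<p class="p7"><br></p>
<p class="p4">x = {</p>
<p class="p4"><span class="Apple-tab-span"> </span>var sig, chain, thresh, onsets;</p>
<p class="p4"><span class="Apple-tab-span"> </span>sig = LPF.ar(Pulse.ar(TIRand.kr(63,75,Impulse.kr(2)).midicps), LFNoise2.kr(0.5).exprange(100, 10000)) * Saw.ar(2).range(0, 1);</p>
<p class="p4"><span class="Apple-tab-span"> </span>chain = FFT(b, sig);</p>
<p class="p5"><span class="Apple-tab-span"> </span>// Move the mouse to vary the threshold</p>
<p class="p4"><span class="Apple-tab-span"> </span>thresh = MouseX.kr(0, 1);</p>
<p class="p4"><span class="Apple-tab-span"> </span>onsets = Onsets.kr(chain, thresh, \rcomplex);</p>
<p class="p5"><span class="Apple-tab-span"> </span>// Send the current threshold to the language each time an onset fires</p>
<p class="p4"><span class="Apple-tab-span"> </span>SendReply.kr(onsets, '/onset', thresh);</p>
<p class="p4"><span class="Apple-tab-span"> </span>Pan2.ar(sig, 0, 0.2);</p>
<p class="p4">}.play;</p>
<p class="p4">)</p>
<p class="p5"><span class="s3">x.free; OSCdef(\onsetPrinter).free; </span>// Free the synth and the responder</p>
<p class="p7"><br></p>
<p class="p7"><br></p>
<p class="p7"><br></p>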
<p class="p5">////////////////////////////////////////////////////////////////////////////////////////////////</p>
<p class="p5">// Or we could use multichannel expansion to run a series of different thresholds at the same time,<span class="Apple-converted-space"> </span></p>
<p class="p5">// to sonify the effect of the threshold value.</p>
<p class="p5">// A little hard to listen to at first: try to identify the pitch at which the best sort of<span class="Apple-converted-space"> </span></p>
<p class="p5">// detection is happening.</p>
<p class="p5">// You'll hear "bobbling" at low pitches, where the threshold is definitely too low.</p>
<p class="p8"><br></p>
<p class="p4">(</p>
<p class="p4">var threshes = (0.1, 0.2 .. 1);</p>
<p class="p4">x = {</p>
<p class="p4"><span class="Apple-tab-span"> </span><span class="s1">var</span> sig, chain, onsets, pips;</p>
<p class="p7"><span class="Apple-tab-span"> </span></p>
<p class="p5"><span class="s3"><span class="Apple-tab-span"> </span></span>// A simple generative signal</p>
<p class="p4"><span class="Apple-tab-span"> </span>sig = <span class="s1">LPF</span>.ar(<span class="s1">Pulse</span>.ar(<span class="s1">TIRand</span>.kr(63,75,<span class="s1">Impulse</span>.kr(2)).midicps), <span class="s1">LFNoise2</span>.kr(0.5).exprange(100, 10000)) * <span class="s1">Saw</span>.ar(2).range(0, 1);</p>
<p class="p5"><span class="s3"><span class="Apple-tab-span"> </span></span>// or, uncomment this line if you want to play the buffer in</p>
<p class="p5"><span class="s3"><span class="Apple-tab-span"> </span></span>//sig = PlayBuf.ar(1, d, BufRateScale.kr(d), loop: 1);</p>
<p class="p7"><span class="Apple-tab-span"> </span></p>
<p class="p4"><span class="Apple-tab-span"> </span>chain = <span class="s1">FFT</span>(b, sig);</p>
<p class="p7"><span class="Apple-tab-span"> </span></p>
<p class="p4"><span class="Apple-tab-span"> </span>onsets = <span class="s1">Onsets</span>.kr(chain, threshes, <span class="s4">\rcomplex</span>);</p>
<p class="p7"><span class="Apple-tab-span"> </span></p>
<p class="p5"><span class="s3"><span class="Apple-tab-span"> </span></span>// Generate pips at a variety of pitches</p>
<p class="p4"><span class="Apple-tab-span"> </span>pips = <span class="s1">SinOsc</span>.ar((threshes).linexp(0, 1, 440, 3520), 0, <span class="s1">EnvGen</span>.kr(<span class="s1">Env</span>.perc(0.001, 0.1, 0.5), onsets)).mean;</p>
<p class="p4"><span class="Apple-tab-span"> </span><span class="s1">Out</span>.ar(0, <span class="s1">Pan2</span>.ar(sig, -0.75, 0.2) + <span class="s1">Pan2</span>.ar(pips, 0.75, 1));</p>
<p class="p4">}.play;</p>
<p class="p4">)</p>
<p class="p5"><span class="s3">x.free; </span>// Free the synth</p>
<p class="p7"><br></p>
<p class="p7"><br></p>
<p class="p5"><span class="s3">[b,d].do(</span><span class="s1">_</span><span class="s3">.free); </span>// Free the buffers</p>
<p class="p2"><br></p>
<p class="p2"><br></p>
<p class="p3">The <b>odftype</b> argument chooses which <i>onset detection function</i> is used. In many cases the default will be fine. The following choices are available:</p>
<p class="p2"><br></p>
<ul class="ul1">
<li style="margin: 0.0px 0.0px 0.0px 0.0px; font: 12.0px Helvetica"><span class="s4">\power</span><span class="Apple-converted-space">    </span>- generally OK, good for percussive input, and also very efficient</li>
<li style="margin: 0.0px 0.0px 0.0px 0.0px; font: 12.0px Helvetica"><span class="s4">\magsum</span><span class="Apple-converted-space">    </span>- generally OK, good for percussive input, and also very efficient</li>
<li style="margin: 0.0px 0.0px 0.0px 0.0px; font: 12.0px Helvetica"><span class="s4">\complex</span><span class="Apple-converted-space">  </span>- performs generally very well, but more CPU-intensive</li>
<li style="margin: 0.0px 0.0px 0.0px 0.0px; font: 12.0px Helvetica"><span class="s4">\rcomplex</span> - performs generally very well, and slightly more efficient than <span class="s4">\complex</span></li>
<li style="margin: 0.0px 0.0px 0.0px 0.0px; font: 12.0px Helvetica"><span class="s4">\phase</span> <span class="Apple-converted-space">  </span>- generally good, especially for tonal input, medium efficiency</li>
<li style="margin: 0.0px 0.0px 0.0px 0.0px; font: 12.0px Helvetica"><span class="s4">\wphase</span> <span class="Apple-converted-space">  </span>- generally very good, especially for tonal input, medium efficiency</li>
<li style="margin: 0.0px 0.0px 0.0px 0.0px; font: 12.0px Helvetica"><span class="s4">\mkl</span><span class="Apple-converted-space">      </span>- generally very good, medium efficiency, pretty different from the other methods</li>
</ul>
<p class="p2"><br></p>
<p class="p3">Which of these should you choose? The differences aren't large, so I'd recommend you stick with the default <span class="s4">\rcomplex</span> unless you find specific problems with it. If you do, try <span class="s4">\wphase</span> next. The <span class="s4">\mkl</span> type works rather differently from the others, so it is worth trying too. They all have slightly different characteristics, and in tests they perform at a similar quality level.</p>
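<p class="p2"><br></p>
<p class="p3">One way to audition the alternatives is to wrap the test signal in a small function and pass the detection function in as a symbol. This is only a sketch: <b>~tryOdf</b> is a hypothetical helper name, the threshold of 0.5 is an arbitrary starting point, and it assumes a running server and a SuperCollider version that provides LocalBuf.</p>
<p class="p2"><br></p>
<p class="p4">(</p>
<p class="p5">// Hypothetical helper: plays the test signal plus "ticks" using the given odftype</p>
<p class="p4">~tryOdf = { |odftype = \rcomplex, threshold = 0.5|</p>
<p class="p4"><span class="Apple-tab-span"> </span>{</p>
<p class="p4"><span class="Apple-tab-span"> </span><span class="Apple-tab-span"> </span>var sig, chain, onsets, pips;</p>
<p class="p4"><span class="Apple-tab-span"> </span><span class="Apple-tab-span"> </span>sig = LPF.ar(Pulse.ar(TIRand.kr(63,75,Impulse.kr(2)).midicps), LFNoise2.kr(0.5).exprange(100, 10000)) * Saw.ar(2).range(0, 1);</p>
<p class="p5"><span class="Apple-tab-span"> </span><span class="Apple-tab-span"> </span>// A synth-local FFT buffer, so each trial is independent</p>
<p class="p4"><span class="Apple-tab-span"> </span><span class="Apple-tab-span"> </span>chain = FFT(LocalBuf(512), sig);</p>
<p class="p4"><span class="Apple-tab-span"> </span><span class="Apple-tab-span"> </span>onsets = Onsets.kr(chain, threshold, odftype);</p>
<p class="p4"><span class="Apple-tab-span"> </span><span class="Apple-tab-span"> </span>pips = WhiteNoise.ar(EnvGen.kr(Env.perc(0.001, 0.1, 0.2), onsets));</p>
<p class="p4"><span class="Apple-tab-span"> </span><span class="Apple-tab-span"> </span>Pan2.ar(sig, -0.75, 0.2) + Pan2.ar(pips, 0.75, 1);</p>
<p class="p4"><span class="Apple-tab-span"> </span>}.play;</p>
<p class="p4">};</p>
<p class="p4">)</p>
<p class="p5"><span class="s3">x = ~tryOdf.(\wphase); </span>// Listen, then free before trying the next one</p>
<p class="p4">x.free;</p>
<p class="p4">x = ~tryOdf.(\mkl);</p>
<p class="p4">x.free;</p>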
<p class="p2"><br></p>
<p class="p3">For more details of all the processes involved, the different <i>onset detection functions</i>, and their evaluation, see</p>
<p class="p2"><br></p>
<p class="p9">D. Stowell and M. D. Plumbley. <a href="http://www.elec.qmul.ac.uk/digitalmusic/papers/2007/StowellPlumbley07-icmc.pdf"><span class="s5">Adaptive whitening for improved real-time audio onset detection</span></a>. <i>Proceedings of the International Computer Music Conference (ICMC’07)</i>, Copenhagen, Denmark, August 2007.</p>
<p class="p2"><br></p>
<p class="p2"><br></p>
<p class="p3"><b>Advanced features</b></p>
<p class="p2"><br></p>
<p class="p3">Further options are available, which you are welcome to explore if you want. They are numbers that modulate the behaviour of the onset detector:</p>
<p class="p2"><br></p>
<ul class="ul1">
<li style="margin: 0.0px 0.0px 0.0px 0.0px; font: 12.0px Helvetica"><b>relaxtime</b> and <b>floor</b> are parameters to the whitening process used, a kind of normalisation of the FFT signal. (Note: in <span class="s4">\mkl</span> mode these are not used.)
<ul class="ul2">
<li style="margin: 0.0px 0.0px 0.0px 0.0px; font: 12.0px Helvetica"><b>relaxtime</b> specifies the time (in seconds) for the normalisation to "forget" about a recent onset. If you find too much re-triggering (e.g. as a note dies away unevenly) then you might wish to increase this value.</li>
<li style="margin: 0.0px 0.0px 0.0px 0.0px; font: 12.0px Helvetica"><b>floor</b> is a lower limit, connected to the idea of how quiet the sound is expected to get without becoming indistinguishable from noise. For some cleanly-recorded classical music with wide dynamic variations, I found it helpful to go down as far as 0.000001.</li>
</ul>
</li>
<li style="margin: 0.0px 0.0px 0.0px 0.0px; font: 12.0px Helvetica"><b>mingap</b> specifies a minimum gap (in FFT frames) between onset detections, a brute-force way to prevent too many doubled detections.</li>
<li style="margin: 0.0px 0.0px 0.0px 0.0px; font: 12.0px Helvetica"><b>medianspan</b> specifies the size (in FFT frames) of the median window used for smoothing the detection function before triggering.</li>
</ul>
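<p class="p2"><br></p>
<p class="p3">As a sketch of how these options might be passed, the following assumes they are accepted as keyword arguments of <b>Onsets.kr</b> under the names listed above, and that your SuperCollider version provides LocalBuf. The values are illustrative, not recommendations. For scale, with a 512-sample frame and 50% hop at 44.1 kHz each step is 256 samples (about 5.8 ms), so a <b>mingap</b> of 20 would be roughly 116 ms, assuming one frame arrives per hop.</p>
<p class="p2"><br></p>
<p class="p4">(</p>
<p class="p4">x = {</p>
<p class="p4"><span class="Apple-tab-span"> </span>var sig, chain, onsets, pips;</p>
<p class="p4"><span class="Apple-tab-span"> </span>sig = LPF.ar(Pulse.ar(TIRand.kr(63,75,Impulse.kr(2)).midicps), LFNoise2.kr(0.5).exprange(100, 10000)) * Saw.ar(2).range(0, 1);</p>
<p class="p4"><span class="Apple-tab-span"> </span>chain = FFT(LocalBuf(512), sig);</p>
<p class="p5"><span class="Apple-tab-span"> </span>// Illustrative values: a longer relaxtime and a larger mingap to discourage re-triggering</p>
<p class="p4"><span class="Apple-tab-span"> </span>onsets = Onsets.kr(chain, 0.5, \rcomplex, relaxtime: 2, floor: 0.1, mingap: 20, medianspan: 11);</p>
<p class="p4"><span class="Apple-tab-span"> </span>pips = WhiteNoise.ar(EnvGen.kr(Env.perc(0.001, 0.1, 0.2), onsets));</p>
<p class="p4"><span class="Apple-tab-span"> </span>Out.ar(0, Pan2.ar(sig, -0.75, 0.2) + Pan2.ar(pips, 0.75, 1));</p>
<p class="p4">}.play;</p>
<p class="p4">)</p>
<p class="p5"><span class="s3">x.free; </span>// Free the synth</p>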
</body>
</html>