.\" Copyright (c) 2000-2001 John H. Baldwin <jhb@FreeBSD.org>
.\" All rights reserved.
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" 1. Redistributions of source code must retain the above copyright
.\" notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\" notice, this list of conditions and the following disclaimer in the
.\" documentation and/or other materials provided with the distribution.
.\" THIS SOFTWARE IS PROVIDED BY THE DEVELOPERS ``AS IS'' AND ANY EXPRESS OR
.\" IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
.\" OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
.\" IN NO EVENT SHALL THE DEVELOPERS BE LIABLE FOR ANY DIRECT, INDIRECT,
.\" INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
.\" NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
.\" DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
.\" THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
.\" (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF
.\" THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
.Nm roundrobin_interval ,
.Nd perform round-robin scheduling of runnable processes
.Fn curpriority_cmp "struct proc *p"
.Fn maybe_resched "struct thread *td"
.Fn propagate_priority "struct proc *p"
.Fn resetpriority "struct ksegrp *kg"
.Fn roundrobin "void *arg"
.Fn roundrobin_interval "void"
.Fn sched_setup "void *dummy"
.Fn schedclock "struct thread *td"
.Fn schedcpu "void *arg"
.Fn setrunnable "struct thread *td"
.Fn updatepri "struct thread *td"
Each process has three different priorities stored in
member is the user priority of the process calculated from the process's
estimated CPU time and nice level.
member is the saved priority used by
.Fn propagate_priority .
When a process obtains a mutex, its priority is saved in
While it holds the mutex, the process's priority may be bumped by another
process that blocks on the mutex.
When the process releases the mutex, its priority is restored to the
member is the actual priority of the process and is used to determine what
it runs on, for example.
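.Pp
The save, bump, and restore cycle described above can be sketched in C.
This is a minimal illustration, not the kernel's code.
The structure, its field names, and the three helper functions are
hypothetical, and the sketch assumes that a numerically lower value means a
higher priority.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the three per-process priority fields. */
struct proc_pri {
	int user_pri;	/* derived from estimated CPU time and nice level */
	int saved_pri;	/* snapshot taken when a mutex is obtained */
	int active_pri;	/* priority actually used to pick a run queue */
};

/* Record the current priority when the process obtains a mutex. */
static void
pri_on_acquire(struct proc_pri *p)
{
	assert(p != NULL);
	p->saved_pri = p->active_pri;
}

/*
 * A process blocking on the mutex lends its priority to the owner.
 * Assumption: lower numeric value == higher priority.
 */
static void
pri_bump(struct proc_pri *owner, int waiter_pri)
{
	assert(owner != NULL);
	if (waiter_pri < owner->active_pri)
		owner->active_pri = waiter_pri;
}

/* Undo any bump when the mutex is released. */
static void
pri_on_release(struct proc_pri *p)
{
	assert(p != NULL);
	p->active_pri = p->saved_pri;
}
```

.Pp
In this sketch only the active priority changes while the mutex is held.
The user priority is left alone so that it can be recomputed independently.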
function compares the cached priority of the currently running process with
If the currently running process has a higher priority, it returns
a value less than zero.
If the current process has a lower priority, it returns a value
If the current process has the same priority as
The cached priority of the currently running process is updated when a process
or returns to userland in
and is stored in the private variable
function compares the priorities of the current thread and
has a higher priority than the current thread, then a context switch is
.Fn propagate_priority
looks at the process that owns the mutex
That process's priority is bumped to the priority of
If the process is currently running, then the function returns.
If the process is on a
then the process is moved to the appropriate
for its new priority.
If the process is blocked on a mutex, its position in the list of
processes blocked on the mutex in question is updated to reflect its new
Then, the function repeats the procedure using the process that owns the
mutex just encountered.
Note that a process's priorities are only bumped to the priority of the
not to the priority of the previously encountered process.
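.Pp
The chain walk described above can be sketched as follows.
This is an illustration under stated assumptions, not the kernel's
implementation.
All type and function names are hypothetical, a numerically lower value is
assumed to mean a higher priority, the walk is assumed to stop early when an
owner already has an equal or higher priority, and the run queue and
blocked-list reordering steps are elided.

```c
#include <assert.h>
#include <stddef.h>

enum pstate { P_RUNNING, P_ON_RUNQ, P_BLOCKED };

struct sproc;

/* Hypothetical mutex: only the owner matters for this sketch. */
struct smutex {
	struct sproc *owner;
};

/* Hypothetical process record. */
struct sproc {
	int pri;			/* lower value == higher priority */
	enum pstate state;
	struct smutex *blocked_on;	/* used when state == P_BLOCKED */
};

/*
 * Walk the chain of mutex owners starting at the mutex p is blocked on.
 * Every owner is bumped to p's priority, never to the priority of the
 * previously visited owner.  Returns the number of processes bumped.
 */
int
sketch_propagate(struct sproc *p)
{
	struct smutex *m;
	int pri, bumped = 0;

	assert(p != NULL);
	pri = p->pri;			/* fixed for the whole walk */
	m = p->blocked_on;
	while (m != NULL && m->owner != NULL) {
		struct sproc *owner = m->owner;

		if (owner->pri <= pri)
			break;		/* assumed early exit: no bump needed */
		owner->pri = pri;
		bumped++;
		if (owner->state != P_BLOCKED)
			break;		/* running or queued: chain ends here */
		m = owner->blocked_on;	/* follow the mutex just encountered */
	}
	return (bumped);
}
```

.Pp
Because the priority is captured once before the loop, a chain of three
processes ends with both owners at the priority of the first process.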
function recomputes the user priority of the ksegrp
to force a reschedule of each thread in the group if needed.
function is used as a
function to force a reschedule every
.Fn roundrobin_interval
function returns the number of clock ticks in between reschedules
Thus, all it does is return the current value of
that is called to start the callout driven scheduler functions.
functions for the first time.
After the initial call, the two functions will propagate themselves by
registering their callout event again at the completion of the respective
function is called by
to adjust the priority of the currently running thread's ksegrp.
It updates the group's estimated CPU time and then adjusts the priority via
function updates all process priorities.
First, it updates statistics that track how long processes have been in various
Secondly, it updates the estimated CPU time for the current process such
that about 90% of the CPU usage is forgotten in 5 * load average seconds.
For example, if the load average is 2.00,
then at least 90% of the estimated CPU time for the process should be based
on the amount of CPU time the process has had in the last 10 seconds.
It then recomputes the priority of the process and moves it to the
Thirdly, it updates the %CPU estimate used by utilities such as
so that 95% of the CPU usage is forgotten in 60 seconds.
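.Pp
The two forgetting rates quoted above can be checked with a short
calculation.
The text only states the targets, so the decay factors used here are
assumptions borrowed from the traditional BSD scheduler: the estimated CPU
time is assumed to be multiplied by (2 * load) / (2 * load + 1) once per
second, and the %CPU estimate by exp(-1/20) once per second.
The function and macro names are hypothetical.

```c
#include <assert.h>

/* Fraction of an old estimate that survives `steps` decay steps. */
double
residual(double factor, int steps)
{
	double r = 1.0;

	assert(factor >= 0.0 && factor <= 1.0 && steps >= 0);
	while (steps-- > 0)
		r *= factor;
	return (r);
}

/* Assumed per-second decay factor for the estimated CPU time. */
double
estcpu_decay(double loadavg)
{
	assert(loadavg >= 0.0);
	return ((2.0 * loadavg) / (2.0 * loadavg + 1.0));
}

/* Assumed per-second decay factor for %CPU, the value of exp(-1/20). */
#define	PCTCPU_DECAY	0.951229424500714
```

.Pp
Under these assumptions, a load average of 2.00 gives a factor of 0.8, and
residual(estcpu_decay(2.0), 10) is roughly 0.107, so about 90% of the old
estimate is gone after 10 seconds.
Likewise, residual(PCTCPU_DECAY, 60) is roughly 0.05, which matches the
95% figure.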
Once all process priorities have been updated,
to update various other statistics including the load average.
Finally, it schedules itself to run again in
function changes a process's state to runnable.
The process is placed on a
if needed, and the swapper process is woken up and told to swap the process in
if the process is swapped out.
If the process has been asleep for at least one run of
is used to adjust the priority of the process.
function adjusts the priority of a process that has been asleep.
It retroactively decays the estimated CPU time of the process for each
event that the process was asleep.
to adjust the priority of the process.
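.Pp
The retroactive decay can be pictured as replaying one decay step for every
periodic update the process slept through.
The sketch below is illustrative only.
The structure, the function names, and the integer decay formula
(2 * load * estcpu) / (2 * load + 1) are assumptions, not taken from the
kernel source.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical bookkeeping for a process that has been asleep. */
struct sleeper {
	unsigned int estcpu;	/* estimated CPU time */
	unsigned int missed;	/* periodic updates missed while asleep */
};

/* One decay step in integer arithmetic (assumed formula). */
static unsigned int
decay_once(unsigned int estcpu, unsigned int load)
{
	return ((2 * load * estcpu) / (2 * load + 1));
}

/* Apply one decay step per missed update, then clear the counter. */
static void
sketch_updatepri(struct sleeper *s, unsigned int load)
{
	assert(s != NULL);
	while (s->missed > 0 && s->estcpu > 0) {
		s->estcpu = decay_once(s->estcpu, load);
		s->missed--;
	}
	s->missed = 0;
}
```

.Pp
The loop stops early once the estimate reaches zero, since further decay
steps would have no effect.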
variable really should be per-CPU.
should compare the priority of
with that of each CPU, and then send an IPI to the processor with the lowest
priority to trigger a reschedule if needed.
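.Pp
The selection step of the proposed behaviour could look like the sketch
below.
It is a hypothetical illustration of the idea only.
The function name and the per-CPU priority array are invented, a numerically
larger value is assumed to mean a lower priority, and sending the IPI itself
is left out.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Return the index of the CPU running the lowest-priority work if the
 * candidate thread outranks it, or -1 if no reschedule is needed.
 * Assumption: larger numeric value == lower priority.
 */
int
pick_cpu_to_preempt(const int *cpu_pri, size_t ncpu, int thread_pri)
{
	size_t i, lowest = 0;

	assert(cpu_pri != NULL && ncpu > 0);
	for (i = 1; i < ncpu; i++) {
		if (cpu_pri[i] > cpu_pri[lowest])
			lowest = i;
	}
	return (thread_pri < cpu_pri[lowest] ? (int)lowest : -1);
}
```

.Pp
A caller would send the IPI only when the return value is not -1.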
Priority propagation is broken and is thus disabled by default.
variable is only updated if a process does not obtain a sleep mutex on the
Also, if a process obtains more than one sleep mutex in this manner, and
had its priority bumped in between, then