[PATCH] Staircase cpu scheduler 2.6.8-rc2-mm1
From: Con Kolivas (kernel_at_kolivas.org)
Date: 07/31/04
- Previous message: Hiroshi Miura: "[PATCH][ACPI] Panasonic Hotkey Driver"
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
Date: Sun, 01 Aug 2004 00:20:04 +1000 To: linux kernel mailing list <linux-kernel@vger.kernel.org>
This is a rewrite of the scheduler policy for 2.6, based on the current
O(1) scheduler. It removes a lot of complexity and replaces it with an
obvious and simple design
Aims:
- Simple predictable design.
- Good scalability.
- Interactive by design rather than have complicated interactivity
heuristics
- Maintain fairer cpu distribution with load
- Low scheduling latency for SCHED_NORMAL tasks.
- Low overhead.
- Making renicing processes actually matter for CPU distribution
(nice 0 gets 19 times what nice +19 gets)
- Easily configured for different workloads (compute server,
non-interactive)
Summary of design:
A descending foreground-background single runqueue per cpu priority
array
Details:
This patch takes advantage of the existing infrastructure but has no
expired array. Real time tasks are treated simiilary as the current
fixed priority & timeslice system. The details are in the management
of normal policy (SCHED_NORMAL) tasks.
Normal policy tasks have a dynamic priority that drops by one every
RR_INTERVAL which equals 10ms. Once they drop to lowest priority they
are requeued at their top dynamic priority again.
Every time a task sleeps it gains "burst" which allows them to stay at
their highest priority for an extra interval per burst before dropping
priority.
The sub-jiffy case is handled specially to prevent it fooling this
system, by continuing their current timeslice next wakeup. Tasks that
have just forked are also treated the same way.
Yield()ing is implemented by putting tasks to the end of the lowest
priority queue, but once scheduled the task wakes up to it's normal
first wake up priority.
Use a fixed font to see clearly:
ie
RR_INTERVALs nice 0 task
20<-------------->40 PRI
11111111111111111111
21111111111111111110
31111111111111111100
and so on
nice +10 task
30<---->40 PRI
1111111111
2111111110
3111111100
and so on
This is how cpu distribution is kept proportional while optimising
latency for interactive tasks (which wake frequently).
Features:
This scheduler has coded in 2 simple sysctl configuration options.
/proc/sys/kernel/interactive
is a sysctl which when set to 0 disables the burst mechanism entirely
making all tasks descend the staircase in a more rigid manner and thus
maintaining much stricter cpu distribution (ie suitable to multi-user
setups, servers and non gui machines.) Since the design remains
foreground-background descending interactivity is maintained but
response becomes slower/proportional to load.
/proc/sys/kernel/compute
is a sysctl which puts the cpu scheduler in "computational" mode which
is designed for optimal cpu throughput at the expense of latency. It
enlarges timeslices to 10 times larger than default and introduces
delayed preemption of normal tasks to optimise cpu cache utilisation.
This mode effectively sets interactive to 0 as well.
With thanks to William Lee Irwin III and Zwane Mwaikambo for patches and
debugging and Peter Williams for useful suggestions.
Patch against 2.6.8-rc2-mm1
Signed-off-by: Con Kolivas <kernel@kolivas.org>
fs/proc/array.c | 4
include/linux/sched.h | 15
include/linux/sysctl.h | 2
init/main.c | 1
kernel/exit.c | 1
kernel/sched.c | 865
++++++++++++++-----------------------------------
kernel/sysctl.c | 16
7 files changed, 283 insertions(+), 621 deletions(-)
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
- text/x-fortran attachment: from_2.6.8-rc2-mm1_to_staircase7.F
- application/pgp-signature attachment: OpenPGP digital signature
- Previous message: Hiroshi Miura: "[PATCH][ACPI] Panasonic Hotkey Driver"
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
Relevant Pages
- [PATCH] 2.6.0-test4 scheduler policy
... Please find attached my CPU scheduler changes against test4. ... bad priority
distribution (eg. may cause ... * Minimum timeslice is 10 msecs, default timeslice
is 100 msecs, ... * This part scales the interactivity limit depending on niceness. ...
(Linux-Kernel) - Re: Circular Convolution scheduler
... values for how fast to scale up the priority ... on a process that has matched
the heuristics, ... scheduler when we are requeued. ... interactivity
heuristic). ... (Linux-Kernel) - [CFT][PATCH] new scheduler policy
... remember the scores of scheduler "exploits" thrown around. ... * they expire.
... * This part scales the interactivity limit depending on niceness. ... * priority
range a task can explore, a value of '1' means the ... (Linux-Kernel) - [PATCH] Staircase Scheduler v6.3 for 2.6.7-rc2
... original design require a detailed description. ... - Interactive by design
rather than have interactivity bolted on. ... - Maintain appropriate cpu distribution
and fairness. ... - All tasks start at a dynamic priority based on their nice value. ...
(Linux-Kernel) - [RFC][PATCH] O(1) Entitlement Based Scheduler
... This patch is a modification of the Oscheduler that introduces ... _entitlement_
to CPU resources that is determined by the number of _shares_ ... This patch provides both
soft and hard CPU usage rate caps per ... one getting the most can be given a better priority,
... (Linux-Kernel)