CPU scheduling

I think so. And my reasoning is this: you can do it by stacking schedulers. Imagine a hard real time scheduler that handles the first two task types, rejecting requests to add any new task whose deadlines it is not 100% confident it can always meet. This scheduler will have an idle task that it runs when there are no other tasks to perform - but if that idle task, rather than busy-waiting or halting the CPU, is in fact a soft real time scheduler that does its best, guided by priority, to schedule tasks within their deadlines, then you now have a system that supports both types of task, with hard real time taking absolute priority over soft real time. And if the soft scheduler in turn has, instead of an idle task, a prioritised best-effort scheduler like a UNIX system's, then it can support the non real time event handlers too. And the prioritised best-effort scheduler can have a lottery scheduler with two priority levels as its idle task, to support bulk tasks and idle tasks.
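To make the stacking concrete, here is a minimal sketch in C (all names and structures invented for illustration) in which each scheduler layer's idle task is simply the next layer down: a layer that has nothing runnable falls through to the scheduler beneath it, so hard real time work always wins.

```c
/* A minimal sketch (names hypothetical) of stacking schedulers: each layer
 * either returns one of its own runnable tasks, or falls through to the
 * next layer down, which plays the role of its "idle task". */
#include <stdio.h>
#include <stddef.h>

typedef struct task { const char *name; } task;

typedef struct scheduler {
    /* Returns a runnable task, or NULL if this layer has nothing to do. */
    task *(*pick)(struct scheduler *self);
    struct scheduler *idle;   /* the next layer down, run when we are idle */
    task *ready;              /* toy single-slot ready "queue" */
} scheduler;

static task *pick_ready(scheduler *s) { return s->ready; }

/* Ask a layer for work; if it is idle, delegate to the layer below it. */
static task *schedule(scheduler *top) {
    for (scheduler *s = top; s != NULL; s = s->idle) {
        task *t = s->pick(s);
        if (t) return t;
    }
    return NULL; /* the whole stack is idle: halt the CPU */
}

int main(void) {
    task audio = { "hard real time audio handler" };
    scheduler idle_layer  = { pick_ready, NULL,         NULL };
    scheduler best_effort = { pick_ready, &idle_layer,  NULL };
    scheduler soft_rt     = { pick_ready, &best_effort, NULL };
    scheduler hard_rt     = { pick_ready, &soft_rt,     &audio };

    task *t = schedule(&hard_rt);
    printf("running: %s\n", t ? t->name : "(halt)");
    return 0;
}
```

In a real implementation each layer would of course pre-empt the layers below it rather than politely waiting to be asked, but the fall-through structure is the point of the sketch.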

Although one can implement such a hybrid scheduler by literally nesting totally independent schedulers within idle tasks - which shows that (assuming that, like a true idle task, each nested scheduler is always interruptible when needed) they can coexist in a single system without interfering - there will probably be efficiencies to be gained by having a single compound scheduler that understands the different task types directly. However, I suspect that those efficiencies may not be worth the danger of a bug then slipping in and allowing (in the worst case) idle tasks to interfere with the hard real time scheduler - especially if all tasks were stored in the same linked list, perhaps requiring the hard real time scheduler to scan past every task in the system to locate the ones of interest to it; in which case, problems could easily arise.

Now, with the hard real time tasks, the scheduler variables are all well defined. The maximum event rate, or the periodicity of a poller, are specified for us by the problem at hand, along with the deadlines. Working out how much CPU time will be needed by the handler in the worst case, however, is a bit thornier. For a start, it depends on the fine details of the node's hardware, something which the user code shouldn't have to worry about.

There are two possible approaches: measuring the response time of the handler in a test environment with worst-case inputs specified by the programmer, as part of the installation of the task, or restricting the language used for these handlers to one in which the maximum possible running time can be deduced from knowledge of the maximum run times of the primitives. The advantage of the former is that you can have a Turing-complete language; the advantage of the latter is that you don't need to specify worst case inputs at all, and so can't court disaster by failing to identify them correctly. Perhaps both options need to be offered. Experimentation is needed.
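As a rough illustration of the second approach, a handler written in a restricted, non-Turing-complete form - here just a straight line of primitive operations - lets the installer derive a worst case bound purely from a table of per-primitive maxima. The primitive set and timings below are invented for illustration, not anything ARGON actually specifies:

```c
/* A toy sketch of deriving a worst-case running time from per-primitive
 * maxima, for a handler restricted to straight-line code. The primitives
 * and their costs are hypothetical. */
#include <stdio.h>

typedef enum { OP_LOAD, OP_STORE, OP_MUL, OP_DIV, OP_SEND } op;

/* Architecture-supplied worst-case cost of each primitive, in microseconds. */
static const double max_cost_us[] = { 0.1, 0.1, 0.5, 2.0, 5.0 };

/* Worst case for a straight-line handler is just the sum of its primitives;
 * bounded loops would multiply by their maximum iteration count. */
static double worst_case_us(const op *prog, int len) {
    double total = 0.0;
    for (int i = 0; i < len; i++)
        total += max_cost_us[prog[i]];
    return total;
}

int main(void) {
    op handler[] = { OP_LOAD, OP_MUL, OP_MUL, OP_STORE, OP_SEND };
    printf("worst case: %.2f us\n", worst_case_us(handler, 5));
    return 0;
}
```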

Obviously, the programmer needs to be able to specify several candidate worst case inputs: of two sets of inputs that produce an identical path through the code except that one does fifteen multiplies while the other does ten divides, either may be the worst case, depending on the exact floating point hardware available.

However, on architectures that haven't been designed with real time in mind, it is hard to set up a worst case situation with system caches, prefetch queues, DRAM refresh cycle eaters and other such hidden things that may affect the running time. As such, each node would need an architecture-dependent 'worst case fudge factor' to apply to the worst case timing - and you'd have to assume that the measured timing was possibly a best case scenario, with the caches in a perfect state to begin with, and make the fudge factor quite large. With all these heavily overestimated worst case timings, the hard real time scheduler won't be able to confidently run all that many tasks on these architectures.
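Here is a hedged sketch of what admission might then look like: the measured worst case time is inflated by the architecture's fudge factor, and a new task is refused unless the total worst case utilisation of the node stays within a bound. The utilisation-sum test and the example fudge value of 3 are illustrative assumptions, not a worked-out admission algorithm.

```c
/* A sketch of hard real time admission control: measured worst-case handler
 * time is inflated by an architecture-dependent fudge factor, and a new
 * task is rejected unless total worst-case utilisation stays within bounds.
 * The utilisation bound of 1.0 and the fudge value are assumptions. */
#include <stdio.h>
#include <stdbool.h>

#define MAX_TASKS 16

typedef struct {
    double period_s;          /* maximum event rate expressed as a period */
    double measured_wcet_s;   /* worst case execution time measured in test */
} hard_task;

static double fudge_factor = 3.0;  /* per-architecture, deliberately pessimistic */
static hard_task tasks[MAX_TASKS];
static int ntasks = 0;

/* Refuse the task unless we are sure the whole set remains schedulable. */
static bool admit(hard_task t) {
    double util = (t.measured_wcet_s * fudge_factor) / t.period_s;
    for (int i = 0; i < ntasks; i++)
        util += (tasks[i].measured_wcet_s * fudge_factor) / tasks[i].period_s;
    if (ntasks == MAX_TASKS || util > 1.0)
        return false;              /* not 100% confident: reject the task */
    tasks[ntasks++] = t;
    return true;
}

int main(void) {
    hard_task a = { 0.010, 0.001 };   /* 1ms measured, every 10ms */
    hard_task b = { 0.005, 0.002 };   /* 2ms measured, every 5ms  */
    printf("admit a: %s\n", admit(a) ? "yes" : "no");
    printf("admit b: %s\n", admit(b) ? "yes" : "no"); /* 3x fudge pushes this over */
    return 0;
}
```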

It may be that on many architectures, hard real time just isn't supported at all. That's not really a problem; hard real time is only needed in specialist situations, and a compound scheduler made by stacking schedulers can easily be trimmed by removing a layer.

The soft real time scheduler still needs an estimate of handler CPU time requirements, however. But without the requirement to only accept tasks it can be sure of scheduling, it can use "incorrect CPU time estimates" as a valid excuse for failing slightly. Still, the dilemma remains: do we try a test run of the handler with worst case inputs, or require the handler to be statically analysable? Perhaps we can add a third option: initially assume that a task has potentially unbounded CPU time requirements, but measure them over its first few runs to get an estimate of the maximum CPU requirement.
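That third option could be as simple as the following sketch (the calibration run count and field names are invented): treat the requirement as unbounded until the first few runs have been timed, then use the largest observed run as the working estimate.

```c
/* A sketch of measuring a soft real time task's CPU requirement over its
 * first few runs; until calibration finishes, the requirement is treated
 * as unbounded. The sample count of 5 is an arbitrary assumption. */
#include <stdio.h>

#define CALIBRATION_RUNS 5

typedef struct {
    double max_seen_s;   /* longest run observed so far */
    int runs;            /* how many runs have been measured */
} cpu_estimate;

/* Record one measured run of the handler. */
static void record_run(cpu_estimate *e, double seconds) {
    if (seconds > e->max_seen_s)
        e->max_seen_s = seconds;
    e->runs++;
}

/* Returns a negative value while still calibrating, meaning "assume unbounded". */
static double estimated_requirement(const cpu_estimate *e) {
    return (e->runs < CALIBRATION_RUNS) ? -1.0 : e->max_seen_s;
}

int main(void) {
    cpu_estimate e = { 0.0, 0 };
    double samples[] = { 0.004, 0.006, 0.005, 0.009, 0.005 };
    for (int i = 0; i < 5; i++) {
        record_run(&e, samples[i]);
        double est = estimated_requirement(&e);
        if (est < 0)
            printf("run %d: still calibrating\n", i + 1);
        else
            printf("run %d: estimate %.3fs\n", i + 1, est);
    }
    return 0;
}
```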

However, how do we choose soft real time priorities? Within a given application, the programmer will be able to assign relative priorities to that application's tasks, but what happens when two soft real time applications are running on the same node?

After much thinking, I've been unable to come up with a mechanism for correctly assigning relative priorities between tasks in different applications without just coming up with a bodge. So I've decided that the administrator who is installing real time applications on a node will have to decide! Each application will need to assign names and descriptions to its tasks, as well as specifying its own constraints on the priorities of its tasks - basically, a list of rules stating that a certain task must have higher priority than a certain other task.

The administrator then gets to see a list of all soft real time tasks on the node, and can move them up and down the list, except when the constraints on tasks within an application prevent it. When you can't find a decent algorithm to do something, make the administrator do it!
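The constraint check that administrator interface would need is straightforward; here is a toy sketch (task names and rule format invented) that accepts a proposed ordering only if no application's "X must outrank Y" rule is violated:

```c
/* A sketch of validating the administrator's proposed priority ordering
 * against each application's "must have higher priority than" rules.
 * All names here are illustrative. */
#include <stdio.h>
#include <string.h>
#include <stdbool.h>

typedef struct { const char *higher, *lower; } rule;

/* Position of a task name in the ordering (index 0 = highest priority). */
static int position(const char **order, int n, const char *name) {
    for (int i = 0; i < n; i++)
        if (strcmp(order[i], name) == 0) return i;
    return -1;
}

static bool ordering_ok(const char **order, int n, const rule *rules, int nrules) {
    for (int i = 0; i < nrules; i++) {
        int hi = position(order, n, rules[i].higher);
        int lo = position(order, n, rules[i].lower);
        if (hi < 0 || lo < 0 || hi > lo)
            return false;   /* the constrained task has been moved below */
    }
    return true;
}

int main(void) {
    /* One application's constraint: its "mixer" must outrank its "logger". */
    rule rules[] = { { "audio/mixer", "audio/logger" } };
    const char *good[] = { "video/decoder", "audio/mixer", "audio/logger" };
    const char *bad[]  = { "audio/logger", "video/decoder", "audio/mixer" };
    printf("good ordering ok: %s\n", ordering_ok(good, 3, rules, 1) ? "yes" : "no");
    printf("bad ordering ok:  %s\n", ordering_ok(bad, 3, rules, 1) ? "yes" : "no");
    return 0;
}
```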

However, best effort event handlers are slightly different. For a start, there isn't a static list of them associated with the ARGON node in the configuration database; they pop into existence whenever a MERCURY call comes in. So how do we assign priorities, and weightings in case they become bulk tasks? I've decided that the cluster configuration database needs to have a list of groups of entities (using the same mechanism as for specifying groups in access control lists), along with the priority and weighting to assign to the handlers of requests from that group. When there are several requests from the same group being handled by a node, the weighting gets shared between them - the weighting is not for each task, but for all tasks owned by that group. Perhaps there should be an option in the group priorities table to choose whether the weighting is shared between the whole group, or is shared between tasks originating from each entity in the group, or is actually assigned to each individual task?
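Concretely, such a sharing option might reduce each handler task's effective weighting like this - the mode names and the simple division scheme are assumptions for illustration, not anything in the configuration database:

```c
/* A sketch of the weighting-sharing options: given a group's weighting and
 * a sharing mode, compute one handler task's effective weighting. Mode
 * names and the division scheme are hypothetical. */
#include <stdio.h>

typedef enum {
    SHARE_WHOLE_GROUP,   /* weighting split across every task from the group  */
    SHARE_PER_ENTITY,    /* weighting split across tasks from the same entity */
    PER_TASK             /* each task gets the full weighting                 */
} share_mode;

static double effective_weight(double group_weight, share_mode mode,
                               int tasks_in_group, int tasks_from_entity) {
    switch (mode) {
    case SHARE_WHOLE_GROUP: return group_weight / tasks_in_group;
    case SHARE_PER_ENTITY:  return group_weight / tasks_from_entity;
    default:                return group_weight;
    }
}

int main(void) {
    /* "everyone else": weighting 10 shared by a group currently running 5
       tasks, 2 of which come from the same requesting entity. */
    printf("whole group: %.2f\n", effective_weight(10, SHARE_WHOLE_GROUP, 5, 2));
    printf("per entity:  %.2f\n", effective_weight(10, SHARE_PER_ENTITY, 5, 2));
    printf("per task:    %.2f\n", effective_weight(10, PER_TASK, 5, 2));
    return 0;
}
```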

With such a group table in place, one can assign the highest priority and a significant weighting to administrative user agent entities, so administrators can always get good interactive response from a node in order to be able to rescue it if it is being overloaded by something. Then good priorities and individual per-entity weightings can be given to the primary users of the cluster and, finally, a minimal priority and a single shared weighting assigned to "everyone else".

Finally, idle tasks will be specified in the configuration databases for the node and for the cluster itself; in that entry, the weighting can be specified by the administrator responsible.

There... can anyone think of any situations that such a combined scheduler won't be able to handle fairly?
