It is possible to use sched_setaffinity to pin a thread to a CPU, which can improve performance in some situations.
From the Linux man page:
Restricting a process to run on a single CPU also avoids the
performance cost caused by the cache invalidation that occurs when a
process ceases to execute on one CPU and then recommences execution on
a different CPU
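As a minimal sketch of the pinning described above, Python's os module exposes the same system call on Linux. The helper name pin_to_cpu is hypothetical; only os.sched_getaffinity and os.sched_setaffinity come from the standard library.

```python
import os

def pin_to_cpu(cpu, pid=0):
    """Restrict `pid` (0 means the calling thread) to one CPU; return the new mask."""
    allowed = os.sched_getaffinity(pid)
    if cpu not in allowed:
        # The kernel would reject a CPU outside the allowed set (e.g. outside our cpuset).
        raise ValueError(f"CPU {cpu} is not in the allowed set {sorted(allowed)}")
    os.sched_setaffinity(pid, {cpu})
    return os.sched_getaffinity(pid)
```

Calling pin_to_cpu(2) from a worker thread would keep that thread on CPU 2, provided CPU 2 is in the set the process is allowed to use.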
Further, if I desire a more real-time response, I can change the scheduler policy for that thread to SCHED_FIFO, and up the priority to some high value (up to sched_get_priority_max), meaning the thread in question should always pre-empt any other thread running on its cpu when it becomes ready.
However, at this point, the thread that the real-time thread just pre-empted on that cpu will possibly have evicted many of the real-time thread’s level-1 cache entries.
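The policy change I have in mind looks roughly like the sketch below. The helper names and the headroom parameter are my own assumptions for illustration (leaving a gap below the maximum is not a recommendation taken from any documentation); the os.sched_* calls are the standard Linux-only ones.

```python
import os

def fifo_priority(headroom=1):
    """Return a SCHED_FIFO priority `headroom` levels below the maximum, clamped to the valid range."""
    lowest = os.sched_get_priority_min(os.SCHED_FIFO)
    highest = os.sched_get_priority_max(os.SCHED_FIFO)
    return max(lowest, min(highest, highest - headroom))

def make_realtime(pid=0, headroom=1):
    """Try to switch `pid` (0 means the caller) to SCHED_FIFO; return False if not permitted."""
    param = os.sched_param(fifo_priority(headroom))
    try:
        os.sched_setscheduler(pid, os.SCHED_FIFO, param)
    except PermissionError:
        # Typically requires root, CAP_SYS_NICE, or a suitable RLIMIT_RTPRIO.
        return False
    return True
```

make_realtime() reports whether the switch succeeded, so an unprivileged run falls back to the default policy instead of crashing.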
My questions are as follows:
- Is it possible to prevent the scheduler from scheduling any threads onto a given cpu? (eg: either hide the cpu completely from the scheduler, or some other way)
- Are there some threads which absolutely have to be able to run on that cpu? (eg: kernel threads / interrupt threads)
- If I need to have kernel threads running on that cpu, what is a reasonable maximum priority value to use such that I don’t starve out the kernel threads?
The answer is to use cpusets. The Python cpuset utility (cset) makes it easy to configure them.
Basic concepts
3 cpusets
- root: present in all configurations and contains all cpus (unshielded)
- system: contains cpus used for system tasks – the ones which need to run but aren’t “important” (unshielded)
- user: contains cpus used for “important” tasks – the ones we want to run in “realtime” mode (shielded)
The shield command manages these 3 cpusets.
During setup it moves all movable tasks into the unshielded cpuset (system) and during teardown it moves all movable tasks into the root cpuset.
After setup, the subcommand lets you move tasks into the shield (user) cpuset, and additionally, to move special tasks (kernel threads) from root to system (and therefore out of the user cpuset).
Commands:
First we create a shield. Naturally the layout of the shield will be machine/task dependent. For example, say we have a 4-core non-NUMA machine: we want to dedicate 3 cores to the shield, and leave 1 core for unimportant tasks; since it is non-NUMA we don’t need to specify any memory node parameters, and we leave the kernel threads running in the root cpuset (ie: across all cpus)
Some kernel threads (those which aren’t bound to specific cpus) can be moved into the system cpuset. (In general it is not a good idea to move kernel threads which have been bound to a specific cpu)
Now let’s list what’s running in the shield (user) or unshielded (system) cpusets: (-v for verbose, which will list the process names) (add a 2nd -v to display more than 80 characters)
If we want to stop the shield (teardown)
Now let’s execute a process in the shield (commands following '--' are passed to the command to be executed, not to cset)
If we already have a running process which we want to move into the shield (note we can move multiple processes by passing a comma separated list, or ranges (any process in the range will be moved, even if there are gaps))
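The shield steps above can be sketched as a small Python helper that only builds and prints the command lines. The flags (--cpu, --kthread, --shield, --unshield, --reset, --exec, --pid) are the ones I understand cset shield to accept; check cset shield --help on your system. The command name mycommand and the PIDs are placeholders.

```python
import shlex

def shield_commands(cpus="1-3", command=("mycommand", "-arg1"), pids="1234,1236-1240"):
    """Build (but do not run) cset shield invocations for each step described above."""
    return {
        "create shield":   ["cset", "shield", "--cpu", cpus],
        "move kthreads":   ["cset", "shield", "--kthread", "on"],
        "list shielded":   ["cset", "shield", "--shield", "-v"],
        "list unshielded": ["cset", "shield", "--unshield", "-v", "-v"],
        "teardown":        ["cset", "shield", "--reset"],
        # Everything after "--" goes to the executed command, not to cset.
        "exec in shield":  ["cset", "shield", "--exec", command[0], "--", *command[1:]],
        "move pids":       ["cset", "shield", "--shield", "--pid", pids],
    }

for step, argv in shield_commands().items():
    print(f"{step:16} {shlex.join(argv)}")
```

To actually apply a step, pass its argv list to subprocess.run(argv, check=True) as root.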
Advanced concepts
cset set/proc – these give you finer control of cpusets
Set
Create, adjust, rename, move and destroy cpusets
Commands
Create a cpuset, using cpus 1-3, use NUMA node 1 and call it “my_cpuset1”
Change “my_cpuset1” to only use cpus 1 and 3
Destroy a cpuset
Rename an existing cpuset
Create a hierarchical cpuset
List existing cpusets (depth of level 1)
List existing cpuset and its children
List all existing cpusets
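A sketch of what the set operations listed above might look like, again only building the command lines. The flags are the ones I believe cset set documents (--cpu, --mem, --set, --newname, --destroy, --list, --recurse); the names your_cpuset1 and my_subset1 are hypothetical.

```python
import shlex

def cset_set(*options):
    """Build (but do not run) a cset set command line as an argv list."""
    return ["cset", "set", *options]

SET_COMMANDS = [
    ("create",                cset_set("--cpu=1-3", "--mem=1", "--set=my_cpuset1")),
    ("use cpus 1 and 3 only", cset_set("--cpu=1,3", "--mem=1", "--set=my_cpuset1")),
    ("destroy",               cset_set("--destroy", "--set=my_cpuset1")),
    ("rename",                cset_set("--set=my_cpuset1", "--newname=your_cpuset1")),
    ("create child set",      cset_set("--cpu=3", "--mem=1", "--set=my_cpuset1/my_subset1")),
    ("list top level",        cset_set("--list")),
    ("list set and children", cset_set("--list", "--set=my_cpuset1")),
    ("list everything",       cset_set("--list", "--recurse")),
]

for label, argv in SET_COMMANDS:
    print(f"{label:22} {shlex.join(argv)}")
```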
Proc
Manage threads and processes
Commands
List tasks running in a cpuset
Execute a task in a cpuset
Moving a task
Moving a task and all its siblings
Move all tasks from one cpuset to another
Move unpinned kernel threads into a cpuset
Forcibly move kernel threads (including those that are pinned to a specific cpu) into a cpuset (note: this may have dire consequences for the system – make sure you know what you’re doing)
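The proc operations above might be sketched like this. As before, nothing is executed; the flags (--list, --exec, --move, --pid, --threads, --fromset, --toset, --kthread, --force) are the ones I understand cset proc to accept, and myApp, my_cpuset2 and the PIDs are placeholders.

```python
import shlex

def cset_proc(*options):
    """Build (but do not run) a cset proc command line as an argv list."""
    return ["cset", "proc", *options]

PROC_COMMANDS = {
    "list tasks":          cset_proc("--list", "--set=my_cpuset1", "--verbose"),
    "execute in set":      cset_proc("--set=my_cpuset1", "--exec", "myApp", "--", "--arg1"),
    "move a task":         cset_proc("--move", "--toset=my_cpuset1", "--pid=1234"),
    "move with siblings":  cset_proc("--move", "--toset=my_cpuset1", "--pid=1234", "--threads"),
    "move all tasks":      cset_proc("--move", "--fromset=my_cpuset1", "--toset=my_cpuset2"),
    "move unpinned kthreads": cset_proc("--kthread", "--fromset=root", "--toset=my_cpuset1"),
    # Dangerous: also moves kernel threads pinned to a specific cpu.
    "force kthreads":      cset_proc("--kthread", "--fromset=root", "--toset=my_cpuset1", "--force"),
}

for label, argv in PROC_COMMANDS.items():
    print(f"{label:24} {shlex.join(argv)}")
```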
Hierarchy example
We can use hierarchical cpusets to create prioritised groupings
- system cpuset with 1 cpu (0)
- prio_low cpuset with 1 cpu (1)
- prio_met cpuset with 2 cpus (1-2)
- prio_high cpuset with 3 cpus (1-3)
- prio_all cpuset with all 4 cpus (0-3) (note this is the same as root; it is considered good practice to keep a separation from root)
To achieve the above you create prio_all, and then create subset prio_high under prio_all, etc.
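A rough sketch of how the nested sets could be generated, each one created under the previous. It assumes cset set accepts a slash-separated path in --set (with a leading "/" for a path from the root); verify this against your cset version before relying on it.

```python
import shlex

# (name, cpus) from the outermost set to the innermost, as listed above.
LEVELS = [("prio_all", "0-3"), ("prio_high", "1-3"), ("prio_met", "1-2"), ("prio_low", "1")]

def hierarchy_commands(levels, system_cpus="0"):
    """Build (but do not run) the cset set commands for the system set and the nested sets."""
    commands = [["cset", "set", f"--cpu={system_cpus}", "--set=system"]]
    path = ""
    for name, cpus in levels:
        path = f"{path}/{name}"  # each set is created as a child of the previous one
        commands.append(["cset", "set", f"--cpu={cpus}", f"--set={path}"])
    return commands

for argv in hierarchy_commands(LEVELS):
    print(shlex.join(argv))
```

Parents come first in the output because a child cpuset can only use cpus that its parent already has.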