I’m trying to write a daemon that will start as root using a setuid bit, but then quickly revert to the user running the process. The daemon, however, needs to retain the ability to set new threads to “realtime” priority. The code that I’m using to set the priority is as follows (it runs in a thread once the thread is created):
struct sched_param sched_param;
memset(&sched_param, 0, sizeof(sched_param));
sched_param.sched_priority = 90;
if(-1 == sched_setscheduler(0, SCHED_FIFO, &sched_param)) {
// If we get here, we have an error, for example "Operation not permitted"
}
However the part I’m having problems with is setting the uid, while retaining the ability to make the above call to sched_setscheduler.
I have some code that runs close to startup in the main thread of my application:
if (getgid() != getegid() || getuid() != geteuid()) {
cap_value_t cap_values[] = {CAP_SYS_NICE};
cap_t caps;
caps = cap_get_proc();
cap_set_flag(caps, CAP_PERMITTED, 1, cap_values, CAP_SET);
cap_set_proc(caps);
prctl(PR_SET_KEEPCAPS, 1, 0, 0, 0);
cap_free(caps);
setegid(getgid());
seteuid(getuid());
}
The problem is that after running this code, I get “Operation not permitted” when calling sched_setscheduler as alluded to in the comment above. What am I doing wrong?
Edited to describe the reason for the original failure:
There are three sets of capabilities in Linux: inheritable, permitted, and effective. Inheritable defines which capabilities stay permitted across an exec(). Permitted defines which capabilities are permitted for a process. Effective defines which capabilities are currently in effect.
When changing the owner or group of a process from root to non-root, the effective capability set is always cleared. By default, the permitted capability set is also cleared, but calling prctl(PR_SET_KEEPCAPS, 1L) before the identity change tells the kernel to keep the permitted set intact.
After the process has changed the identity back to the unprivileged user, CAP_SYS_NICE must be added to the effective set. (It must also be set in the permitted set, so if you clear your capability set, remember to set it there too. If you just modify the current capability set, then you know it is already set because you inherited it.)
Here is the procedure I recommend you follow:
Save real user ID, real group ID, and supplemental group IDs:
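A minimal sketch of this step, assuming a hypothetical `struct saved_identity` and helper name `save_identity()` (neither is from the original code). It reserves one spare slot in the group array so the real group ID can be appended later if the list turns out empty:

```c
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical container for the identity the daemon returns to. */
struct saved_identity {
    uid_t  uid;      /* real user ID */
    gid_t  gid;      /* real group ID */
    gid_t *groups;   /* supplementary group IDs (malloc'd, ngroups + 1 slots) */
    int    ngroups;  /* number of valid entries in groups */
};

/* Returns 0 on success, -1 on failure (errno set). */
int save_identity(struct saved_identity *id)
{
    id->uid = getuid();
    id->gid = getgid();
    id->groups = NULL;
    id->ngroups = 0;

    /* A size of 0 asks getgroups() only for the number of groups. */
    int n = getgroups(0, NULL);
    if (n < 0)
        return -1;

    /* One extra slot so the real GID can be appended later. */
    id->groups = malloc((size_t)(n + 1) * sizeof(gid_t));
    if (id->groups == NULL)
        return -1;

    n = getgroups(n, id->groups);
    if (n < 0) {
        free(id->groups);
        id->groups = NULL;
        return -1;
    }
    id->ngroups = n;
    return 0;
}
```

The caller owns `id->groups` and should `free()` it once the supplementary groups have been set.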
Filter out unnecessary and privileged supplementary groups (be paranoid!)
Because you cannot “clear” the supplementary group IDs (that just requests the current number), make sure the list is never empty. You can always add the real group ID to the supplementary list to make it non-empty.
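The filtering could be sketched like this. The policy in `group_is_unwanted()` is only a placeholder assumption (it rejects GID 0); a real daemon would plug in its own list of privileged groups. Both function names are hypothetical:

```c
#include <sys/types.h>

/* Placeholder policy (an assumption): only root's group, GID 0, is rejected.
 * Replace with whatever groups count as privileged on your system. */
int group_is_unwanted(gid_t g)
{
    return g == 0;
}

/* Filters groups[0..n-1] in place and returns the new count.
 * The array must have room for n + 1 entries, because the real GID
 * is stored when the filtered list would otherwise be empty. */
int filter_groups(gid_t *groups, int n, gid_t real_gid)
{
    int kept = 0;

    for (int i = 0; i < n; i++) {
        if (!group_is_unwanted(groups[i]))
            groups[kept++] = groups[i];
    }

    /* Never hand an empty list to setgroups(). */
    if (kept == 0)
        groups[kept++] = real_gid;

    return kept;
}
```

Note that the fallback stores the real group ID unconditionally, so if the real group itself is one you consider privileged, that case needs separate handling.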
Switch real and effective user IDs to root
Set the CAP_SYS_NICE capability in the CAP_PERMITTED set. I prefer to clear the entire set, and only keep the four capabilities that are required for this approach to work (and later on, drop all but CAP_SYS_NICE):
Tell the kernel you wish to retain the capabilities over the change from root to the unprivileged user; by default, the capabilities are cleared to zero when changing from root to non-root identity
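As a rough sketch, the flag can be set and then read back with `PR_GET_KEEPCAPS` to confirm the kernel accepted it. The helper name `set_keepcaps()` is hypothetical:

```c
#include <sys/prctl.h>

/* Turns the "keep capabilities" flag of the calling thread on (nonzero)
 * or off (0). Returns 0 on success, -1 on failure. */
int set_keepcaps(int on)
{
    long want = on ? 1L : 0L;

    if (prctl(PR_SET_KEEPCAPS, want, 0L, 0L, 0L) == -1)
        return -1;

    /* Read the flag back to confirm the kernel accepted the change. */
    if (prctl(PR_GET_KEEPCAPS, 0L, 0L, 0L, 0L) != (int)want)
        return -1;

    return 0;
}
```

Call `set_keepcaps(1)` before the identity change. The same helper with `0` covers the final step of the procedure, where the flag is turned off again.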
Set real, effective, and saved group IDs to the initially saved group ID
Set supplemental group IDs
Set real, effective and saved user IDs to the initially saved user ID
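The three identity steps above, in order, might look like the sketch below. The answer later mentions setresuid() and setresgid(); this sketch substitutes the POSIX setregid()/setreuid() pair, which on Linux also resets the saved ID whenever the real ID is set, so it compiles without _GNU_SOURCE. That substitution, and the name `switch_identity()`, are assumptions of the sketch:

```c
#include <errno.h>
#include <grp.h>
#include <sys/types.h>
#include <unistd.h>

/* Switches group IDs, supplementary groups, then user IDs.
 * Returns 0 on success, -1 on failure (errno set). */
int switch_identity(uid_t uid, gid_t gid, const gid_t *groups, int ngroups)
{
    /* Groups come first: once the user ID is no longer root,
     * the group calls would be refused. */
    if (setregid(gid, gid) == -1)
        return -1;
    if (setgroups((size_t)ngroups, groups) == -1)
        return -1;

    /* Setting the real UID also sets the saved UID on Linux. */
    if (setreuid(uid, uid) == -1)
        return -1;

    /* Paranoia: regaining root must be impossible from here on. */
    if (uid != 0 && setuid(0) != -1) {
        errno = EPERM;
        return -1;
    }
    return 0;
}
```

Every return value is checked on purpose. A silently failed setreuid() would leave the daemon running as root.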
At this point you effectively drop root privileges (without the ability to gain them back anymore), except for the CAP_SYS_NICE capability. Due to the transition from root to non-root user, the capability is never effective; the kernel will always clear the effective capability set on such a transition.
Set the CAP_SYS_NICE capability in the CAP_PERMITTED and CAP_EFFECTIVE set
Note that the latter two cap_set_flag() operations clear the three capabilities no longer needed, so that only the first one, CAP_SYS_NICE, remains.
At this point the capabilities’ descriptor is no longer needed, so it’s a good idea to free it.
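The step that leaves only CAP_SYS_NICE in the permitted and effective sets is normally written with libcap's cap_set_flag() and cap_set_proc(), as described above, and that remains the preferred interface. Purely so that it builds without libcap-dev, the sketch below uses the raw capset(2) and capget(2) system calls instead. That swap and both function names are assumptions of this sketch, not the original code, and there is no cap_t descriptor to free in this variant:

```c
#include <errno.h>
#include <linux/capability.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/* CAP_SYS_NICE is capability 23, so its bit sits in the first 32-bit word. */
#define SYS_NICE_BIT (1u << CAP_SYS_NICE)

static void init_header(struct __user_cap_header_struct *hdr)
{
    memset(hdr, 0, sizeof *hdr);
    hdr->version = _LINUX_CAPABILITY_VERSION_3;
    hdr->pid = 0;                    /* 0 selects the calling thread */
}

/* Leaves only CAP_SYS_NICE in the permitted and effective sets.
 * Returns 0 on success or a negative errno value (typically -EPERM
 * when CAP_SYS_NICE is not in the current permitted set). */
int keep_only_sys_nice(void)
{
    struct __user_cap_header_struct hdr;
    struct __user_cap_data_struct data[_LINUX_CAPABILITY_U32S_3];

    init_header(&hdr);
    memset(data, 0, sizeof data);    /* all other capabilities are dropped */
    data[0].permitted = SYS_NICE_BIT;
    data[0].effective = SYS_NICE_BIT;

    if (syscall(SYS_capset, &hdr, data) == -1)
        return -errno;
    return 0;
}

/* Returns 1 if CAP_SYS_NICE is effective, 0 if not, negative errno on error. */
int sys_nice_is_effective(void)
{
    struct __user_cap_header_struct hdr;
    struct __user_cap_data_struct data[_LINUX_CAPABILITY_U32S_3];

    init_header(&hdr);
    memset(data, 0, sizeof data);
    if (syscall(SYS_capget, &hdr, data) == -1)
        return -errno;
    return (data[0].effective & SYS_NICE_BIT) ? 1 : 0;
}
```

Calling `sys_nice_is_effective()` right before sched_setscheduler() is a cheap way to tell a capability problem apart from other causes of “Operation not permitted”.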
Tell the kernel you don’t wish to retain the capability over any further changes from root (again, just paranoia)
This works on x86-64 using GCC-4.6.3, libc6-2.15.0ubuntu10.3, and linux-3.5.0-18 kernel on Xubuntu 12.04.1 LTS, after installing the libcap-dev package.
Edited to add:
You can simplify the process by relying only on the effective user ID being root, as the executable is setuid root. In that case, you don’t need to worry about the supplementary groups either, as the setuid root only affects the effective user ID and nothing else. Returning back to the original real user, you technically only need the one setresuid() call at the end of the procedure (and the setresgid() if the executable also happens to be marked setgid root), to set both saved and effective user (and group) IDs to the real user.
However, the case where you regain the original user’s identity is rare, and the case where you gain the identity of a named user is common, and this procedure was originally designed for the latter. You would use initgroups() to gain the correct supplementary groups for the named user, and so on. In that case, taking care of the real, effective, and saved user and group IDs and supplementary group IDs this carefully is important, as otherwise the process would inherit supplementary groups from the user that executed the process.
The procedure here is paranoid, but paranoia is not a bad thing when you are dealing with security-sensitive issues. For the revert-back-to-real-user case, it can be simplified.
Edited on 2013-03-17 to show a simple test program. This assumes it is installed setuid root, but it will drop all privileges and capabilities (except CAP_SYS_NICE, which is required for scheduler manipulation above the normal rules). I pared down the “excess” operations I prefer to do, in the hopes that others find this easier to read.
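A minimal sketch of what a test_priority()-style check could look like. The signature and the restore step are assumptions, not the original test program; it reuses the sched_setscheduler() call from the question and reports the errno value so the caller can print it:

```c
#include <errno.h>
#include <sched.h>
#include <string.h>

/* Tries to switch the calling thread to SCHED_FIFO at the given priority,
 * then puts the previous policy back. Returns 0 on success, or the errno
 * value (for example EPERM) if a call was refused. */
int test_priority(int priority)
{
    struct sched_param old_param, new_param;

    int old_policy = sched_getscheduler(0);
    if (old_policy == -1 || sched_getparam(0, &old_param) == -1)
        return errno;

    memset(&new_param, 0, sizeof new_param);
    new_param.sched_priority = priority;
    if (sched_setscheduler(0, SCHED_FIFO, &new_param) == -1)
        return errno;

    /* Restore the earlier policy so the check has no lasting effect. */
    sched_setscheduler(0, old_policy, &old_param);
    return 0;
}
```

For example, `test_priority(90)` would be called once before and once after the capability setup, printing `strerror()` of any nonzero result.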
Note that if you know the binary is only run on relatively recent Linux kernels, you can rely on file capabilities. Then, your main() needs none of the identity or capability manipulation (you can remove everything in main() except the test_priority() functions), and you just give your binary, say ./testprio, the CAP_SYS_NICE capability using setcap.
You can run getcap on the binary to see which capabilities are granted when it is executed; the output should list cap_sys_nice.
File capabilities seem to be little used thus far. On my own system, gnome-keyring-daemon is the only one with file capabilities (CAP_IPC_LOCK, for locking memory).