I am implementing my own system call in linux. It is calling the rename

Asked: June 13, 2026 (2026-06-13T00:27:23+00:00)

I am implementing my own system call in Linux. It calls the rename system call internally, and passes a user-supplied argument (see the code below) on to rename.

Here is the basic code:

int sys_mycall(const char __user *inputFile)
{
    //
    // Code to generate the "fileName"
    //

    old_fs = get_fs();
    set_fs(KERNEL_DS);

    ans = sys_renameat(AT_FDCWD, fileName, AT_FDCWD, inputFile);

    set_fs(old_fs);

    return ans;
}

I have two questions:

1. I am using old_fs = get_fs();, set_fs(KERNEL_DS); and set_fs(old_fs); to work around an error I got when calling sys_rename directly. I got the idea from this question: allocate user-space memory from kernel … Is this the right workaround?
2. How else can I call a system call from within a system call?

EDIT:

int sys_myfunc(const char __user * inputFileUser)   {


    char inputFile[255];
    int l = 0;
    while(inputFileUser[l] != '\0') l++;

    if(l==0)
        return -10; 

    if(copy_from_user(inputFile,inputFileUser,l+1)< 0 ) return -20;
//
//GENERATE fileName here
//
//

    char fileName[255];
    return  sys_renameat(AT_FDCWD, inputFile, AT_FDCWD, fileName);

}

The code above still returns -1. Why? I copied the data to kernel space.

1 Answer

Editorial Team · Answer 1 · 2026-06-13T00:27:25+00:00

I wanted to show exactly the correct way to achieve what footy wants, but my original answer grew too long, so I decided to put the solution in a separate answer. I’ll split the code into parts and explain what each fragment does.

Remember that since we reuse kernel code, the code in this post and the resulting function must be licensed under the GPLv2 license.

First, we start by declaring a one-parameter syscall.

SYSCALL_DEFINE1(myfunc, const char __user *, oldname)
{

In the kernel, stack space is a scarce resource. You do not create local arrays; you always use dynamic memory management. Fortunately, there are some very useful functions like __getname(), so it is very little additional code. The important thing is to remember to release whatever memory you use when you are done with it.

As this syscall is basically a variant of rename, we reuse almost all of the fs/namei.c:sys_renameat() code. First, the local variable declarations. There are a lot of them; as I said, stack is scarce in the kernel, and you won’t see many more local variables than this in any syscall function:

    struct dentry *old_dir, *new_dir;
    struct dentry *old_dentry, *new_dentry;
    struct dentry *trap;
    struct nameidata oldnd, newnd;
    char *from;
    char *to = __getname();
    int error;

The first change to sys_renameat() is already on the char *to = __getname(); line above. It dynamically allocates a PATH_MAX-byte buffer (and returns NULL if the allocation fails), which must be released using __putname() when it is no longer needed. This is the correct way to obtain a temporary buffer for a file or directory name.

To construct the new path (to), we also need to be able to access the old name (from) directly. Because of the kernel-userspace barrier, we cannot just access oldname directly. So, we create an in-kernel copy of it:

    /* __getname() returns NULL if the allocation fails */
    if (!to)
        return -ENOMEM;

    from = getname(oldname);
    if (IS_ERR(from)) {
        error = PTR_ERR(from);
        goto exit;
    }

Although many C programmers have been taught that goto is evil, this is the exception: error handling. Instead of having to remember all the cleanup we need to do (and we already need to do __putname(to) at minimum), we put the cleanup at the end of the function, and skip to the correct point, exit being the last one. error holds the error number, of course.
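The goto-based cleanup described above can be tried outside the kernel too. Below is a minimal userspace sketch, not the kernel code itself: copy_with_cleanup() is a hypothetical function, and malloc()/free() stand in for __getname()/__putname(). The labels mirror exit and exit0 in this answer, so each failure jumps to the point that releases exactly what has been acquired so far.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical userspace analogue: two buffers are acquired in order
 * and released in reverse order through labels at the end.
 */
int copy_with_cleanup(const char *name, size_t cap)
{
    char *to, *from;
    int error;

    to = malloc(cap);
    if (!to)
        return -ENOMEM;         /* nothing acquired yet */

    from = malloc(cap);
    error = -ENOMEM;
    if (!from)
        goto exit;              /* only 'to' is held */

    error = -ENAMETOOLONG;
    if (strlen(name) >= cap)
        goto exit0;             /* both 'to' and 'from' are held */

    strcpy(from, name);         /* fits: length was checked above */
    strcpy(to, from);
    error = 0;

exit0:
    free(from);
exit:
    free(to);
    return error;
}
```

The success path falls through the same labels, so there is only one place where each resource is released.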

At this point in our function, from is ordinary kernel-side data, and is accessed in the same fashion you would in any C code: from[0] up to the terminating '\0'. getname() fails with -ENAMETOOLONG rather than return an unterminated string, so the terminator is always within the first PATH_MAX bytes.

You have also reserved the memory for the new name: to[0] up to and including to[PATH_MAX-1]. Remember to make sure it too is terminated with '\0' (at to[PATH_MAX-1] or an earlier index).
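How to is filled in depends on the naming rule, which the question does not show. As a sketch only, assume the new name is the old name with ".bak" appended; both that rule and the function name build_new_name() are made up for illustration. snprintf() never writes past the given size and always terminates the result when the size is nonzero, so the only thing left to check is truncation. It is written here as plain userspace C so it can be compiled and tried on its own.

```c
#include <errno.h>
#include <stdio.h>

/*
 * Hypothetical naming rule: new name = old name + ".bak".
 * 'cap' is the full size of the 'to' buffer (PATH_MAX in the syscall).
 * Returns 0 on success, -ENAMETOOLONG if the result would not fit.
 */
int build_new_name(char *to, size_t cap, const char *from)
{
    int len = snprintf(to, cap, "%s.bak", from);

    /* len is the length the full result needs, excluding the '\0' */
    if (len < 0 || (size_t)len >= cap)
        return -ENAMETOOLONG;
    return 0;
}
```

Treating truncation as an error, rather than renaming to a silently shortened path, keeps the behaviour in line with what rename itself does for overlong names.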

After constructing the contents for to, we need to do the path lookups. Unlike renameat(), we cannot use user_path_parent(). We can, however, look at what user_path_parent() does, and do the same work — adapting to our own needs, of course. It turns out it just calls do_path_lookup() with error checking. So, the two user_path_parent() calls and their error checks can be replaced with

    error = do_path_lookup(AT_FDCWD, from, LOOKUP_PARENT, &oldnd);
    if (error)
        goto exit0;

    error = do_path_lookup(AT_FDCWD, to, LOOKUP_PARENT, &newnd);
    if (error)
        goto exit1;

Note that exit0 is a new label not found in the original renameat(). We need a new label because at exit, we only have to; but at exit0, we have both to and from. At exit1, we have to, from, and oldnd, and so on.

Next, we can reuse the bulk of sys_renameat(). It does all the hard work of renaming. To conserve space, I’ll omit my ramblings on exactly what it does, since you can trust that if rename() works, this will work too.

    error = -EXDEV;
    if (oldnd.path.mnt != newnd.path.mnt)
        goto exit2;

    old_dir = oldnd.path.dentry;
    error = -EBUSY;
    if (oldnd.last_type != LAST_NORM)
        goto exit2;

    new_dir = newnd.path.dentry;
    if (newnd.last_type != LAST_NORM)
        goto exit2;

    error = mnt_want_write(oldnd.path.mnt);
    if (error)
        goto exit2;

    oldnd.flags &= ~LOOKUP_PARENT;
    newnd.flags &= ~LOOKUP_PARENT;
    newnd.flags |= LOOKUP_RENAME_TARGET;

    trap = lock_rename(new_dir, old_dir);

    old_dentry = lookup_hash(&oldnd);
    error = PTR_ERR(old_dentry);
    if (IS_ERR(old_dentry))
        goto exit3;
    /* source must exist */
    error = -ENOENT;
    if (!old_dentry->d_inode)
        goto exit4;
    /* unless the source is a directory trailing slashes give -ENOTDIR */
    if (!S_ISDIR(old_dentry->d_inode->i_mode)) {
        error = -ENOTDIR;
        if (oldnd.last.name[oldnd.last.len])
            goto exit4;
        if (newnd.last.name[newnd.last.len])
            goto exit4;
    }
    /* source should not be ancestor of target */
    error = -EINVAL;
    if (old_dentry == trap)
        goto exit4;
    new_dentry = lookup_hash(&newnd);
    error = PTR_ERR(new_dentry);
    if (IS_ERR(new_dentry))
        goto exit4;
    /* target should not be an ancestor of source */
    error = -ENOTEMPTY;
    if (new_dentry == trap)
        goto exit5;

    error = security_path_rename(&oldnd.path, old_dentry,
                     &newnd.path, new_dentry);
    if (error)
        goto exit5;

    error = vfs_rename(old_dir->d_inode, old_dentry,
                   new_dir->d_inode, new_dentry);

At this point, all the work has been done, and all that is left is releasing the locks, memory, and so on taken by the code above. If everything succeeded, error == 0, and we fall through all of the cleanup. If we had a problem, error contains the error code, and we have jumped to the label that does exactly the cleanup needed for the point where the error occurred. If vfs_rename() (which does the actual operation) failed, we also do all of the cleanup.

However, compared to the original code, we acquired to first (released at exit) and from right after it (released at exit0), before the path lookups. So, we need to move releasing them to their correct locations: near the very end, since they were acquired first and cleanup occurs, of course, in reverse order:

exit5:
    dput(new_dentry);
exit4:
    dput(old_dentry);
exit3:
    unlock_rename(new_dir, old_dir);
    mnt_drop_write(oldnd.path.mnt);
exit2:
    path_put(&newnd.path);
exit1:
    path_put(&oldnd.path);
exit0:
    putname(from);
exit:
    __putname(to);
    return error;
}

And here we are done.

Of course, there are a lot of details to consider above in the parts we copied from sys_renameat() — and like I said in the other answer, you should not just copy code like this, but refactor the common code into a helper function; that makes maintenance much easier. Fortunately, because we kept all the checks from renameat() — we do the path manipulation before any of the renameat() code was copied — we can be sure that all the necessary checks are done. It’s just as if the user specified the manipulated path herself and called renameat().
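For comparison, this is roughly what the same operation looks like when done entirely from userspace. It is a sketch under the same made-up assumption that the new name is the old name plus ".bak"; rename_with_suffix() is a hypothetical helper, not part of any library. If this is enough for your purposes, you do not need a new syscall at all.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>      /* AT_FDCWD */
#include <limits.h>     /* PATH_MAX */
#include <stdio.h>      /* renameat(), snprintf() */

/*
 * Hypothetical userspace equivalent: build the new path, then let the
 * kernel's own renameat() do every check.
 * Returns 0 on success, -1 with errno set on failure.
 */
int rename_with_suffix(const char *oldname)
{
    char to[PATH_MAX];
    int len = snprintf(to, sizeof to, "%s.bak", oldname);

    if (len < 0 || (size_t)len >= sizeof to) {
        errno = ENAMETOOLONG;
        return -1;
    }
    return renameat(AT_FDCWD, oldname, AT_FDCWD, to);
}
```

A call such as rename_with_suffix("notes.txt") would ask the kernel to rename notes.txt to notes.txt.bak relative to the current working directory.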

If you were to do the modification after some checks have already been done, the situation would be much more complicated. You would have to think what those checks are, how your modifications impact on them, and almost always, re-do those checks.

To remind any reader, the reason you cannot just create a filename or any other string in your own syscall and then call another syscall, is that your just-created string resides on the kernel side of the kernel-userspace boundary, while syscalls expect the data to reside on the other, userspace side. While on x86 you can accidentally pierce the boundary from the kernel side, it does not mean you should do so: there are copy_from_user() and copy_to_user() and their derivatives like strncpy_from_user() that must be used for this purpose. It is not a question of having to do magic to call another syscall, but about where (in-kernel, or userspace) the data supplied is.
