Linux 2.6.39 introduced O_PATH open mode, which (roughly speaking) doesn’t really open the file at all (i.e. doesn’t create an open file description), but just gives a file descriptor that’s a handle to the unopened target. Its main use is as an argument to the *at functions (openat, etc.), and it seems to be suitable as an implementation of the POSIX 2008 O_SEARCH functionality which Linux was previously missing. However, I’ve been unable to find any good documentation on the exact semantics of O_PATH. A couple specific questions I have are:
- What operations are possible on Linux `O_PATH` file descriptors? (Only `*at` functions?)
- Is `O_PATH` ever useful with non-directories?
- How is the file descriptor bound to the underlying filesystem object, and what happens if it’s moved, deleted, etc.? Does an `O_PATH` file descriptor count as a reference that prevents the object from being freed when the last link is unlinked? Etc.

File descriptors obtained using `open(directory, O_PATH | O_DIRECTORY)` are not only useful for the `...at()` functions, but also for `fchdir()` (since kernel version 3.2.23, I believe).

There is also a recent patch for a new syscall,
`fbind()`, that would allow very long Unix domain socket names. The socket file is first created using `mknod(path, mode | S_IFSOCK, (dev_t)0)`, then opened using `open(file, O_PATH)`. The file descriptor thus obtained and a Unix domain socket descriptor are passed to `fbind()` to bind the socket to the pathname. Whether this will be included in the Linux kernel remains to be seen; even if it is, it will be years before one can rely on it being universally available. (As a workaround for too-long Unix domain socket names it would be viable sooner, though.)

I’d say
`O_PATH` is only useful for directories for now; uses for regular files may be found in the future. Other than the possibility of a future `fbind()`, or similar future syscalls, I don’t know of any use for file descriptors of files opened using `O_PATH`. Even `fstatvfs()` won’t work, on a 3.5.0 kernel at least.

In Linux, an inode (file contents and metadata) is freed only once no name links to it any more and the last open file descriptor referring to it is closed. When removing (unlinking) a file, you only remove the file name associated with the inode. So, there are two separate filesystem objects associated with a file descriptor: the name used to open the object, and the underlying inode referred to. The name is only used for path resolution, i.e. when
`open()` (or equivalent) is called. All data and metadata are in the inode.

File descriptors obtained using `O_PATH` behave (at least on kernel 3.5.0) just like normal file descriptors with respect to moving and renaming the name or name components used to open the descriptor. (The descriptor stays valid, as it refers to the inode, and the file name object was used only during path resolution. Holding the descriptor open will keep the inode resources allocated, even if the descriptor was opened `O_PATH`.)
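
To make the points above concrete, here is a rough sketch, not a definitive test. The helper name `opath_demo()` and the `/tmp/opath-demo-XXXXXX` scratch directory are made up for illustration. It assumes `/tmp` is writable and sits on an ordinary local filesystem, and that the kernel is reasonably recent. The `fstat()` step in particular needs a kernel newer than the 3.5.0 mentioned above, since open(2) documents `fstat()` on `O_PATH` descriptors as working from Linux 3.6.

```c
#define _GNU_SOURCE            /* exposes O_PATH in <fcntl.h> on glibc */
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: returns 0 if every step behaved as described,
 * otherwise the number of the first step that did not.
 * Error paths skip cleanup to keep the sketch short. */
int opath_demo(void)
{
    char dir[] = "/tmp/opath-demo-XXXXXX";   /* assumed scratch location */
    char moved[sizeof dir + 8];
    char buf[8];
    struct stat st;
    int dfd, cwd, fd, pfd;

    if (mkdtemp(dir) == NULL)
        return 1;
    snprintf(moved, sizeof moved, "%s.moved", dir);

    /* A handle to the directory, usable as dirfd for the *at() calls. */
    dfd = open(dir, O_PATH | O_DIRECTORY);
    if (dfd < 0)
        return 2;

    /* Create a file relative to the handle. */
    fd = openat(dfd, "file", O_CREAT | O_EXCL | O_WRONLY, 0600);
    if (fd < 0)
        return 3;
    close(fd);

    /* Rename the directory: the handle refers to the inode, not the
     * name, so lookups relative to it keep working. */
    if (rename(dir, moved) != 0)
        return 4;
    fd = openat(dfd, "file", O_RDONLY);
    if (fd < 0)
        return 5;
    close(fd);

    /* An O_PATH descriptor for a regular file cannot be read from:
     * read() fails with EBADF. */
    pfd = openat(dfd, "file", O_PATH);
    if (pfd < 0)
        return 6;
    errno = 0;
    if (read(pfd, buf, sizeof buf) != -1 || errno != EBADF)
        return 7;

    /* fchdir() accepts O_PATH directory descriptors; go there and back. */
    cwd = open(".", O_PATH | O_DIRECTORY);
    if (cwd < 0 || fchdir(dfd) != 0 || fchdir(cwd) != 0)
        return 8;
    close(cwd);

    /* Remove the only name of the file; the O_PATH descriptor still
     * refers to the inode, so fstat() on it succeeds. */
    if (unlinkat(dfd, "file", 0) != 0 || fstat(pfd, &st) != 0)
        return 9;
    close(pfd);

    if (rmdir(moved) != 0)
        return 10;
    close(dfd);
    return 0;
}
```

Calling `opath_demo()` from a small `main()` and printing the return value is enough to try it. A nonzero result names the first step that did not behave as described, which on an older kernel would most likely be step 8 or 9.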