I opened a file stream to a very big file using fopen. Before performing any read operation on that stream, I deleted the file using unlink(). And still, I was able to read the whole file.
I am guessing that there is a buffer associated with the stream, which holds the data of the file. But obviously that buffer will have a limit. That was the reason why I chose a_big_file whose size was 551126688 bytes or 526MB.
I want to know the exact reason behind this behavior. Here is the test code that I used.
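    #include <stdio.h>
    #include <unistd.h>

    int main(void)
    {
        FILE *fp;
        long long file_size = 0;
        size_t bytes_read;
        char buf[1];

        fp = fopen("a_big_file", "r");
        if (fp == NULL) {
            perror("fopen");
            return 1;
        }
        /* Remove the name while the stream is still open. */
        unlink("a_big_file");

        /* Count the bytes one at a time until fread() reports no more data. */
        while ((bytes_read = fread(buf, 1, 1, fp)) != 0) {
            file_size += bytes_read;
        }
        printf("file_size is %lld\n", file_size);
        fclose(fp);
        return 0;
    }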
Output: file_size is 551126688
In Unix and Unix-like operating systems, a file doesn’t actually go away until its last name has been removed and the last open file handle on it is closed. This is a very useful trick for temporary files: if you unlink the file as soon as you open it, it won’t be visible to other processes, and it will be removed from the system as soon as your program closes it, ends or crashes. That helps prevent the proliferation of orphan temp files.
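A minimal sketch of that temp-file trick, assuming a POSIX system. The function names `open_scratch_file` and `scratch_roundtrip`, the `/tmp` location and the name template are all made up for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Create a scratch file and unlink it right away.
 * Returns an open descriptor, or -1 on error.
 * The /tmp path and the name template are assumptions for this sketch. */
int open_scratch_file(void)
{
    char path[] = "/tmp/scratch-XXXXXX";
    int fd = mkstemp(path);      /* creates and opens a uniquely named file */
    if (fd == -1)
        return -1;
    if (unlink(path) == -1) {    /* drop the name; the descriptor keeps the data alive */
        close(fd);
        return -1;
    }
    return fd;
}

/* Write msg to the nameless file and read it back.
 * Returns 0 if the data round-trips, -1 otherwise. */
int scratch_roundtrip(const char *msg)
{
    char buf[128];
    size_t len = strlen(msg);
    int ok = -1;
    int fd;

    if (len >= sizeof buf)
        return -1;
    fd = open_scratch_file();
    if (fd == -1)
        return -1;
    if (write(fd, msg, len) == (ssize_t)len &&
        lseek(fd, 0, SEEK_SET) == 0 &&
        read(fd, buf, len) == (ssize_t)len &&
        memcmp(buf, msg, len) == 0)
        ok = 0;
    close(fd);                   /* last reference gone: the space can be reclaimed */
    return ok;
}
```

Calling `scratch_roundtrip("hello")` from a `main` should return 0 even though no file name exists for the data at the time it is written and read.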
Practically (glossing over some technical details here), what happens is that files on Unix file systems are reference counted. When you open a file, you get connected to the file’s inode (the structure that records where the content of the file actually lives). Unlinking the file just removes the directory entry, so the file doesn’t have a name any more. The file system will only reclaim the file’s space (i.e. the inode and its data blocks) once it isn’t in any directory entries AND nobody has it open. So your reads are not being served from a stream buffer: the data is still on disk and is read through the still-open stream. Other processes can’t open the file in the ordinary manner because they can’t map a file name to the inode.
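The reference counting can be observed with `fstat()`, which reports the number of directory entries pointing at the inode in `st_nlink`. The following is only a sketch under POSIX assumptions; the function name `link_counts_around_unlink` and the `/tmp` template are hypothetical, and some network file systems may report different counts.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/* Record the link count of a fresh file before and after its only name
 * is unlinked, while a descriptor stays open the whole time.
 * Returns 0 on success, -1 on error. */
int link_counts_around_unlink(long *before, long *after)
{
    char path[] = "/tmp/nlink-XXXXXX";   /* assumed location for the sketch */
    struct stat st;
    int rc = -1;
    int fd = mkstemp(path);

    if (fd == -1)
        return -1;
    if (fstat(fd, &st) == 0) {
        *before = (long)st.st_nlink;     /* one directory entry so far */
        if (unlink(path) == 0 && fstat(fd, &st) == 0) {
            *after = (long)st.st_nlink;  /* no names left, inode still alive */
            rc = 0;
        }
    }
    if (rc != 0)
        unlink(path);                    /* best-effort cleanup on failure */
    close(fd);                           /* now the inode can be reclaimed */
    return rc;
}
```

On a typical local file system I would expect `before` to be 1 and `after` to be 0: the second `fstat()` still succeeds because the open descriptor is itself a reference to the inode.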
Note that Unix file systems allow multiple directory entries to point to the same inode; we call that a ‘hard link’. If you do an ‘ls -l’, one of the fields is the count of hard links to that inode, and if you do an ‘ls -li’, you can see the actual inode number.
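The same information that ‘ls -li’ prints is available from `stat()`. Here is a rough sketch, again assuming POSIX. The name `hard_link_demo` and the file names are invented for the example, and `link()` can fail on file systems that do not support hard links.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/* Create a file, give it a second name with link(), and report whether
 * both names resolve to the same inode and what the link count is.
 * Returns 0 on success, -1 on error. Both names are removed afterwards. */
int hard_link_demo(int *same_inode, long *nlink)
{
    char first[] = "/tmp/hl-XXXXXX";     /* assumed location for the sketch */
    char second[64];
    struct stat a, b;
    int rc = -1;
    int fd = mkstemp(first);

    if (fd == -1)
        return -1;
    close(fd);
    snprintf(second, sizeof second, "%s.link", first);

    if (link(first, second) == 0) {      /* second directory entry, same inode */
        if (stat(first, &a) == 0 && stat(second, &b) == 0) {
            *same_inode = (a.st_dev == b.st_dev && a.st_ino == b.st_ino);
            *nlink = (long)a.st_nlink;
            rc = 0;
        }
        unlink(second);
    }
    unlink(first);
    return rc;
}
```

With two names in place, `same_inode` should come back as 1 and `nlink` as 2, matching the hard-link count column of ‘ls -l’.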