I have a very specific application where I need an auto-increment variable with persistent storage.
To be precise, I store the decimal representation of an int variable in a file. To generate the next number, I read() from the file, convert the contents back to int, add 1 and write() the result back to the file. I do NOT need concurrent access to this data. Only one thread from one process calls the functions to retrieve the auto-increment number. The program runs in an embedded environment, where no one will have access to the console, so security should not be a concern. If it matters, it runs on Linux 2.6.24 on MIPS.
The problem is, I am not getting 100% reproducible results. Sometimes I get repeated numbers, which is unacceptable for my application.
My implementation is as follows.
On starting the application, I have:
int fd = open("myfile", O_RDWR|O_CREAT|O_SYNC, S_IRWXU|S_IRWXG|S_IRWXO);
And the auto-increment functions:
int get_current(int fd)
{
    char value[SIZE];
    lseek(fd, 0, SEEK_SET);
    read(fd, value, SIZE);
    return atoi(value);
}
int get_next(int fd)
{
    char value[SIZE];
    int cur = get_current(fd);
    memset(value, 0, SIZE);
    sprintf(value, "%d", cur + 1);
    lseek(fd, 0, SEEK_SET);
    write(fd, value, SIZE);
    //fsync(fd); /* Could inserting this be the solution? */
    return (cur + 1);
}
I have intentionally left out error checking above for the sake of code readability. I have code in place to check return values of all syscalls.
The code was originally written by another person, and now that I have detected this problem, the first step to solving it is to find out what could have caused it. I am concerned that it could be related to the way file accesses are cached. I know that when I call write() I have no guarantee the data ever actually reached the physical medium, but is it safe to call read() without having called fsync() and still get predictable results? If it is, then I’m out of ideas 😉
Thanks for reading through.
Yes, it is safe to read immediately after writing. In a Unix-like system, the data is safely in the kernel buffer pool when a write() returns and will be returned to other processes that need to read the data. Similar comments apply when using O_SYNC, O_DSYNC, O_FSYNC (which ensure that data is written to disk) and to Windows systems. Clearly, an asynchronous write will not be complete when the aio_write() call returns, but it will be complete when the completion is signalled.
However, your problem arises because you are not ensuring that you have a single process or thread accessing the file at a time. You must ensure that you get serial access so that you don’t get two processes (or threads) reading from the file at the same time. This is the ‘lost update’ problem in DBMS terms.
You need to ensure that only one process has access at a time. If your processes cooperate, you can use advisory locking (via fcntl() on POSIX systems). If your processes don’t cooperate, or you’re not sure, you may need to go for mandatory locking, or use some other technique altogether.
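As a rough illustration of the advisory-locking route, here is a minimal sketch of how the read-increment-write sequence could be wrapped in an fcntl() write lock. The names lock_whole_file() and get_next_locked() are hypothetical, and SIZE is assumed to be 16 because the question does not give its value. The sketch also reads at most SIZE - 1 bytes into a zeroed buffer so that atoi() always sees a terminated string.

```c
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

#define SIZE 16 /* assumption: the question does not show the real value */

/* Hypothetical helper: take (F_WRLCK) or drop (F_UNLCK) an advisory lock
 * covering the whole file. Returns 0 on success, -1 on error. */
static int lock_whole_file(int fd, short type)
{
    struct flock fl;

    memset(&fl, 0, sizeof fl);
    fl.l_type = type;
    fl.l_whence = SEEK_SET;
    fl.l_start = 0;
    fl.l_len = 0;                    /* 0 means "through end of file" */
    return fcntl(fd, F_SETLKW, &fl); /* F_SETLKW waits until the lock is granted */
}

/* Hypothetical locked variant of get_next(): returns the new value,
 * or -1 if locking or any file operation fails. */
int get_next_locked(int fd)
{
    char value[SIZE];
    int next = -1;

    if (lock_whole_file(fd, F_WRLCK) == -1)
        return -1;

    memset(value, 0, SIZE); /* guarantees a terminating NUL for atoi() */
    if (lseek(fd, 0, SEEK_SET) != (off_t)-1 &&
        read(fd, value, SIZE - 1) >= 0) {
        int candidate = atoi(value) + 1; /* an empty file counts as 0 */

        memset(value, 0, SIZE);
        snprintf(value, SIZE, "%d", candidate);
        if (lseek(fd, 0, SEEK_SET) != (off_t)-1 &&
            write(fd, value, SIZE) == (ssize_t)SIZE)
            next = candidate;
    }

    lock_whole_file(fd, F_UNLCK); /* release even if the update failed */
    return next;
}
```

Every process that touches the file would have to call the same locked routine for this to help, since advisory locks only restrain processes that ask for them. Note also that fcntl() record locks belong to the process, so they do not serialise threads inside a single process; a mutex would be needed for that case.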