In an example from “Advanced Programming in the UNIX Environment”, the following sample program creates a file, then uses lseek to move the file offset further along, leaving a “hole” in the file. The author says the space in between is filled with “0’s”. I wanted to see whether those “0’s” would print out, so I modified the program slightly to read the file back and write it to standard output. However, I noticed that only the visible characters appeared in the output.
My question is: how does the Unix/Linux filesystem manager know not to print the bytes in between?
#include "apue.h"
#include <fcntl.h>
#include <unistd.h>

char buf1[] = "abcdefghij";
char buf2[] = "ABCDEFGHIJ";
char buf3[10];

int
main(void)
{
    int fd;
    ssize_t n;
    ssize_t m;

    if ((fd = creat("file.hole", FILE_MODE)) < 0) {
        err_sys("creat error");
    }
    if (write(fd, buf1, 10) != 10) { /* offset is now = 10 */
        err_sys("buf1 write error");
    }
    if (lseek(fd, 16380, SEEK_SET) == -1) { /* offset now = 16380 */
        err_sys("lseek error");
    }
    if (write(fd, buf2, 10) != 10) { /* offset now = 16390 */
        err_sys("buf2 write error");
    }
    close(fd);

    /* Re-open the file and copy its contents, hole included, to stdout. */
    if ((fd = open("file.hole", O_RDWR)) == -1) {
        err_sys("failed to re-open file");
    }
    while ((n = read(fd, buf3, 10)) > 0) {
        /* Write only the n bytes actually read, not a fixed 10. */
        if ((m = write(STDOUT_FILENO, buf3, n)) != n) {
            err_sys("stdout write error");
        }
    }
    if (n == -1) {
        err_sys("buf3 read error");
    }
    exit(0);
}
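The character \000 has a zero-width display representation. It is printed, but its printing is invisible. The filesystem is not skipping anything: read() returns the zero bytes of the hole, your write() sends them to standard output, and the terminal simply has nothing to draw for them. Not every codepoint is a character. In the same way, \n is printed as a line break, not as a visible character.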
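If you want to see the bytes for yourself, one option is to print every non-printable byte as an octal escape instead of sending it raw to the terminal. The following is only a minimal sketch, not the book's code: make_hole_file and dump_visible are hypothetical helper names I made up for illustration, it uses plain POSIX calls instead of apue.h, and the 0644 mode is an assumption.

```c
#include <ctype.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: recreate the file from the question
 * (10 bytes, a hole up to offset 16380, then 10 more bytes).
 * Returns 0 on success, -1 on error. */
int make_hole_file(const char *path)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;

    int ok = write(fd, "abcdefghij", 10) == 10
          && lseek(fd, 16380, SEEK_SET) != (off_t)-1
          && write(fd, "ABCDEFGHIJ", 10) == 10;

    if (close(fd) < 0)
        ok = 0;
    return ok ? 0 : -1;
}

/* Hypothetical helper: copy the file to `out`, writing printable bytes
 * as-is and every other byte as an octal escape such as \000.
 * Stores the number of zero bytes in *zeros.
 * Returns the total number of bytes read, or -1 on error. */
long dump_visible(const char *path, FILE *out, long *zeros)
{
    unsigned char buf[4096];
    ssize_t n;
    long total = 0;

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    *zeros = 0;
    while ((n = read(fd, buf, sizeof buf)) > 0) {
        for (ssize_t i = 0; i < n; i++) {
            if (buf[i] == 0)
                (*zeros)++;
            if (isprint(buf[i]))
                fputc(buf[i], out);
            else
                fprintf(out, "\\%03o", buf[i]);
        }
        total += n;
    }
    close(fd);
    return n < 0 ? -1 : total;
}
```

Calling make_hole_file("file.hole") and then dump_visible("file.hole", stdout, &zeros) should print abcdefghij, a long run of \000 escapes, and then ABCDEFGHIJ, with the zero count covering the whole gap between the two writes. The standard od -c utility gives a similar view without writing any code.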