So I was going through K&R second edition doing the exercises. Feeling pretty confident after doing a few exercises, I thought I’d check the actual implementations of these functions. It was then that my confidence fled the scene. I could not understand any of it.
For example, I checked getchar():
Here is the prototype in libio/stdio.h:
extern int getchar (void);
So I followed it through and got this:
__STDIO_INLINE int
getchar (void)
{
  return _IO_getc (stdin);
}
Again I followed it, this time to libio/getc.c:
int
_IO_getc (fp)
     FILE *fp;
{
  int result;
  CHECK_FILE (fp, EOF);
  _IO_acquire_lock (fp);
  result = _IO_getc_unlocked (fp);
  _IO_release_lock (fp);
  return result;
}
And I’m taken to another header file libio/libio.h, which is pretty cryptic:
#define _IO_getc_unlocked(_fp) \
       (_IO_BE ((_fp)->_IO_read_ptr >= (_fp)->_IO_read_end, 0) \
        ? __uflow (_fp) : *(unsigned char *) (_fp)->_IO_read_ptr++)
Which is where I finally ended my journey.
My question is pretty broad. What does all this mean? I could not for the life of me figure out anything logical by looking at the code. It looks like a bunch of code abstracted away layer after layer.
More importantly, when does it really get the character from stdin?
_IO_getc_unlocked is an inlinable macro. The idea is that you can get a character from the stream without having to call a function, making it hopefully fast enough to use in tight loops, etc.
Let’s take it apart one layer at a time. First, what is _IO_BE? _IO_BE (expr, res) expands to __builtin_expect ((expr), res), a hint to the compiler that expr will usually evaluate to res. It’s used to structure code flow to be faster when the expectation is true, but has no other semantic effect. So we can get rid of it, which leaves a plain conditional: if (_fp)->_IO_read_ptr >= (_fp)->_IO_read_end, call __uflow (_fp); otherwise return *(unsigned char *) (_fp)->_IO_read_ptr++.
The same logic is clearer when written as an inline function instead of a macro.
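To make the fast path and slow path concrete, here is a minimal, self-contained sketch of the same pattern. It does not touch glibc internals: struct toy_file, toy_open, toy_uflow, toy_getc_unlocked and toy_demo are all hypothetical names invented for this illustration, the "device" is just an in-memory string, and the 4-byte buffer is deliberately tiny so that refills are easy to count. Only read_ptr and read_end are meant to mirror _IO_read_ptr and _IO_read_end.

```c
#include <assert.h>
#include <stdio.h>   /* EOF */
#include <string.h>  /* strlen, memcpy, strcmp */

/* Hypothetical stand-in for FILE: only the fields this sketch needs. */
struct toy_file {
    const char *src;          /* pretend input device: an in-memory string */
    size_t src_len;           /* total bytes the device can deliver */
    size_t src_pos;           /* bytes already copied into buf */
    unsigned char buf[4];     /* tiny on purpose, so refills happen often */
    unsigned char *read_ptr;  /* next unread byte (role of _IO_read_ptr) */
    unsigned char *read_end;  /* one past last valid byte (role of _IO_read_end) */
    int refills;              /* how many times the slow path did real work */
};

void toy_open(struct toy_file *f, const char *s)
{
    f->src = s;
    f->src_len = strlen(s);
    f->src_pos = 0;
    f->read_ptr = f->read_end = f->buf;  /* empty buffer: first read must refill */
    f->refills = 0;
}

/* Slow path (role of __uflow): refill buf, return the first new byte or EOF. */
int toy_uflow(struct toy_file *f)
{
    size_t n = f->src_len - f->src_pos;
    if (n == 0)
        return EOF;
    if (n > sizeof f->buf)
        n = sizeof f->buf;
    memcpy(f->buf, f->src + f->src_pos, n);
    f->src_pos += n;
    f->read_ptr = f->buf;
    f->read_end = f->buf + n;
    f->refills++;
    return *f->read_ptr++;
}

/* Fast path: a pointer compare and an increment, with no call
   unless the buffer is exhausted. */
static inline int toy_getc_unlocked(struct toy_file *f)
{
    if (f->read_ptr >= f->read_end)
        return toy_uflow(f);
    return *f->read_ptr++;  /* unsigned char, so a 0xff byte is 255, not EOF */
}

/* Usage: read 12 bytes through the 4-byte buffer, return the refill count. */
int toy_demo(void)
{
    struct toy_file f;
    char out[32];
    size_t n = 0;
    int c;

    toy_open(&f, "hello, world");
    while ((c = toy_getc_unlocked(&f)) != EOF)
        out[n++] = (char)c;
    out[n] = '\0';

    assert(strcmp(out, "hello, world") == 0);
    return f.refills;
}
```

In this toy version, reading 12 characters costs 3 refills; the other 9 reads never leave the inline fast path. A real stdio buffer is typically several kilobytes, so the ratio is far more lopsided.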
In short, we have a pointer into a buffer, and a pointer to the end of the buffer. We check if the pointer has run past the buffered data; if not, we increment it and return whatever character was at the old value. Otherwise we call __uflow to refill the buffer and return the newly read character; that refill is where characters are actually fetched from stdin.
As such, this allows us to avoid the overhead of a function call until we actually need to do IO to refill the input buffer.
Keep in mind that standard library functions can be complicated like this; they can also use extensions to the C language (such as __builtin_expect) that are NOT standard and may NOT work on all compilers. They do this because they need to be fast, and because they can make assumptions about what compiler they’re using. Generally speaking your own code should not use such extensions unless absolutely necessary, as it’ll make porting to other platforms more difficult.
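If you do decide you need such a hint in your own code, one common way to limit the portability cost is to hide the extension behind a macro with a plain fallback. The sketch below assumes a GCC-compatible compiler defines __GNUC__; MY_LIKELY, MY_UNLIKELY and bounded_len are hypothetical names made up for this example, not part of any library.

```c
#include <stddef.h>  /* size_t */

/* Hypothetical wrapper names; glibc's own equivalent is _IO_BE. */
#if defined(__GNUC__) || defined(__clang__)
#  define MY_LIKELY(x)   __builtin_expect(!!(x), 1)
#  define MY_UNLIKELY(x) __builtin_expect(!!(x), 0)
#else
/* Fallback for other compilers: same value, no hint. */
#  define MY_LIKELY(x)   (!!(x))
#  define MY_UNLIKELY(x) (!!(x))
#endif

/* Count bytes before the first NUL, never looking past max bytes.
   The hint says "hitting the NUL is the rare case"; it affects only
   how the compiler lays out the code, never the result. */
size_t bounded_len(const char *s, size_t max)
{
    size_t n = 0;
    while (n < max) {
        if (MY_UNLIKELY(s[n] == '\0'))
            break;
        n++;
    }
    return n;
}
```

With the fallback branch, the same source still compiles on a compiler that lacks __builtin_expect; you only lose the optimization hint.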