I have a question about accessing invalid data. How can the OS raise a segmentation fault in a scenario like this:
Suppose a chunk of data is 100 bytes long and aligned at the beginning of a 4K page. If we access the valid data within the first 100 bytes of the page, this will load the page into memory and put the page table entry in the TLB. If we now try to access some invalid data between byte 100 and the end of the 4K page, will we be allowed to access it, since the entry is already in the page table?
That’s correct: memory protection works at page granularity, so the hardware cannot fault on an access that lands inside a page that is already mapped. But typically you’re not allocating memory directly from the operating system. You usually allocate it via some library function (new or malloc, etc). The library function takes the 4KB from the OS (usually it requests more than 4KB in one chunk, too) and splits it up into the actual chunks that you ask for. So usually when you ask for 100 bytes of memory, that 100 bytes will be “wedged” in between two other allocations that you’ve made.

This is why it’s “undefined behaviour” when you access data off the end of an array: you might get a segmentation fault, you might trash some other variable that happens to be stored there, or you might be OK and it actually works (for a while at least).
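If you want to see the page-granularity part for yourself, here is a minimal sketch, assuming a POSIX system where mmap supports MAP_ANONYMOUS (Linux, macOS, the BSDs). The function name touch_rest_of_page and the VALID_BYTES constant are made up for illustration. It maps one whole page straight from the OS, pretends only the first 100 bytes are "our" data, and then reads and writes every remaining byte of the page. Because the whole page is mapped, none of those accesses should fault.

```c
#define _DEFAULT_SOURCE   /* for MAP_ANONYMOUS on glibc in strict modes */
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

#define VALID_BYTES 100   /* hypothetical: the part of the page we call "our" data */

/* Maps one anonymous page, fills only the first VALID_BYTES, then reads
 * and writes the rest of the page. Returns the number of bytes touched
 * past the "valid" data, or -1 if the mapping could not be created. */
long touch_rest_of_page(void)
{
    long page_size = sysconf(_SC_PAGESIZE);   /* commonly 4096 */
    assert(page_size > VALID_BYTES);

    unsigned char *page = mmap(NULL, (size_t)page_size,
                               PROT_READ | PROT_WRITE,
                               MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (page == MAP_FAILED)
        return -1;

    memset(page, 'A', VALID_BYTES);           /* the "valid" 100 bytes */

    long touched = 0;
    for (long i = VALID_BYTES; i < page_size; i++) {
        /* Read and write a byte beyond the "valid" data. The page is
         * mapped read/write as a whole, so the MMU has no reason to fault. */
        page[i] = (unsigned char)(page[i] + 1);
        touched++;
    }

    munmap(page, (size_t)page_size);
    return touched;
}
```

Calling touch_rest_of_page() from main should return the page size minus 100 without any crash. Note that this is well defined only because the whole page was requested from the OS. Doing the same thing past the end of a 100-byte malloc block is undefined behaviour in C, even though the hardware will usually let it through for the same reason.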