I have read a few books and articles about convolutional neural networks. I think I understand the concept, but I don't know how to build one like the one in the image below:

(source: what-when-how.com)
From the 28×28 normalized pixel INPUT we get 4 feature maps of size 24×24. But how are they obtained? By resizing the INPUT image? By performing image transformations, and if so, what kind? Or by cutting the input image into 4 pieces of size 24×24, one from each corner? I don't understand the process; to me it seems they cut up or resize the image into smaller images at each step. Please help, thanks.
This is the MATLAB help file for the CONV2 function, which is used in MATLAB CNN implementations (to compute the convolutional layers). Read it carefully and you will find your answer.
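To make the size arithmetic concrete, here is a rough sketch in plain Python of a "valid" 2-D convolution, where the kernel is only placed where it fits entirely inside the image. The 5×5 kernel size is my assumption, inferred from 28 − 5 + 1 = 24; the figure itself does not state it. The function name `conv2_valid`, the toy image, and the random kernel weights are all made up for illustration. In a trained network the weights are learned, a bias and a nonlinearity are usually applied afterwards, and many frameworks skip the kernel flip (cross-correlation).

```python
import random

def conv2_valid(image, kernel):
    """2-D convolution that keeps only positions where the kernel fits fully inside the image."""
    ih, iw = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    oh, ow = ih - kh + 1, iw - kw + 1  # output shrinks: 28 - 5 + 1 = 24

    # True convolution flips the kernel in both directions before sliding it.
    flipped = [row[::-1] for row in kernel[::-1]]

    out = [[0.0] * ow for _ in range(oh)]
    for r in range(oh):
        for c in range(ow):
            acc = 0.0
            # Weighted sum of the kh x kw patch whose top-left corner is (r, c).
            for i in range(kh):
                for j in range(kw):
                    acc += image[r + i][c + j] * flipped[i][j]
            out[r][c] = acc
    return out

# Toy 28x28 "normalized" input (values in [0, 1]); a real input would be an image.
image = [[(r * 28 + c) / 783.0 for c in range(28)] for r in range(28)]

# Four hypothetical 5x5 kernels; random here, learned during training in a real CNN.
rng = random.Random(0)
kernels = [[[rng.uniform(-1.0, 1.0) for _ in range(5)] for _ in range(5)]
           for _ in range(4)]

# One feature map per kernel: same input, different weights.
feature_maps = [conv2_valid(image, k) for k in kernels]

print(len(feature_maps), "feature maps of size",
      f"{len(feature_maps[0])}x{len(feature_maps[0][0])}")
```

So the image is neither resized nor cut into four corner pieces. Each feature map comes from sliding a different small kernel over the whole input, and the border is lost because the kernel cannot be centered on the outermost pixels.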