I’m working on an assembler for a hypothetical machine (the SMAC-0 machine) and need some help with memory allocation.
I’ll be getting and tokenizing strings from a given file and will save these tokens in pointers.
Here’s a code snippet:
tokenCount = sscanf(buffer,"%s %s %s %s", tokenOne, tokenTwo, tokenThree, tokenFour);
where tokenCount is an integer, buffer is the temporary buffer that stores the line taken from the input file, and tokenOne, tokenTwo, tokenThree, and tokenFour are character pointers.
The strings accepted from the file can have one to four words:
Example:
READ N
N: DS 1
SUM: DS 1
LOOP: MOVER AREG N
ADD AREG N
COMP AREG ='5'
BC LE LOOP
MOVEM AREG SUM
PRINT SUM
STOP
My queries are:
(That question also applies to the buffer pointer, since the labels (e.g. LOOP, N, SUM) can be of variable sizes.)
scanf() or other input functions like gets(), do the same?
You should declare your token buffers large enough. To be on the safe side, it’s a good idea to make all of them as large as the input buffer itself, since no single token can be longer than the line it came from. See the thread How to prevent scanf causing a buffer overflow in C? for more information.
If you’re using the GNU toolchain, you can make use of an extension (provided by the GNU C library rather than the compiler itself) that dynamically allocates buffers on your behalf. Check out Dynamic allocation with scanf().
EXAMPLES:
Using predefined buffers for the scanned tokens:
Note all tokens have the same size as the input buffer:
The program produces the following output (I cleaned up the formatting a little bit to improve readability):
If you want to store the scanned tokens for later processing, you’ll have to copy them somewhere else in the while-loop. You can use the function strlen to get the size of the token (excluding the trailing string terminator ‘\0’).
Using dynamic memory allocation for tokens:
Like I said, you could also let scanf allocate buffers for you dynamically. The scanf(3) man page states that you can use GNU extensions ‘a’ or ‘m’ to do that. Specifically it says:
I couldn’t get scanf to work using the ‘a’ modifier. However, there’s also the ‘m’ modifier which does the same thing (and more):