My understanding was that arrays were simply constant pointers to a sequence of values, and when you declared an array in C, you were declaring a pointer and allocating space for the sequence it points to.
But this confuses me. The following code:
char y[20];
char *z = y;
printf("y size is %zu\n", sizeof(y));
printf("y is %p\n", (void *)y);
printf("z size is %zu\n", sizeof(z));
printf("z is %p\n", (void *)z);
when compiled with Apple GCC gives the following result:
y size is 20
y is 0x7fff5fbff930
z size is 8
z is 0x7fff5fbff930
(My machine is 64-bit; pointers are 8 bytes long.)
If ‘y’ is a constant pointer, why does it have a size of 20, like the sequence of values it points to? Is the variable name ‘y’ replaced by a memory address at compile time whenever it is appropriate? Are arrays, then, some sort of syntactic sugar in C that is just translated to pointer stuff when compiled?
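Here’s the exact language from the C standard (n1256), section 6.3.2.1, paragraph 3:
“Except when it is the operand of the sizeof operator or the unary & operator, or is a string literal used to initialize an array, an expression that has type ‘array of type’ is converted to an expression with type ‘pointer to type’ that points to the initial element of the array object and is not an lvalue. If the array object has register storage class, the behavior is undefined.”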
The important thing to remember here is that there is a difference between an object (in C terms, meaning something that takes up memory) and the expression used to refer to that object.
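When you declare an array such as
int a[10];
the object designated by the expression a is an array (i.e., a contiguous block of memory large enough to hold 10 int values), and the type of the expression a is “10-element array of int”, or int [10]. If the expression a appears in a context other than as the operand of the sizeof or & operators, then its type is implicitly converted to int *, and its value is the address of the first element.
In the case of the sizeof operator, if the operand is an expression of type T [N], then the result is the number of bytes in the array object, not in a pointer to that object: N * sizeof (T).
In the case of the & operator, the value is the address of the array, which is the same as the address of the first element of the array, but the type of the expression is different: given the declaration T a[N];, the type of the expression &a is T (*)[N], or pointer to N-element array of T. The value is the same as a or &a[0] (the address of the array is the same as the address of the first element in the array), but the difference in types matters. For example, given the declarations int *p = a; and int (*ap)[10] = &a;, p and ap start out holding the same address, but incrementing each one gives a different result.
IOW, advancing p adds sizeof (int) (4 on a typical platform) to the original value, whereas advancing ap adds 10 * sizeof (int) (40).
More standard language (section 6.5.2.1, paragraph 2):
“A postfix expression followed by an expression in square brackets [] is a subscripted designation of an element of an array object. The definition of the subscript operator [] is that E1[E2] is identical to (*((E1)+(E2))). Because of the conversion rules that apply to the binary + operator, if E1 is an array object (equivalently, a pointer to the initial element of an array object) and E2 is an integer, E1[E2] designates the E2-th element of E1 (counting from zero).”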
Thus, when you subscript an array expression, what happens under the hood is that the offset from the address of the first element in the array is computed and the result is dereferenced. The expression
a[i]
is equivalent to
*(a + i)
which is equivalent to
*(i + a)
which is equivalent to
i[a]
Yes, array subscripting in C is commutative; for the love of God, never do this in production code.
Since array subscripting is defined in terms of pointer operations, you can apply the subscript operator to expressions of pointer type as well as array type: for a pointer p, p[i] means *(p + i).
Here’s a handy table to remember some of these concepts:
Declaration: T a[N];

Expression   Type             Converts to   Value
----------   ----             -----------   -----
a            T [N]            T *           Address of the first element in a; identical to writing &a[0]
&a           T (*)[N]                       Address of the array; value is the same as above, but the type is different
sizeof a     size_t                         Number of bytes contained in the array object (N * sizeof (T))
*a           T                              Value at a[0]
a[i]         T                              Value at a[i]
&a[i]        T *                            Address of a[i]

Declaration: T a[N][M];

Expression   Type             Converts to   Value
----------   ----             -----------   -----
a            T [N][M]         T (*)[M]      Address of the first subarray (&a[0])
&a           T (*)[N][M]                    Address of the array (same value as above, but different type)
sizeof a     size_t                         Number of bytes contained in the array object (N * M * sizeof (T))
*a           T [M]            T *           Value of a[0], which is the address of the first element of the first subarray (same as &a[0][0])
a[i]         T [M]            T *           Value of a[i], which is the address of the first element of the i'th subarray
&a[i]        T (*)[M]                       Address of the i'th subarray; same value as above, but different type
sizeof a[i]  size_t                         Number of bytes contained in the i'th subarray object (M * sizeof (T))
*a[i]        T                              Value of the first element of the i'th subarray (a[i][0])
a[i][j]      T                              Value at a[i][j]
&a[i][j]     T *                            Address of a[i][j]

Declaration: T a[N][M][O];

Expression   Type             Converts to
----------   ----             -----------
a            T [N][M][O]      T (*)[M][O]
&a           T (*)[N][M][O]
*a           T [M][O]         T (*)[O]
a[i]         T [M][O]         T (*)[O]
&a[i]        T (*)[M][O]
*a[i]        T [O]            T *
a[i][j]      T [O]            T *
&a[i][j]     T (*)[O]
*a[i][j]     T
a[i][j][k]   T

From here, the pattern for higher-dimensional arrays should be clear.
So, in summary: arrays are not pointers. In most contexts, array expressions are converted to pointer types.