I understand that the syntax char * = “stringLiteral”; has been deprecated and may not even work in the future. What I don’t understand is WHY.
I searched the net and Stack Overflow, and although there are many echoes confirming that char * = “stringLiteral”; is wrong and that const char * = “stringLiteral”; is correct, I have yet to find information about WHY said syntax is wrong. In other words, I’d like to know what the issue really is under the hood.
ILLUSTRATING MY CONFUSION
CODE SEGMENT 1 – EVIL WAY (Deprecated)
char* szA = "stringLiteralA"; //Works fine as expected. Auto null terminated.
std::cout << szA << std::endl;
szA = "stringLiteralB"; //Works, so reassigning to a literal of the same length is OK.
std::cout << szA << std::endl;
szA = "stringLiteralC_blahblah"; //Works, so reassigning to a longer literal is OK too.
std::cout << szA << std::endl;
Output:
stringLiteralA
stringLiteralB
stringLiteralC_blahblah
So what exactly is the problem here? Seems to work just fine.
CODE SEGMENT 2 (The “OK” way)
const char* szA = "stringLiteralA"; //Works fine as expected. Auto null term.
std::cout << szA << std::endl;
szA = "stringLiteralB"; //Works, so reassigning to a literal of the same length is OK.
std::cout << szA << std::endl;
szA = "stringLiteralC_blahblah"; //Works, so reassigning to a longer literal is OK too.
std::cout << szA << std::endl;
Output:
stringLiteralA
stringLiteralB
stringLiteralC_blahblah
Also works fine. No difference. What is the point of adding const?
CODE SEGMENT 3
const char* const szA = "stringLiteralA"; //Works. Auto null term.
std::cout << szA << std::endl;
szA = "stringLiteralB"; //Breaks here. Can't reassign.
I am only illustrating here that, in order to protect the variable itself against reassignment, you have to write const char* const szA = “something”; .
I don’t see the point for deprecation or any issues. Why is this syntax deprecated and considered an issue?
const char * is a pointer (*) to a constant (const) char (pointer definitions are easily read from right to left). The point here is to protect the content, not the pointer: as the standard says, modifying a string literal through such a pointer results in undefined behavior. This has its roots in the fact that C and C++ compilers typically group the strings used throughout the program in a single memory zone, and are allowed to use the same memory location for instances of the same string used in unrelated parts of the program (to minimize executable size and memory footprint). If modifying string literals were allowed, a single change could affect other, unrelated instances of the same literal, which obviously isn’t a great idea.
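To tie this back to the question's code segments, here is a minimal sketch. The function name repoint_and_measure is made up for illustration. It shows that the assignments in the question only re-point the pointer and never write to a literal, which is why both the char* and the const char* versions appear to behave identically.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper mirroring the question's segments 1 and 2.
std::size_t repoint_and_measure() {
    // Pointer to const char: the characters are read-only, the pointer is not.
    const char* p = "stringLiteralA";

    // Fine: this changes where p points; no literal is modified.
    p = "stringLiteralC_blahblah";

    // p[0] = 'X';  // would not compile: writing to a const char

    // const char* const q = "x"; q = "y";  // would not compile: q itself is const

    return std::strlen(p);  // length of the literal p currently points at
}
```

The problem the const is meant to catch only shows up when code writes through the pointer, which none of the three segments in the question does.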
In fact, with most modern compilers (on hardware that supports memory protection) the memory area of the string table is read-only, so if you attempt to modify a string literal your program crashes. Adding const to pointers that refer to string literals makes these mistakes immediately evident as compilation errors instead of crashes.
By the way, notice that the implicit conversion of a string literal to a non-const char * is just a concession to backwards compatibility with pre-standard libraries (written when const wasn’t part of the C language yet); it was deprecated in C++98/03 and removed in C++11. As said above, the standard has always said that changing string literals is UB.
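As a rough illustration of "compilation error instead of crash", the sketch below uses type traits to check at compile time what the type system allows, without actually triggering the undefined behavior. The names Deref and capitalize_copy are hypothetical, chosen only for this example. It also shows the usual way to get a string you are allowed to modify: initialise an array from the literal, so that it holds its own copy.

```cpp
#include <string>
#include <type_traits>
#include <utility>

// Type produced by dereferencing a pointer of type Ptr (char& or const char&).
template <typename Ptr>
using Deref = decltype(*std::declval<Ptr>());

// Writing through a const char* is rejected by the compiler...
static_assert(!std::is_assignable<Deref<const char*>, char>::value,
              "cannot assign through a pointer to const char");

// ...while writing through a plain char* compiles, even when the pointer
// refers to a string literal, in which case the write is undefined behavior.
static_assert(std::is_assignable<Deref<char*>, char>::value,
              "assignment through char* is accepted by the type system");

// Hypothetical helper: an array initialised from a literal owns a writable copy.
std::string capitalize_copy() {
    char buffer[] = "stringLiteralA";  // copies the characters, including '\0'
    buffer[0] = 'S';                   // well-defined: modifies the copy only
    return std::string(buffer);
}
```

Calling capitalize_copy() returns "StringLiteralA"; the literal itself is never touched.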