I think I understand the most basic data types of C, such as short, int, long and float; that is, all the numerical types. The compiler needs to know these types in order to perform the right operations on the right numbers, for example to use the FPU to add two float values.
But when it comes to characters I am a little lost. I know that the basic C data type char is there for encoding ASCII characters. What I don't understand is why you need a separate data type for characters at all. Why couldn't you just use a 1-byte integer value to store an ASCII character? When you call printf, you specify the data type in the call, so you could tell printf that the integer represents an ASCII character. I don't know how cout resolves the data type, but I guess you could specify it somehow.
Another thing: when you want to use Unicode, you must use the data type wchar_t. But what if I would like to use some other encoding, for example an ISO or Windows encoding instead of UTF? Because wchar_t encodes characters as UTF-16 or UTF-32 (I read that it is compiler specific). And what if I wanted to use, say, some imaginary new 8-byte text encoding? Which data type should I use for it? I am quite confused by this, because I always expected that if I wanted to use UTF-32 instead of ASCII, I would just tell the compiler "get the UTF-32 value of the character I typed and save it into a 4-byte character field." I thought that the text encoding is dealt with at the end, by the print function for example, and that I just need to specify the encoding for the compiler to use. Since Windows doesn't use ASCII in Win32 apps, I guess the C compiler must convert the character I typed to ASCII from whatever form Windows sends it to the C editor in.
And the last thing: what if I want to use, for example, a 25-byte integer for some heavy math operations? C has no define-it-yourself data type. Yes, I know this would be difficult, since all the math operations would have to be reimplemented, because the CPU cannot add 25-byte numbers together directly. But is there a way to do it? Or is there a math library for it? What if I want to compute pi to 1000000000000000 digits? 🙂
I know my question is pretty long, but I wanted to explain my thoughts as well as I can in English; since it is not my native language, that is difficult. I believe there is a simple answer to my question(s), something I missed that explains everything. I have read a lot about text encodings and C tutorials, but nothing about this. Thank you for your time.
Your question is very broad. I'll try to address some of the specific issues you raised; hopefully that will get you a bit more sorted out.
The `char` type can be thought of as just another numerical type, like `int`, `short` and `long`. It is perfectly fine to write `char a = 3;`. The difference is that with `char` the compiler gives you some added value. Instead of just numbers, you can also assign character constants, as in `char a = 'U';`, and the variable then holds the ASCII value of that character. You can also initialize arrays of characters from string literals, like so: `char s[] = "hello";`. This doesn't change the fact that `char` is still a numeric type and a string is just an array of numbers. If you look at the memory of the string, you'll see the ASCII codes of its characters.
The choice of `char` being 1 byte is arbitrary and is largely kept this way in C for historical reasons. More modern languages such as C# and Java define `char` to be 2 bytes.
You don't need "another" type for characters. `char` is just the numeric type that holds a single signed or unsigned byte, the same way `short` is the numeric type that holds a signed word (16 bits on most platforms). The fact that this data type is used for characters and strings is just syntactic sugar provided by the compiler.
A 1-byte integer and a `char` are the same thing. `printf()` works with `char` strings because that is how C was designed; if it were designed today it would possibly work with wider units. Indeed, there is a version of `printf()` that works with wide characters, called `wprintf()`; on Windows those wide characters are 16 bits each.
The type `wchar_t`, on Windows, is just another name for a 16-bit unsigned integer. For C code, the Windows header files contain a declaration along the lines of `typedef unsigned short wchar_t;` which makes this happen, so in C the two can be used interchangeably (in C++ `wchar_t` is a built-in type of the same size). The advantage of writing `wchar_t` is that whoever reads your code knows that you mean characters rather than numbers. Another reason is that if Microsoft ever decided to switch to UTF-32, all they would need to do in principle is change the typedef to a 32-bit type (in reality this would be a lot more complicated to achieve, so the change is unlikely in the foreseeable future).
If you want to use some 8-bit encoding that is not ASCII, for instance the Hebrew encoding called "Windows-1255", you just use `char`. There are many such encodings, but these days Unicode is almost always preferable. There is in fact an encoding of Unicode that fits in strings of 8-bit units: UTF-8. If you are dealing with UTF-8 strings, you should work with the `char` data type. Nothing limits it to ASCII; since it is just a number, it can mean anything.
Working with very long numbers like the ones you describe is done with arbitrary-precision ("bignum") arithmetic. C has no built-in type for this, although some other languages ship one (C#, for example, has `System.Numerics.BigInteger`). One simple representation, known as binary-coded decimal (BCD), handles a number much like a string: every digit of the decimal representation is stored in 4 bits, so an 8-bit variable can hold the values 0-99, a 3-byte array can hold the values 0-999999, and so on. This way you can store numbers of any range.
The downside is that calculations on such numbers are a lot slower than calculations on normal binary numbers.
There are libraries that do this kind of thing in C; GMP (the GNU Multiple Precision Arithmetic Library) is a well-known example.