I’m learning the C language on Linux now and I’ve come across a little

Question

Asked: June 1, 2026 (2026-06-01T09:55:53+00:00)

I’m learning the C language on Linux now and I’ve come across a slightly weird situation.

As far as I know, standard C’s char data type is ASCII, 1 byte (8 bits). That should mean it can hold only ASCII characters.

In my program I use char input[], which is filled by the getchar function, as in this pseudocode:

char input[20];
int z, i;
for (i = 0; i < 20; i++)
{
   z = getchar();   /* returns the next byte of input as an int, or EOF */
   input[i] = z;    /* stores that single byte in the array */
}

The weird thing is that it works not only for ASCII characters, but for any character I can think of, such as @&@{čřžŧ¶'`[łĐŧđĐ¶←^€~[←^ø{&}čž on the input.

My question is: how is this possible? It seems to be one of the many beautiful exceptions in C, but I would really appreciate an explanation. Is it a matter of the OS, the compiler, or some hidden extra feature of the language?

Thanks.

1 Answer

Editorial Team · Answer 1 · 2026-06-01T09:55:55+00:00

There is no magic here. The C language gives you access to the raw bytes as they are stored in the computer's memory.
If your terminal is using UTF-8 (which is likely), non-ASCII characters take more than one byte in memory. When you display them again, it is the terminal that converts each of these byte sequences into a single displayed character.

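Just change your code to null-terminate the array and print the strlen of the string, and you will see what I mean: strlen counts bytes, so for non-ASCII input it is larger than the number of characters you typed.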

To properly handle non-ASCII UTF-8 characters in C, you have to use a library that handles them for you, such as GLib, Qt, or many others.
