Sign Up

Sign Up to our social questions and Answers Engine to ask questions, answer people’s questions, and connect with other people.

Have an account? Sign In

Have an account? Sign In Now

Sign In

Login to our social questions & Answers Engine to ask questions answer people’s questions & connect with other people.

Sign Up Here

Forgot Password?

Don't have account, Sign Up Here

Forgot Password

Lost your password? Please enter your email address. You will receive a link and will create a new password via email.

Have an account? Sign In Now

You must login to ask a question.

Forgot Password?

Need An Account, Sign Up Here

Please briefly explain why you feel this question should be reported.

Please briefly explain why you feel this answer should be reported.

Please briefly explain why you feel this user should be reported.

Sign InSign Up

The Archive Base

The Archive Base Logo The Archive Base Logo

The Archive Base Navigation

  • SEARCH
  • Home
  • About Us
  • Blog
  • Contact Us
Search
Ask A Question

Mobile menu

Close
Ask a Question
  • Home
  • Add group
  • Groups page
  • Feed
  • User Profile
  • Communities
  • Questions
    • New Questions
    • Trending Questions
    • Must read Questions
    • Hot Questions
  • Polls
  • Tags
  • Badges
  • Buy Points
  • Users
  • Help
  • Buy Theme
  • SEARCH
Home/ Questions/Q 8495317
In Process

The Archive Base Latest Questions

Editorial Team
  • 0
Editorial Team
Asked: June 10, 20262026-06-10T23:32:27+00:00 2026-06-10T23:32:27+00:00

I am in the process of making a small program that reads a file,

  • 0

I am in the process of making a small program that reads a file, that contains UTF-8 elements, char by char. After reading a char it compares it with a few other characters and if there is a match it replaces the character in the file with an underscore ‘_’.

(Well, it actually makes a duplicate of that file with specific letters replaced by underscores.)

I’m not sure where exactly I’m messing up here but it’s most likely everywhere.

Here is my code:

   FILE *fpi;
   FILE *fpo;
   char ifilename[FILENAME_MAX];
   char ofilename[FILENAME_MAX];
   wint_t sample;


   fpi = fopen(ifilename, "rb");
   fpo = fopen(ofilename, "wb");

   while (!feof(fpi)) {
     fread(&sample, sizeof(wchar_t*), 1, fpi);

     if ((wcscmp(L"ά", &sample) == 0) || (wcscmp(L"ε", &sample) == 0)  ) {
   fwrite(L"_", sizeof(wchar_t*), 1, fpo);

     } else {
       fwrite(&sample, sizeof(wchar_t*), 1, fpo);

     }
   } 

I have omitted the code that has to do with the filename generation because it has nothing to offer to the case. It is just string manipulation.

If I feed this program a file containing the words γειά σου κόσμε. I would want it to return this:
γει_ σου κόσμ_.

Searching the internet didn’t help much as most results were very general or talking about completely different things regarding UTF-8. It’s like nobody needs to manipulate single characters for some reason.

Anything pointing me the right way is most welcome.
I am not, necessarily, looking for a straightforward fixed version of the code I submitted, I would be grateful for any insightful comments helping me understand how exactly the wchar mechanism works. The whole wbyte, wchar, L, no-L, thing is a mess to me.

Thank you in advance for your help.

  • 1 1 Answer
  • 0 Views
  • 0 Followers
  • 0
Share
  • Facebook
  • Report

Leave an answer
Cancel reply

You must login to add an answer.

Forgot Password?

Need An Account, Sign Up Here

1 Answer

  • Voted
  • Oldest
  • Recent
  • Random
  1. Editorial Team
    Editorial Team
    2026-06-10T23:32:29+00:00Added an answer on June 10, 2026 at 11:32 pm

    First of all, please do take the time to read this great article, which explains UTF8 vs Unicode and lots of other important things about strings and encodings: http://www.joelonsoftware.com/articles/Unicode.html

    What you are trying to do in your code is read in unicode character by character, and do comparisons with those. That’s won’t work if the input stream is UTF8, and it’s not really possible to do with quite this structure.

    In short: Fully unicode strings can be encoded in several ways. One of them is using a series of equally-sized “wide” chars, one for each character. That is what the wchar_t type (sometimes WCHAR) is for. Another way is UTF8, which uses a variable number of raw bytes to encode each character, depending on the value of the character.

    UTF8 is just a stream of bytes, which can encode a unicode string, and is commonly used in files. It is not the same as a string of WCHARs, which are the more common in-memory representation. You can’t poke through a UTF8 stream reliably, and do character replacements within it directly. You’ll need to read the whole thing in and decode it, and then loop through the WCHARs that result to do your comparisons and replacement, and then map that result back to UTF8 to write to the output file.

    On Win32, use MultiByteToWideChar to do the decoding, and you can use the corresponding WideCharToMultiByte to go back.

    When you use a "string literal" with regular quotes, you’re creating a nul-terminated ASCII string (char*), which does not support Unicode. The L"string literal" with the L prefix will create a nul-terminated string of WCHARs (wchar_t *), which you can use in string or character comparisons. The L prefix also works with single-quote character literals, like so: L'ε'


    As a commenter noted, when you use fread/fwrite, you should be using sizeof(wchar_t) and not its pointer type, since the amount you are trying to read/write is an actual wchar, not the size of a pointer to one. This advice is just code feedback independent of the above– you don’t want to be reading the input character by character anyways.

    Note too that when you do string comparisons (wcscmp), you should use actual wide strings (which are terminated with a nul wide char)– not use single characters in memory as input. If (when) you want to do character-to-character comparisons, you don’t even need to use the string functions. Since a WCHAR is just a value, you can compare directly: if (sample == L'ά') {}.

    • 0
    • Reply
    • Share
      Share
      • Share on Facebook
      • Share on Twitter
      • Share on LinkedIn
      • Share on WhatsApp
      • Report

Sidebar

Related Questions

I am making a small program using C++/CLI that use TcpListener , I know
I am in the process of making a website that involves a shopping cart.
I am in process of making a small game, where the main Gameplay Layer
Good afternoon. I have been making a small openGL based app for android that
I'm looking for a really really small linux distribution or process of making my
I've written a small 2D engine in opengl in the process of making a
I'm working on writing a function in Clojure that will process a file character
I'm in the process of making a small query thing for a set of
I am making a program that will have the ability to run the java
I am currently in the process of making an application that runs 24/7 and

Explore

  • Home
  • Add group
  • Groups page
  • Communities
  • Questions
    • New Questions
    • Trending Questions
    • Must read Questions
    • Hot Questions
  • Polls
  • Tags
  • Badges
  • Users
  • Help
  • SEARCH

Footer

© 2021 The Archive Base. All Rights Reserved
With Love by The Archive Base

Insert/edit link

Enter the destination URL

Or link to existing content

    No search term specified. Showing recent items. Search or use up and down arrow keys to select an item.