Sign Up

Sign Up to our social questions and Answers Engine to ask questions, answer people’s questions, and connect with other people.

Have an account? Sign In

Have an account? Sign In Now

Sign In

Login to our social questions & Answers Engine to ask questions answer people’s questions & connect with other people.

Sign Up Here

Forgot Password?

Don't have account, Sign Up Here

Forgot Password

Lost your password? Please enter your email address. You will receive a link and will create a new password via email.

Have an account? Sign In Now

You must login to ask a question.

Forgot Password?

Need An Account, Sign Up Here

Please briefly explain why you feel this question should be reported.

Please briefly explain why you feel this answer should be reported.

Please briefly explain why you feel this user should be reported.

Sign InSign Up

The Archive Base

The Archive Base Logo The Archive Base Logo

The Archive Base Navigation

  • Home
  • SEARCH
  • About Us
  • Blog
  • Contact Us
Search
Ask A Question

Mobile menu

Close
Ask a Question
  • Home
  • Add group
  • Groups page
  • Feed
  • User Profile
  • Communities
  • Questions
    • New Questions
    • Trending Questions
    • Must read Questions
    • Hot Questions
  • Polls
  • Tags
  • Badges
  • Buy Points
  • Users
  • Help
  • Buy Theme
  • SEARCH
Home/ Questions/Q 1104913
In Process

The Archive Base Latest Questions

Editorial Team
  • 0
Editorial Team
Asked: May 17, 20262026-05-17T01:33:01+00:00 2026-05-17T01:33:01+00:00

Yesterday I discovered an odd bug in rather simple code that basically gets text

  • 0

Yesterday I discovered an odd bug in rather simple code that basically gets text from an ifstream and tokenizes it. The code that actually fails does a number of get()/peek() calls looking for the token “/*”. If the token is found in the stream, unget() is called so the next method sees the stream starting with the token.

Sometimes, seemingly depending only on the length of the file, the unget() call fails. Internally it calls pbackfail() which then returns EOF. However after clearing the stream state, I can happily read more characters so it’s not exactly EOF..

After digging in, here’s the full code that easily reproduces the problem:

#include <iostream>
#include <fstream>
#include <string>

  //generate simplest string possible that triggers problem
void GenerateTestString( std::string& s, const size_t nSpacesToInsert )
{
  s.clear();
  for( size_t i = 0 ; i < nSpacesToInsert ; ++i )
    s += " ";
  s += "/*";
}

  //write string to file, then open same file again in ifs
bool WriteTestFileThenOpenIt( const char* sFile, const std::string& s, std::ifstream& ifs )
{
  {
    std::ofstream ofs( sFile );
    if( ( ofs << s ).fail() )
      return false;
  }
  ifs.open( sFile );
  return ifs.good();
}

  //find token, unget if found, report error, show extra data can be read even after error 
bool Run( std::istream& ifs )
{
  bool bSuccess = true;

  for( ; ; )
  {
    int x = ifs.get();
    if( ifs.fail() )
      break;
    if( x == '/' )
    {
      x = ifs.peek();
      if( x == '*' )
      {
        ifs.unget();
        if( ifs.fail() )
        {
          std::cout << "oops.. unget() failed" << std::endl;
          bSuccess = false;
        }
        else
        {
          x = ifs.get();
        }
      }
    }
  }

  if( !bSuccess )
  {
    ifs.clear();
    std::string sNext;
    ifs >> sNext;
    if( !sNext.empty() )
      std::cout << "remaining data after unget: '" << sNext << "'" << std::endl;
  }

  return bSuccess;
}

int main()
{
  std::string s;
  const char* testFile = "tmp.txt";
  for( size_t i = 0 ; i < 12290 ; ++i )
  {
    GenerateTestString( s, i );

    std::ifstream ifs;
    if( !WriteTestFileThenOpenIt( testFile, s, ifs ) )
    {
      std::cout << "file I/O error, aborting..";
      break;
    }

    if( !Run( ifs ) )
      std::cout << "** failed for string length = " << s.length() << std::endl;
  }
  return 0;
}

The program fails when the string length gets near the typical multiple=of-2 buffersizes 4096, 8192, 12288, here’s the output:

oops.. unget() failed
remaining data after unget: '*'
** failed for string length = 4097
oops.. unget() failed
remaining data after unget: '*'
** failed for string length = 8193
oops.. unget() failed
remaining data after unget: '*'
** failed for string length = 12289

This happens when tested on on Windows XP and 7, both compiled in debug/release mode, both dynamic/static runtime, both 32bit and 64bit systems/compiles, all with VS2008, default compiler/linker options.
No problem found when testing with gcc4.4.5 on a 64bit Debian system.

Questions:

  1. can other people please test this? I would really appreciate some active collaboration form SO.
  2. is there anything that is not correct in the code that could cause the problem (not talking about whether it makes sense)
  3. or any compiler flags that might trigger this behaviour?
  4. all parser code is rather critical for the application and is tested heavily, but off course this problem was not found in the test code. Should I come up with extreme test cases, and if so, how do I do that? How could I ever predict this could cause a problem?
  5. if this really is a bug, where should do I best report it?
  • 1 1 Answer
  • 0 Views
  • 0 Followers
  • 0
Share
  • Facebook
  • Report

Leave an answer
Cancel reply

You must login to add an answer.

Forgot Password?

Need An Account, Sign Up Here

1 Answer

  • Voted
  • Oldest
  • Recent
  • Random
  1. Editorial Team
    Editorial Team
    2026-05-17T01:33:02+00:00Added an answer on May 17, 2026 at 1:33 am

    is there anything that is not correct in the code that could cause the problem (not talking about whether it makes sense)

    Yes. Standard streams are required to have at least 1 unget() position. So you can safely do only one unget() after a call to get(). When you call peek() and the input buffer is empty, underflow() occurs and the implementation clears the buffer and loads a new portion of data. Note that peek() doesn’t increase current input location, so it points to the beginning of the buffer. When you try to unget() the implementation tries to decrease current input position, but it’s already at the beginning of the buffer so it fails.

    Of course this depends on the implementation. If the stream buffer holds more than one character then it may sometimes fail and sometimes not. As far as I know microsoft’s implementation stores only one character in basic_filebuf (unless you specify a greater buffer explicitly) and relies on <cstdio> internal buffering (btw, that’s one reason why MVS iostreams are slow). Quality implementation may load the buffer again from the file when unget() fails. But it isn’t required to do so.

    Try to fix your code so you don’t need more than one unget() position. If you really need it then wrap the stream with a stream that guarantees that unget() won’t fail (look at Boost.Iostreams). Also the code you posted is nonsense. It tries to unget() and then get() again. Why?

    • 0
    • Reply
    • Share
      Share
      • Share on Facebook
      • Share on Twitter
      • Share on LinkedIn
      • Share on WhatsApp
      • Report

Sidebar

Related Questions

I asked a similar question yesterday but I included some code that basically took
I am new to LINQ and discovered yesterday that you can have multiple where
Yesterday I've discovered that Bash provides a means to mark variables as read-only, using
I discovered yesterday that I can't have a Class that uses a field named
Yesterday I updated to firefox 13 and I notice that this property is not
Yesterday, I asked this question regarding best practices for a simple information retrieval system
Yesterday, after about two years we re-deployed our application. Code hasn't changed, but the
Yesterday, my supervisor told me that tmp-file-based session should be THE answer to session
Yesterday I notice that some thing kept writing with 1.48M/s bursting speed every 30s
Yesterday i tried to use std::unordered_map and this code confused me how much memory

Explore

  • Home
  • Add group
  • Groups page
  • Communities
  • Questions
    • New Questions
    • Trending Questions
    • Must read Questions
    • Hot Questions
  • Polls
  • Tags
  • Badges
  • Users
  • Help
  • SEARCH

Footer

© 2021 The Archive Base. All Rights Reserved
With Love by The Archive Base

Insert/edit link

Enter the destination URL

Or link to existing content

    No search term specified. Showing recent items. Search or use up and down arrow keys to select an item.