The Archive Base

Editorial Team
Asked: May 27, 2026 (2026-05-27T08:10:27+00:00)

I’m trying to get the content between a double-quote to count as one token

I’m trying to get the content between a double-quote to count as one token for an assignment.

For example:
“hello world” = 1 token
“hello” “world” = 3 tokens (because space counts as 1 token)

I created main.cpp and I added “scanQuotesAsString” code to 3 modules given:

  • scanner.cpp
  • scanner.h
  • scanpriv.h

Right now, “hello world” scans as 2 tokens, not skipping the space. If I add (or skipspace, then regular input such as |hello world| without quotes skips spaces as well.

I think my issue is in scanner.cpp, where the last couple of functions are:

/*
* Private method: scanToEndOfIdentifier
* Usage: finish = scanToEndOfIdentifier();
* ----------------------------------------
* This function advances the position of the scanner until it
* reaches the end of a sequence of letters or digits that make
* up an identifier. The return value is the index of the last
* character in the identifier; the value of the stored index
* cp is the first character after that.
*/
int Scanner::scanToEndOfIdentifier() {
    while (cp < len && isalnum(buffer[cp])) {
        if ((stringOption == ScanQuotesAsStrings) && (buffer[cp] == '"')) 
            break;
        cp++;
    }
    return cp - 1;
}


/* Private functions */
/*
* Private method: scanQuotedString
* Usage: scanQuotedString();
* -------------------
* This function advances the position of the scanner until the
* current character is a double quotation mark
*/
void Scanner::scanQuotedString() {
    while ((cp < len && (buffer[cp] == '"')) || (cp < len && (buffer[cp] == '"'))){
        cp++;
    }
}

Here is main.cc

#include "genlib.h"
#include "simpio.h"
#include "scanner.h"
#include <iostream>

/* Private function prototypes */

int CountTokens(string str);

int main() {
    cout << "Please enter a sentence: ";
    string str = GetLine();

    int num = CountTokens(str);
    cout << "You entered " << num << " tokens." << endl;
    return 0;
}

int CountTokens(string str) {

    int count = 0;
    Scanner scanner;        // create new scanner object            
    scanner.setInput(str);  // initialize the input to be scanned

    //scanner.setSpaceOption(Scanner::PreserveSpaces);
    scanner.setStringOption(Scanner::ScanQuotesAsStrings);

    while (scanner.hasMoreTokens()) { // read tokens from the scanner
        scanner.nextToken();
        count++;
    }
    return count;
}

Here’s scanner.cpp

/*
* File: scanner.cpp
* -----------------
* Implementation for the simplified Scanner class.
*/
#include "genlib.h"
#include "scanner.h"
#include <cctype>
#include <iostream>
/*
* The details of the representation are inaccessible to the client,
* but consist of the following fields:
*
* buffer -- String passed to setInput
* len -- Length of buffer, saved for efficiency
* cp -- Current character position in the buffer
* spaceOption -- Setting of the space option extension
*/
Scanner::Scanner() {
    buffer = "";
    spaceOption = PreserveSpaces;
}
Scanner::~Scanner() {
/* Empty */
}
void Scanner::setInput(string str) {
    buffer = str;
    len = buffer.length();
    cp = 0;
}
/*
* Implementation notes: nextToken
* -------------------------------
* The code for nextToken follows from the definition of a token.
*/
string Scanner::nextToken() {
    if (cp == -1) {
        Error("setInput has not been called");
    }
    if (stringOption == ScanQuotesAsStrings) scanQuotedString();
    if (spaceOption == IgnoreSpaces) skipSpaces();
    int start = cp;
    if (start >= len) return "";
    if (isalnum(buffer[cp])) {
        int finish = scanToEndOfIdentifier();
        return buffer.substr(start, finish - start + 1);
    }
    cp++;
    return buffer.substr(start, 1);
}

bool Scanner::hasMoreTokens() {
    if (cp == -1) {
        Error("setInput has not been called");
    }
    if (stringOption == ScanQuotesAsStrings) scanQuotedString();
    if (spaceOption == IgnoreSpaces) skipSpaces();
    return (cp < len);
}

void Scanner::setSpaceOption(spaceOptionT option) {
    spaceOption = option;
}

Scanner::spaceOptionT Scanner::getSpaceOption() {
    return spaceOption;
}

void Scanner::setStringOption(stringOptionT option) {
    stringOption = option;
}

Scanner::stringOptionT Scanner::getStringOption() {
    return stringOption;
}


/* Private functions */
/*
* Private method: skipSpaces
* Usage: skipSpaces();
* -------------------
* This function advances the position of the scanner until the
* current character is not a whitespace character.
*/
void Scanner::skipSpaces() {
    while (cp < len && isspace(buffer[cp])) {
        cp++;
    }
}

/*
* Private method: scanToEndOfIdentifier
* Usage: finish = scanToEndOfIdentifier();
* ----------------------------------------
* This function advances the position of the scanner until it
* reaches the end of a sequence of letters or digits that make
* up an identifier. The return value is the index of the last
* character in the identifier; the value of the stored index
* cp is the first character after that.
*/
int Scanner::scanToEndOfIdentifier() {
    while (cp < len && isalnum(buffer[cp])) {
        if ((stringOption == ScanQuotesAsStrings) && (buffer[cp] == '"'))
            break;
        cp++;
    }
    return cp - 1;
}


/* Private functions */
/*
* Private method: scanQuotedString
* Usage: scanQuotedString();
* -------------------
* This function advances the position of the scanner until the
* current character is a double quotation mark
*/
void Scanner::scanQuotedString() {
    while ((cp < len && (buffer[cp] == '"')) || (cp < len && (buffer[cp] == '"'))){
        cp++;
    }
}

scanner.h

/*
* File: scanner.h
* ---------------
* This file is the interface for a class that facilitates dividing
* a string into logical units called "tokens", which are either
*
* 1. Strings of consecutive letters and digits representing words
* 2. One-character strings representing punctuation or separators
*
* To use this class, you must first create an instance of a
* Scanner object by declaring
*
* Scanner scanner;
*
* You initialize the scanner's input stream by calling
*
* scanner.setInput(str);
*
* where str is the string from which tokens should be read.
* Once you have done so, you can then retrieve the next token
* by making the following call:
*
* token = scanner.nextToken();
*
* To determine whether any tokens remain to be read, you can call
* the predicate method scanner.hasMoreTokens(). The nextToken
* method returns the empty string after the last token is read.
*
* The following code fragment serves as an idiom for processing
* each token in the string inputString:
*
* Scanner scanner;
* scanner.setInput(inputString);
* while (scanner.hasMoreTokens()) {
* string token = scanner.nextToken();
* . . . process the token . . .
* }
*
* This version of the Scanner class includes an option for skipping
* whitespace characters, which is described in the comments for the
* setSpaceOption method.
*/
#ifndef _scanner_h
#define _scanner_h
#include "genlib.h"
/*
* Class: Scanner
* --------------
* This class is used to represent a single instance of a scanner.
*/
class Scanner {
public:
/*
* Constructor: Scanner
* Usage: Scanner scanner;
* -----------------------
* The constructor initializes a new scanner object. The scanner
* starts empty, with no input to scan.
*/
    Scanner();
/*
* Destructor: ~Scanner
* Usage: usually implicit
* -----------------------
* The destructor deallocates any memory associated with this scanner.
*/
    ~Scanner();
/*
* Method: setInput
* Usage: scanner.setInput(str);
* -----------------------------
* This method configures this scanner to start extracting
* tokens from the input string str. Any previous input string is
* discarded.
*/
    void setInput(string str);
/*
* Method: nextToken
* Usage: token = scanner.nextToken();
* -----------------------------------
* This method returns the next token from this scanner. If
* nextToken is called when no tokens are available, it returns the
* empty string.
*/
    string nextToken();
/*
* Method: hasMoreTokens
* Usage: if (scanner.hasMoreTokens()) . . .
* ------------------------------------------
* This method returns true as long as there are additional
* tokens for this scanner to read.
*/
    bool hasMoreTokens();
/*
* Methods: setSpaceOption, getSpaceOption
* Usage: scanner.setSpaceOption(option);
* option = scanner.getSpaceOption();
* ------------------------------------------
* This method controls whether this scanner
* ignores whitespace characters or treats them as valid tokens.
* By default, the nextToken function treats whitespace characters,
* such as spaces and tabs, just like any other punctuation mark.
* If, however, you call
*
* scanner.setSpaceOption(Scanner::IgnoreSpaces);
*
* the scanner will skip over any white space before reading a
* token. You can restore the original behavior by calling
*
* scanner.setSpaceOption(Scanner::PreserveSpaces);
*
* The getSpaceOption function returns the current setting
* of this option.
*/
    enum spaceOptionT { PreserveSpaces, IgnoreSpaces };
    void setSpaceOption(spaceOptionT option);
    spaceOptionT getSpaceOption();

/*
 * Methods: setStringOption, getStringOption
 * Usage: scanner.setStringOption(option);
 *        option = scanner.getStringOption();
 * --------------------------------------------------
 * This method controls how the scanner reads double quotation marks 
 * as input.  The default is set to treat quotes just like any other 
 * punctuation character: 
 *    scanner.setStringOption(Scanner::ScanQuotesAsPunctuation);
 * 
 * Otherwise, the option:
 *    scanner.setStringOption(Scanner::ScanQuotesAsStrings);
 *
 * the token starting with a quotation mark will be scanned until
 * another quotation mark is found (closing quotation). Therefore
 * the entire string within the quotation, including both quotation
 * marks counts as 1 token.
 */
    enum stringOptionT { ScanQuotesAsPunctuation, ScanQuotesAsStrings };

    void setStringOption(stringOptionT option);
    stringOptionT getStringOption();


private:

#include "scanpriv.h"
};
#endif

And finally, scanpriv.h

/*
* File: scanpriv.h
* ----------------
* This file contains the private data for the simplified version
* of the Scanner class.
*/

/* Instance variables */
string buffer; /* The string containing the tokens */
int len; /* The buffer length, for efficiency */
int cp; /* The current index in the buffer */
spaceOptionT spaceOption; /* Setting of the space option */
stringOptionT stringOption;

/* Private method prototypes */
void skipSpaces();
int scanToEndOfIdentifier();
void scanQuotedString();
1 Answer

  1. Editorial Team
    Added an answer on May 27, 2026 at 8:10 am (2026-05-27T08:10:28+00:00)

    Too long to read.

    Two ways of parsing quoted text:

    0) State

    A simple flag that records whether you are currently inside quotes and, when set, activates the special quote handling. This is basically equivalent to 1), just inlined.
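
As a rough illustration of the state approach, here is a minimal sketch that only counts tokens. The function name countTokensWithState is made up for this example, and the token rules are the ones described in the question: a run of letters or digits is one token, every other character (spaces included) is one token, and a quoted string counts once. Treating an unterminated quote as a single token is an assumption of the sketch, not something the assignment states.

```cpp
#include <cctype>
#include <string>

// Counts tokens in a single pass using an "inside quotes" flag.
// Hypothetical helper; token rules are taken from the question:
//   - a run of letters/digits is one token
//   - any other character, including a space, is one token
//   - everything from an opening '"' to its closing '"' is one token
int countTokensWithState(const std::string& text) {
    int count = 0;
    bool inQuotes = false;  // the "simple switch"
    bool inWord = false;    // true while inside a run of letters/digits

    for (char ch : text) {
        if (inQuotes) {
            // Swallow everything up to and including the closing quote.
            if (ch == '"') inQuotes = false;
            continue;
        }
        if (ch == '"') {
            inQuotes = true;
            inWord = false;
            ++count;                // the whole quoted string counts once
        } else if (std::isalnum(static_cast<unsigned char>(ch))) {
            if (!inWord) ++count;   // first character of a new word
            inWord = true;
        } else {
            inWord = false;
            ++count;                // punctuation or whitespace: one token each
        }
    }
    return count;
}
```

The flag keeps everything in one loop, at the cost of mixing the quote rule into the handling of every other character.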

    1) Sub-Rule in Recursive Descent Scanner

    Put the state away and write a separate rule for scanning quoted text. The code is actually quite simple (C++-inspired pseudocode):

    // assume we are one past the opening quotation mark
    for (c : text) {
        if (is_escape (*c)) {  // to support stuff like "foo's name is \"bar\""
            p = peek (c);
            if (!is_valid_escape_character (*p)) error;
            else {
                make the peeked character (*p) part of the result;
                ++c;
            }
        }
        else if (is_quotation_mark (*c))
        {
            return the result; // we reached the end of the string
        }
        else if (!is_valid_character (*c))
        {
            error; // maybe you want to forbid literal control characters
        }
        else
        {
            make *c part of the result
        }
    }
    error; // reached end of input before closing quotation mark
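
For reference, the pseudocode above could be turned into real C++ roughly as follows. This is a sketch under assumptions: the name scanEscapedQuotedBody is invented here, a backslash is taken as the escape character, only an escaped quote and an escaped backslash are accepted, control characters count as invalid, and errors are reported by throwing std::runtime_error.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <stdexcept>
#include <string>

// Reads the body of a quoted string with escape support.
// Precondition: pos indexes the first character after the opening quote.
// On success, pos is left just past the closing quote.
// Assumptions of this sketch: backslash is the escape character, only an
// escaped quote and an escaped backslash are valid, and control characters
// are rejected.
std::string scanEscapedQuotedBody(const std::string& text, std::size_t& pos) {
    assert(pos <= text.size());
    std::string result;
    while (pos < text.size()) {
        const char ch = text[pos++];
        if (ch == '\\') {
            if (pos == text.size()) break;      // nothing left to peek at
            const char next = text[pos++];      // consume the escaped character
            if (next != '"' && next != '\\') {
                throw std::runtime_error("invalid escape sequence");
            }
            result += next;
        } else if (ch == '"') {
            return result;                      // closing quote reached
        } else if (std::iscntrl(static_cast<unsigned char>(ch))) {
            throw std::runtime_error("control character in quoted string");
        } else {
            result += ch;
        }
    }
    throw std::runtime_error("reached end of input before closing quotation mark");
}
```

A caller would step past the opening quote first, then pass the index of the next character as pos.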
    

    If you do not want to support escape characters, the code gets simpler:

    // assume we are one past the opening quotation mark
    for (c : text) {
        if (is_quotation_mark (*c))
            return the result;
        else if (!is_valid_character (*c))
            error;
        else
            make *c part of the result
    }
    error; // reached end of input before closing quotation mark
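
The escape-free variant might look like this in C++. Again a sketch: scanPlainQuotedBody is a made-up name, "invalid character" is interpreted as a control character, and failure is signalled with a false return value instead of an exception so the caller can decide how to report it.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Reads the body of a quoted string without escape support.
// Precondition: pos indexes the first character after the opening quote.
// Returns true and stores the body in out, leaving pos just past the closing
// quote. Returns false (pos and out untouched) on a control character or a
// missing closing quote. Hypothetical helper for illustration only.
bool scanPlainQuotedBody(const std::string& text, std::size_t& pos, std::string& out) {
    assert(pos <= text.size());
    std::string result;
    for (std::size_t i = pos; i < text.size(); ++i) {
        const char ch = text[i];
        if (ch == '"') {
            pos = i + 1;        // step past the closing quote
            out = result;
            return true;
        }
        if (std::iscntrl(static_cast<unsigned char>(ch))) {
            return false;       // invalid character inside the quotes
        }
        result += ch;
    }
    return false;               // reached end of input before closing quote
}
```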
    

    You should not omit the check for invalid characters, as doing so would invite users to exploit your code and possibly trigger undefined behavior in your program.
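
To tie the sub-rule back to the scanner in the question: as posted, scanQuotedString only advances while the current character is a quotation mark, so it steps over the quotes themselves instead of consuming the text between them. Below is a minimal sketch of how a quote-aware nextToken could look. MiniScanner is a hypothetical stand-in for the question's Scanner class, trimmed so it compiles without genlib.h; it always preserves spaces and, as scanner.h describes, returns the quoted string including both quotation marks as one token. Throwing on a control character or an unterminated quote is an assumption of this sketch.

```cpp
#include <cctype>
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for the question's Scanner class.
class MiniScanner {
public:
    explicit MiniScanner(const std::string& input) : buffer(input), cp(0) {}

    // Pure query: does not move cp, unlike the hasMoreTokens in the question.
    bool hasMoreTokens() const { return cp < buffer.size(); }

    std::string nextToken() {
        if (cp >= buffer.size()) return "";
        const std::size_t start = cp;
        const char ch = buffer[cp];
        if (ch == '"') {
            ++cp;  // step past the opening quote
            while (cp < buffer.size() && buffer[cp] != '"') {
                if (std::iscntrl(static_cast<unsigned char>(buffer[cp]))) {
                    throw std::runtime_error("control character in quoted string");
                }
                ++cp;
            }
            if (cp == buffer.size()) {
                throw std::runtime_error("unterminated quoted string");
            }
            ++cp;  // step past the closing quote
        } else if (std::isalnum(static_cast<unsigned char>(ch))) {
            while (cp < buffer.size() &&
                   std::isalnum(static_cast<unsigned char>(buffer[cp]))) {
                ++cp;
            }
        } else {
            ++cp;  // punctuation or whitespace: single-character token
        }
        return buffer.substr(start, cp - start);
    }

private:
    std::string buffer;
    std::size_t cp;
};

// Counts tokens the same way CountTokens in the question does.
int countTokens(const std::string& input) {
    MiniScanner scanner(input);
    int count = 0;
    while (scanner.hasMoreTokens()) {
        scanner.nextToken();
        ++count;
    }
    return count;
}
```

The key design point is that the quote rule lives inside nextToken, where a token is actually consumed, and hasMoreTokens stays a pure query.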
