The Archive Base Latest Questions

Editorial Team
Asked: May 22, 2026


Based on this question which was closed rather quickly:
Trying to create a program to read a users input then break the array into seperate words are my pointers all valid?

Rather than closing it, I think some extra work could have gone into helping the OP clarify the question.

The Question:

I want to tokenize user input and store the tokens into an array of words.
I want to use punctuation (.,-) as delimiters, so that it is removed from the token stream.

In C I would use strtok() to break an array into tokens and then manually build an array.
Like this:

The main Function:

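#include <cstdio>   // getchar, printf, EOF
#include <cstdlib>  // malloc, realloc, free
#include <cstring>  // strtok
#include <iostream>

using namespace std;

char **findwords(char *str);

int main()
{
    int     test;
    char    words[100]; //an array of chars to hold the string given by the user
    char    **word;  //pointer to a list of words
    int     index = 0; //index of the current character, later of the word we are printing
    int     c; //int rather than char so that EOF can be detected

    cout << "die monster !";
    //a loop to place the characters that the user put in into the array
    //(stops at the newline, at end of input, or when the array is full)

    while (index < 99 && (c = getchar()) != EOF && c != '\n')
    {
        words[index] = (char)c;
        ++index;
    }
    words[index] = '\0'; //strtok() needs a null terminated string

    word = findwords(words);

    index = 0; //start again at the first word
    while (word[index] != 0) //loop through the list of words until the end of the list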
    {
        printf("%s\n", word[index]); // while the words are going through the list print them out
        index ++; //move on to the next word
    }

    //free it from the list since it was dynamically allocated
    free(word);
    cin >> test;

    return 0;
}

The line tokenizer:

char **findwords(char *str)
{
    int     size = 20; //original size of the list 
    char    *newword; //pointer to the new word from strtok
    int     index = 0; //our current location in words
    char    **words = (char **)malloc(sizeof(char *) * (size +1)); //this is the actual list of words

    /* Get the initial word, and pass in the original string we want strtok() *
     *   to work on. Here, we are separating words based on spaces, commas,   *
     *   periods, and dashes. IE, if they are found, a new word is created.   */

    newword = strtok(str, " ,.-");

    while (newword != 0) //create a loop that goes through the string until it gets to the end
    {
        if (index == size)
        {
            //if the string is larger than the array increase the maximum size of the array
            size += 10;
            //resize the existing array, keeping the words stored so far
            words = (char **)realloc(words, sizeof(char *) * (size +1));
        }
        //assign words to its proper value
        words[index] = newword;
        //get the next word in the string
        newword = strtok(0, " ,.-");
        //increment the index to get to the next word
        ++index;
    }
    words[index] = 0;

    return words;
}

Any comments on the above code would be appreciated.
But, additionally, what is the best technique for achieving this goal in C++?


1 Answer

  1. Editorial Team
    Added an answer on May 22, 2026 at 9:38 pm

    How to tokenize a stream in C++ is already covered by a lot of questions.
    Example: How to read a file and get words in C++

    But what is harder to find is how to get the same functionality as strtok():

    Basically, strtok() lets you split a string on any set of user-defined characters, while the C++ stream extraction operators only treat white space as a separator. Fortunately, what counts as white space is defined by the stream's locale, so we can modify the locale to treat other characters as white space, which lets us tokenize the stream in a more natural fashion.
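For contrast, here is a small sketch of what the default locale does with the test string used later in this answer. The helper name `defaultTokens` is made up for illustration; it only wraps a plain `operator>>` loop.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: reads tokens with operator>> under the stream's
// default locale, where only white space separates words.
std::vector<std::string> defaultTokens(const std::string& text)
{
    std::istringstream in(text);
    std::vector<std::string> tokens;
    std::string token;
    while (in >> token)
    {
        tokens.push_back(token);
    }
    return tokens;
}
```

Calling `defaultTokens("This, stri,plop")` gives two tokens, `This,` and `stri,plop`: the commas stay attached because they are not classified as white space. The facet that follows changes that classification.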

    #include <algorithm>
    #include <iterator>
    #include <locale>
    #include <string>
    #include <sstream>
    #include <iostream>
    #include <vector>
    
    // This is my facet that will treat the ,.- as space characters and thus ignore them.
    class WordSplitterFacet: public std::ctype<char>
    {
        public:
            typedef std::ctype<char>    base;
            typedef base::char_type     char_type;
    
            WordSplitterFacet(std::locale const& l)
                : base(table)
            {
                std::ctype<char> const&  defaultCType  = std::use_facet<std::ctype<char> >(l);
    
                // Copy the default value from the provided locale
                static  char data[256];
                for(int loop = 0;loop < 256;++loop) { data[loop] = loop;}
                defaultCType.is(data, data+256, table);
    
                // Modifications to default to include extra space types.
                table[',']  |= base::space;
                table['.']  |= base::space;
                table['-']  |= base::space;
            }
        private:
            base::mask  table[256];
    };
    

    We can then use this facet in a locale like this:

        std::ctype<char>*   wordSplitter(new WordSplitterFacet(std::locale()));
    
        <stream>.imbue(std::locale(std::locale(), wordSplitter));
    

    The next part of your question is how to store these words in an array. Well, in C++ you would not. You would delegate this functionality to std::vector and std::string. If you read your code, you will see that it is doing two major things in the same place:

    • It is managing memory.
    • It is tokenizing the data.

    There is a basic principle, Separation of Concerns, which says that a piece of code should only try to do one of two things: either resource management (memory management in this case) or business logic (tokenization of the data). By separating these into different parts of the code, you make the code easier to use and easier to write. Fortunately, in this example all the resource management is already done by std::vector and std::string, which lets us concentrate on the business logic.
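As a rough illustration of that separation, here is a sketch of the question's findwords() with the memory management handed over to std::vector and std::string. It is not the stream-based approach this answer builds up to: it swaps in `find_first_of` / `find_first_not_of` as a direct stand-in for strtok(), and the name `findWords` and its default delimiter set are assumptions taken from the question's code.

```cpp
#include <string>
#include <vector>

// Hypothetical rewrite of findwords(): only the tokenizing logic is left,
// because std::vector and std::string own all the memory.
std::vector<std::string> findWords(const std::string& text,
                                   const std::string& delimiters = " ,.-")
{
    std::vector<std::string> words;

    // Skip any leading delimiters to find the start of the first word.
    std::string::size_type start = text.find_first_not_of(delimiters);
    while (start != std::string::npos)
    {
        // The word ends at the next delimiter (or at the end of the text).
        std::string::size_type end = text.find_first_of(delimiters, start);
        words.push_back(text.substr(start, end - start));

        // Skip the run of delimiters that follows the word.
        start = text.find_first_not_of(delimiters, end);
    }
    return words;
}
```

There is no malloc(), no resizing and no free() for the caller to remember, and unlike strtok() the input string is left unmodified.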

    As has been shown many times, the easy way to tokenize a stream is to use operator >> with a std::string, which breaks the stream into words. You can then use iterators to loop across the stream, tokenizing it as you go.

    std::vector<std::string>  data;
    for(std::istream_iterator<std::string> loop(<stream>); loop != std::istream_iterator<std::string>(); ++loop)
    {
        // In here loop is an iterator that has tokenized the stream using
        // operator >> (which for std::string reads one space separated word).
    
        data.push_back(*loop);
    }
    

    We can combine this with a standard algorithm to simplify the code:

    std::copy(std::istream_iterator<std::string>(<stream>), std::istream_iterator<std::string>(), std::back_inserter(data));
    

    Now, combining all of the above into a single application:

    int main()
    {
        // Create the facet.
        std::ctype<char>*   wordSplitter(new WordSplitterFacet(std::locale()));
    
        // Here I am using a string stream.
        // But any stream can be used. Note you must imbue a stream before it is used.
        // Otherwise the imbue() will silently fail.
        std::stringstream   teststr;
        teststr.imbue(std::locale(std::locale(), wordSplitter));
    
        // Now that it is imbued we can use it.
        // If this was a file stream then you could open it here.
        teststr << "This, stri,plop";
    
        std::cout << "die monster !";
        std::vector<std::string>    data;
        std::copy(std::istream_iterator<std::string>(teststr), std::istream_iterator<std::string>(), std::back_inserter(data));
    
        // Copy the array to cout one word per line
        std::copy(data.begin(), data.end(), std::ostream_iterator<std::string>(std::cout, "\n"));
    }
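The same idea can also be packaged as a reusable function. The following is only a sketch under a few assumptions: the facet here is a variant that copies the classic "C" table through `classic_table()` rather than querying an existing locale, and the names `PunctuationAsSpace` and `splitWords` are invented for this example.

```cpp
#include <iterator>
#include <locale>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical variant of the facet: starts from the classic "C" table
// and additionally marks ',', '.' and '-' as white space.
class PunctuationAsSpace : public std::ctype<char>
{
public:
    PunctuationAsSpace() : std::ctype<char>(makeTable()) {}

private:
    static const mask* makeTable()
    {
        // Copied once; the facet only keeps a pointer to this table.
        static std::vector<mask> table(classic_table(),
                                       classic_table() + table_size);
        table[','] |= space;
        table['.'] |= space;
        table['-'] |= space;
        return &table[0];
    }
};

// Hypothetical helper: splits text on white space and on ",.-".
std::vector<std::string> splitWords(const std::string& text)
{
    std::istringstream in;
    // The locale takes ownership of the facet and deletes it when done.
    in.imbue(std::locale(in.getloc(), new PunctuationAsSpace));
    in.str(text);

    std::istream_iterator<std::string> first(in);
    std::istream_iterator<std::string> last;
    return std::vector<std::string>(first, last);
}
```

With this, `splitWords("This, stri,plop")` yields the three words `This`, `stri` and `plop`, and the caller never sees the stream or the locale.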
    