Editorial Team
Asked: May 24, 2026

I am building a natural language processor in C#, and many ‘words’ in our database are actually multiple-word phrases that refer to one noun or action. Please, no discussion of this design decision; suffice it to say it cannot be changed at this time. I have string arrays of related words (chunks) of the sentence that I need to test for these phrases and words. What is an appropriately idiomatic way to handle sub-array extraction so that I run the least risk of overflow errors and the like?

To give an example of the desired logic, let me step through a run with a sample chunk. For our purposes, assume that the only multiple-word phrase from the database is ‘quick brown’.

Full phrase: The quick brown fox -> encoded as {"The", "quick", "brown", "fox"}
First iteration: Test "The quick brown fox" -> returns nothing
Second iteration: Test "The quick brown" -> returns nothing
Third iteration: Test "The quick" -> returns nothing
Fourth iteration: Test "The" -> returns value
Fifth iteration: Test "quick brown fox" -> returns nothing
Sixth iteration: Test "quick brown" -> returns value
Seventh iteration: Test "fox" -> returns value

Sum all returned values and return.

I have some ideas of how to go about this, but the more I look at it the more I worry about array indexing errors and other such horrors plaguing my code. The phrase comes in as a string array, but I’m fine with converting it to an IEnumerable. My only concern there is that IEnumerable has no indexer.

1 Answer

  1. Editorial Team
    Added an answer on May 24, 2026 at 9:06 pm

    The way forward turned out to be combining Mark’s and Philipp’s answers. Ideally I would have edited one of their posts to include the result, but my edits appear to have been rejected.

    Anyway, I took the DelimitedArray that Mark linked and changed a few things in it:

    Constructor changed to:

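        public DelimitedArray(T[] array, int offset, int count, bool throwErrors = false)
        {
            this.array = array;             // backing array; not copied
            this.offset = offset;           // index in the backing array where the segment starts
            this.count = count;             // number of elements in the segment
            this.throwErrors = throwErrors; // when false, out-of-range reads return default(T)
        }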
    

    Index reference changed to:

        public T this[int index]
        {
            get
            {
                // Bounds are checked against the segment (0..count-1), not the backing array.
                if (index < 0 || index >= this.count)
                {
                    if (this.throwErrors)
                        throw new IndexOutOfRangeException("Index '" + index + "' was outside the bounds of the segment.");
                    return default(T);
                }
                return this.array[this.offset + index];
            }
        }
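
    For reference, here is a self-contained sketch of such a view, assuming the bounds check is meant to be relative to the segment rather than to the backing array. `ArrayWindow<T>` is a hypothetical name used only for illustration; it is not the DelimitedArray that Mark linked, and the constructor validation is my own assumption.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical read-only view over part of an array (illustration only).
public sealed class ArrayWindow<T> : IEnumerable<T>
{
    private readonly T[] array;
    private readonly int offset;
    private readonly int count;
    private readonly bool throwErrors;

    public ArrayWindow(T[] array, int offset, int count, bool throwErrors = false)
    {
        if (array == null) throw new ArgumentNullException(nameof(array));
        // Assumption: a window that does not fit inside the array is rejected up front.
        if (offset < 0 || count < 0 || offset > array.Length - count)
            throw new ArgumentOutOfRangeException(nameof(offset), "offset and count must lie within the array.");

        this.array = array;
        this.offset = offset;
        this.count = count;
        this.throwErrors = throwErrors;
    }

    public int Count { get { return count; } }

    public T this[int index]
    {
        get
        {
            // index is relative to the window, so valid values are 0..count-1.
            if (index < 0 || index >= count)
            {
                if (throwErrors)
                    throw new IndexOutOfRangeException("Index '" + index + "' was outside the bounds of the window.");
                return default(T);
            }
            return array[offset + index];
        }
    }

    public IEnumerator<T> GetEnumerator()
    {
        for (var i = 0; i < count; i++)
            yield return array[offset + i];
    }

    IEnumerator IEnumerable.GetEnumerator() { return GetEnumerator(); }
}
```

    With a window over {"The", "quick", "brown", "fox"} at offset 1 and count 2, index 0 yields "quick", index 1 yields "brown", and index 2 yields null unless throwErrors is set.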
    

    I then worked that into Philipp’s loop. This becomes:

            for (var start = 0; start < words.Length; start++) // every word gets a turn as the first word
            {
                // Try the longest candidate first, then shrink; 'end' is exclusive.
                for (var end = words.Length; end > start; end--)
                {
                    var segment = new DelimitedArray<string>(words, start, end - start);
                    // Assumes DelimitedArray<T> implements IEnumerable<T>; string.Join needs the sequence, not its enumerator.
                    lemma = string.Join(" ", segment); // get the word/phrase to test
                    result = this.DoTheTest(lemma);
    
                    if (result > 0)
                    {
                        // Add the new result
                        ret = ret + result;
    
                        // Skip the words just consumed, mindful of the +1 that will happen at the end of the outer loop
                        start = start + segment.Count - 1;
                        // We're done with this starting position.
                        break;
                    }
                }
            }
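
    If the wrapper class is not needed, the same scan can probably be written with the `string.Join(separator, value, startIndex, count)` overload, which joins a sub-range of a string array directly. The following is a minimal sketch, assuming the longest-first order described in the question; `PhraseScanner`, `Score`, and the dictionary-based lexicon are hypothetical stand-ins for the real database test.

```csharp
using System;
using System.Collections.Generic;

public static class PhraseScanner
{
    // Hypothetical stand-in for the database lookup: 0 means "returns nothing".
    public static int Score(IReadOnlyDictionary<string, int> lexicon, string phrase)
    {
        int value;
        return lexicon.TryGetValue(phrase, out value) ? value : 0;
    }

    public static int SumMatches(string[] words, IReadOnlyDictionary<string, int> lexicon)
    {
        var total = 0;
        var start = 0;
        while (start < words.Length)
        {
            var consumed = 1; // always advance, even when nothing matches
            // Longest candidate first: from all remaining words down to a single word.
            for (var count = words.Length - start; count >= 1; count--)
            {
                // Joins words[start .. start + count - 1] without building a sub-array.
                var phrase = string.Join(" ", words, start, count);
                var value = Score(lexicon, phrase);
                if (value > 0)
                {
                    total += value;
                    consumed = count; // skip every word that was part of the match
                    break;
                }
            }
            start += consumed;
        }
        return total;
    }
}
```

    With a lexicon of "The" = 1, "quick brown" = 10 and "fox" = 100, scanning {"The", "quick", "brown", "fox"} tests the same seven candidates as the walkthrough in the question and returns 111. Because start and count are derived from words.Length in one place, there is only one spot where an off-by-one error could creep in.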
    

    If I could accept more than one answer I would accept both of theirs, but since each is incomplete on its own I will accept my own answer when I am able to do so tomorrow.
