The Archive Base


Editorial Team
Asked: May 30, 2026

I use a Lucene Snowball analyzer to perform stemming. The results are not meaningful words. I referred to this question.

One solution is to use a database that maps the stemmed version of a word to one stable version of that word. (For example, from communiti to community, no matter which word communiti was derived from: communities or some other word.)

I want to know if there is a database which performs such a function.


1 Answer

  1. Editorial Team
    Added an answer on May 30, 2026 at 10:45 pm

    It is theoretically impossible to recover a specific word from a stem, since one stem can be shared by many words. One possibility, depending on your application, would be to build a database in which each stem maps to an array of the words that reduce to it. You would then still need to predict which of those words is appropriate when converting a given stem back.
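
    A minimal in-memory sketch of such a stem-to-words map could look like the following. The class name `StemIndex` and its methods are hypothetical, and the stems are assumed to be supplied by whatever stemmer you already run (for example the Snowball analyzer); nothing here calls Lucene.

    ```java
    import java.util.*;

    // Hypothetical helper: maps each stem to every surface word that produced it.
    class StemIndex {
        private final Map<String, SortedSet<String>> wordsByStem = new HashMap<>();

        // Record that `word` was reduced to `stem` by the caller's stemmer.
        void add(String stem, String word) {
            wordsByStem.computeIfAbsent(stem, k -> new TreeSet<>()).add(word);
        }

        // All known words for a stem; empty if the stem was never seen.
        SortedSet<String> candidates(String stem) {
            return wordsByStem.getOrDefault(stem, Collections.emptySortedSet());
        }
    }
    ```

    With the example from the question, adding both "communities" and "community" under the stem "communiti" makes `candidates("communiti")` return both words. Choosing between them is the part this structure does not solve.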

    As a very naive solution to this problem, if you know the part-of-speech tags of the words, you could try storing the words together with their tags in your database:

    run:
       NN:  runner
       VBG: running
       VBZ: runs
    

    Then, given the stem “run” and the tag “NN”, you could determine that “runner” is the most probable word in that context. Of course, that solution is far from perfect. Notably, you’d need to handle the fact that the same word form might be tagged differently in different contexts. But remember that any attempt to solve this problem will be, at best, an approximation.
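
    One way to sketch that lookup is to count how often each word was observed for a (stem, tag) pair and return the most frequent one. The class `TaggedStemIndex` and its method names are hypothetical, and the counting heuristic is only one assumption about what "most probable" could mean here.

    ```java
    import java.util.*;

    // Hypothetical helper: counts how often each word was seen for a (stem, tag) pair.
    class TaggedStemIndex {
        private final Map<String, Map<String, Integer>> counts = new HashMap<>();

        // Combine stem and part-of-speech tag into one key, e.g. "run/NN".
        private static String key(String stem, String tag) {
            return stem + "/" + tag;
        }

        // Record one observation of `word` for the given stem and tag.
        void observe(String stem, String tag, String word) {
            counts.computeIfAbsent(key(stem, tag), k -> new HashMap<>())
                  .merge(word, 1, Integer::sum);
        }

        // Most frequently observed word for the pair; empty if the pair is unknown.
        // Ties are broken arbitrarily.
        Optional<String> mostProbable(String stem, String tag) {
            Map<String, Integer> words =
                    counts.getOrDefault(key(stem, tag), Collections.emptyMap());
            return words.entrySet().stream()
                    .max(Map.Entry.comparingByValue())
                    .map(Map.Entry::getKey);
        }
    }
    ```

    The counts would have to come from a tagged corpus of your own, and the result is still only a guess for any word that corpus did not cover.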

    Edit: from the comments, it looks like you probably want to use lemmatization instead of stemming. Here’s how to get the lemmas of words using the Stanford CoreNLP tools:

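    import java.util.Properties;

    import edu.stanford.nlp.ling.CoreAnnotations.*;
    import edu.stanford.nlp.ling.CoreLabel;
    import edu.stanford.nlp.pipeline.Annotation;
    import edu.stanford.nlp.pipeline.StanfordCoreNLP;
    import edu.stanford.nlp.util.CoreMap;

    public class LemmaExample {
        public static void main(String[] args) {
            // The lemma annotator needs tokenization, sentence splitting and POS tagging first.
            Properties props = new Properties();
            props.setProperty("annotators", "tokenize, ssplit, pos, lemma");
            StanfordCoreNLP pipeline = new StanfordCoreNLP(props);

            String text = "Hello, world!";
            Annotation document = pipeline.process(text);

            // Walk every token of every sentence and print it next to its lemma.
            for (CoreMap sentence : document.get(SentencesAnnotation.class)) {
                for (CoreLabel token : sentence.get(TokensAnnotation.class)) {
                    String word = token.get(TextAnnotation.class);
                    String lemma = token.get(LemmaAnnotation.class);
                    System.out.println(word + " -> " + lemma);
                }
            }
        }
    }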
    