The Archive Base

Editorial Team
Asked: May 12, 2026

How do I data mine a pile of text to get keywords by usage?

How do I data mine a pile of text to get keywords by usage (for example, “Jacob Smith” or “fence”)?

Is there existing software that does this, even semi-automatically? If it could filter out simple words like “the”, “and”, and “or”, I could get to the topics more quickly.

1 Answer

  1. Editorial Team
    Added an answer on May 12, 2026 at 4:31 pm

    The general algorithm is going to go like this:

    - Obtain the text
    - Strip punctuation, special characters, etc.
    - Strip "simple" words (often called stop words, such as "the", "and", "or")
    - Split on spaces
    - Loop over the split text
        - Add each word to an array/hash table/etc. with a count of 1 if it isn't there yet;
           if it is, increment the counter for that word
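
As a rough illustration, the steps above could be sketched in Python as follows. The function name `count_words` and the short `STOP_WORDS` set are made up for this example (a real stop-word list would be much longer), and lowercasing the text is an extra assumption so that "Fence" and "fence" are counted together. Stop words are filtered after the split here, since that is simpler than removing them from the raw string.

```python
import re
from collections import Counter

# Hypothetical stop-word list for illustration only; a real list is much longer.
STOP_WORDS = {"the", "and", "or", "a", "an", "of", "to", "in", "is", "it"}

def count_words(text, stop_words=STOP_WORDS):
    """Return a Counter mapping each non-stop word to its number of occurrences."""
    # Lowercase so that case variants are counted together (an assumption).
    text = text.lower()
    # Replace anything that is not a letter, digit, apostrophe, or whitespace
    # with a space, which strips punctuation and special characters.
    text = re.sub(r"[^a-z0-9'\s]", " ", text)
    # Split on whitespace, then drop the stop words.
    words = [w for w in text.split() if w not in stop_words]
    # Counter adds a word on first sight and increments it afterwards.
    return Counter(words)

counts = count_words("The fence is old. Jacob Smith painted the fence, and the gate.")
print(counts.most_common(3))
```

`Counter.most_common(n)` returns the `n` most frequent words, which is usually the part you care about when looking for topics.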
    

    The end result is a count of how often each word occurs in the text. You can then divide each count by the total number of words to get that word's relative frequency (multiply by 100 for a percentage). Any further processing is up to you.
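
The division step might look something like this. The helper name `relative_frequencies` and the sample counts are hypothetical, and the total used here is the number of counted words (after stop words are removed), which is one possible reading of "total number of words".

```python
from collections import Counter

def relative_frequencies(counts):
    """Convert raw counts into percentages of the total counted words."""
    total = sum(counts.values())
    if total == 0:
        # Avoid dividing by zero for empty input.
        return {}
    return {word: 100.0 * n / total for word, n in counts.items()}

# Hypothetical counts, standing in for the result of the counting step.
counts = Counter({"fence": 2, "jacob": 1, "smith": 1})
percentages = relative_frequencies(counts)

# Print the words from most to least frequent.
for word, pct in sorted(percentages.items(), key=lambda item: -item[1]):
    print(f"{word}: {pct:.1f}%")
```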

    You’ll also want to look into stemming, which reduces words to their root form so that variants of the same word are counted together. For example, going => go and cars => car.
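
To show where stemming fits into the counting, here is a deliberately naive sketch. `toy_stem` is a made-up helper that only strips "ing" and a trailing "s"; it is not a real stemming algorithm and will mangle plenty of words (for example, "running" becomes "runn"). In practice you would likely reach for an established algorithm such as the Porter stemmer.

```python
from collections import Counter

def toy_stem(word):
    """Very rough suffix stripping for illustration; NOT a real stemmer."""
    # Strip "ing" only from words longer than four letters ("going" -> "go").
    if word.endswith("ing") and len(word) > 4:
        return word[:-3]
    # Strip a plural "s", but leave short words and "ss" endings alone.
    if word.endswith("s") and not word.endswith("ss") and len(word) > 3:
        return word[:-1]
    return word

# Hypothetical word list, standing in for the split and filtered text.
words = ["going", "go", "cars", "car", "fence"]

# Stem each word before counting so that variants share one counter.
stemmed_counts = Counter(toy_stem(w) for w in words)
print(stemmed_counts)
```

With stemming applied before counting, "going" and "go" share one entry, as do "cars" and "car".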

    Algorithms like this are common in spam filters, keyword indexing, and the like.
