The Archive Base

Editorial Team
Asked: June 10, 2026

I am currently developing algorithms that work with hundreds of thousands of strings (~4000 chars each) and perform simple operations based on the results of functions applied to these strings. Currently I use Java and a MySQL database with one table:

 ID | String | attribute a | attribute b | ....
    |        |             |             | ....

Basically, the algorithm gets one ID to start with, reads the string stored under it, and runs functions on it; attributes are set and read for the currently active row. For example, one function extracts an ID from the string (simple string parsing) and stores it in the “attribute a” column. Once the entry is parsed, the algorithm reads “attribute a”, jumps to the row with that ID, and the process starts all over again.

Maybe I am overthinking this a little, but the current setup has so much overhead that it is nearly impossible to make quick changes or to quickly test queries. Is there a better tool or programming language that is designed for operating directly on large data sets like this and that provides efficient functions for string manipulation?

I definitely wouldn’t mind spending time on learning a completely new language as I believe that using the right tool for the job saves time and prevents frustration in the long term.

1 Answer

  1. Editorial Team
    Added an answer on June 10, 2026 at 11:49 pm

    I have a pet project that I’ve been working on, on and off, for years. It stores a large number of strings (although not text). In the past I have implemented it in Java in-memory, Scala with a database, MySQL, C in-memory, Python + Redis… and finally, Go.

    Go has done the best job. I have ~300,000 strings (although shorter than yours) stored in a data structure in memory. They form a searchable, analyzable data structure. I’m sure the use case is similar enough to yours for my experience to be relevant.

    Go has similar efficiency to C for data processing. It has nice syntax like Python for quick coding. It has type safety for … type safety. It has garbage collection.

    My suggestion is: learn Go and do it all in memory. Rely on virtual memory to accommodate a large data set. Mine is about 500 MB in RAM once loaded, but I have no doubt it would function just fine at twice that.
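To make the in-memory idea concrete, here is a minimal sketch of the workflow described in the question: a map keyed by ID replaces the table, and a loop parses each string, caches the extracted ID, and jumps to the next record. The `Record` type, the `parseNextID` helper, and the `next=<id>;` marker format are all assumptions made for illustration; the question does not say how the ID is embedded in the string.

```go
package main

import (
	"fmt"
	"strings"
)

// Record mirrors one row of the table from the question: the raw string
// plus an attribute derived from it. Field names are illustrative.
type Record struct {
	Text   string // the stored string (~4000 chars in the question)
	NextID string // "attribute a": the ID parsed out of Text, cached after parsing
}

// parseNextID is a hypothetical parser. It assumes the next ID is embedded
// as "next=<id>;" somewhere in the text; the real format is not given.
func parseNextID(text string) (string, bool) {
	const marker = "next="
	i := strings.Index(text, marker)
	if i < 0 {
		return "", false
	}
	rest := text[i+len(marker):]
	end := strings.IndexByte(rest, ';')
	if end < 0 {
		return "", false
	}
	return rest[:end], true
}

// Walk starts at startID, parses each record, stores the parsed ID on the
// record, and jumps to that record. It stops when no next ID is found, the
// ID is unknown, or a record would be visited twice (to avoid looping forever).
func Walk(store map[string]*Record, startID string) []string {
	var path []string
	seen := make(map[string]bool)
	id := startID
	for {
		rec, ok := store[id]
		if !ok || seen[id] {
			return path
		}
		seen[id] = true
		path = append(path, id)
		next, found := parseNextID(rec.Text)
		if !found {
			return path
		}
		rec.NextID = next
		id = next
	}
}

func main() {
	store := map[string]*Record{
		"a": {Text: "payload one next=b;"},
		"b": {Text: "payload two next=c;"},
		"c": {Text: "payload three, no link"},
	}
	fmt.Println(Walk(store, "a")) // prints [a b c]
}
```

With everything in a map, each "jump to the row with this ID" is a single lookup rather than a database round trip, which is where most of the overhead in the original setup is likely to come from.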

    I do not persist to disk because I don’t need to. I can re-create the data structure in 15 minutes from input files. The application is a continually running server. If you’re running large batch operations for analysis, that approach can be suitable. Otherwise I am sure you can easily persist to disk.

    (FWIW I’m talking about http://www.folktunefinder.com melody search index)
