The Archive Base Latest Questions

Editorial Team
Asked: June 9, 2026

I have a bunch of Strings I’d like a fast lookup for

I have a bunch of Strings I’d like a fast lookup for. Each String is 22 chars long and is looked up by its first 12 chars only (the “key”, so to speak). The full set of Strings is recreated periodically: they are loaded from a file and refreshed when the file changes. Memory is tight, because other server processes on my VPS need it too, and need it more.

How do I best store the Strings and search for them?

My current idea is to store them all one after another inside a char[] (to save RAM), and sort them for faster lookups (I figure the lookup is fastest if I have them presorted so I can use binary or interpolation search). But I’m not exactly sure how I should code it – if anyone is in the mood for a challenging puzzle: here it is…
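A minimal sketch of that idea, under a few assumptions: it uses a byte[] rather than a char[] (the text is ANSI, so one byte per char is enough), it takes the entries as a String[] for brevity (a real loader would presumably copy each line from the file straight into the array), and the class and method names are made up for illustration. Records are sorted in place with a plain quicksort so no second copy of the data is needed, and lookups are a binary search over the first 12 bytes of each record.

```java
import java.nio.charset.StandardCharsets;

final class PackedStrings {
    static final int RECORD = 22; // bytes per entry, one byte per ANSI char
    static final int KEY = 12;    // leading bytes that form the lookup key

    private final byte[] data;    // all entries back to back, sorted by key
    private final int count;

    PackedStrings(String[] entries) {
        count = entries.length;
        data = new byte[count * RECORD];
        for (int i = 0; i < count; i++) {
            byte[] b = entries[i].getBytes(StandardCharsets.ISO_8859_1);
            if (b.length != RECORD) {
                throw new IllegalArgumentException("bad length: " + entries[i]);
            }
            System.arraycopy(b, 0, data, i * RECORD, RECORD);
        }
        sort(0, count - 1, new byte[KEY], new byte[RECORD]);
    }

    /** Returns the full entry whose first 12 chars equal key, or null. */
    String lookup(String key) {
        byte[] k = key.getBytes(StandardCharsets.ISO_8859_1);
        if (k.length != KEY) return null;
        int lo = 0, hi = count - 1;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            int c = compareKey(mid, k);
            if (c < 0) lo = mid + 1;
            else if (c > 0) hi = mid - 1;
            else return new String(data, mid * RECORD, RECORD, StandardCharsets.ISO_8859_1);
        }
        return null;
    }

    // Compares the key part of record i with a 12-byte key (unsigned bytes).
    private int compareKey(int i, byte[] key) {
        int base = i * RECORD;
        for (int k = 0; k < KEY; k++) {
            int d = (data[base + k] & 0xFF) - (key[k] & 0xFF);
            if (d != 0) return d;
        }
        return 0;
    }

    // In-place quicksort over whole records; recurses into the smaller half only.
    private void sort(int lo, int hi, byte[] pivot, byte[] tmp) {
        while (lo < hi) {
            System.arraycopy(data, ((lo + hi) >>> 1) * RECORD, pivot, 0, KEY);
            int i = lo, j = hi;
            while (i <= j) {
                while (compareKey(i, pivot) < 0) i++;
                while (compareKey(j, pivot) > 0) j--;
                if (i <= j) { swap(i, j, tmp); i++; j--; }
            }
            if (j - lo < hi - i) { sort(lo, j, pivot, tmp); lo = i; }
            else { sort(i, hi, pivot, tmp); hi = j; }
        }
    }

    private void swap(int i, int j, byte[] tmp) {
        System.arraycopy(data, i * RECORD, tmp, 0, RECORD);
        System.arraycopy(data, j * RECORD, data, i * RECORD, RECORD);
        System.arraycopy(tmp, 0, data, j * RECORD, RECORD);
    }
}
```

Calling `new PackedStrings(entries).lookup(key)` with a 12-char key returns the matching 22-char entry, or null if there is none. Each hit allocates one new String; returning an index or offset instead would avoid that.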

Btw: It’s probably ok to exceed the memory constraints for a while during the recreation / sorting, but it shouldn’t be by much or for long.

Thanks!

Update

For the “I want to know specifics” crowd (correct me if I’m wrong on the Java details): the source files contain about 320 000 entries (all ANSI text). I really want to stay (WAY!) below 64 MB RAM usage, and the data is only part of my program. Here’s some information on the sizes of Java types in memory.

My VPS is a 32bit OS, so…

  • one byte[], all concatenated = 12 + length bytes
  • one char[], all concatenated = 12 + length * 2 bytes
  • String = 32 + length * 2 bytes (is Object, has char[] + 3 int)

So I have to keep in memory:

  • ~7 MB if all are stored in a byte[]
  • ~14 MB if all are stored in a char[]
  • ~25 MB if all are stored in a String[]
  • > 40 MB if they are stored in a Hashtable / Map (for which I’d probably have to fine-tune the initial capacity)
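The first three estimates can be reproduced from the per-type sizes listed above. The sketch below assumes exactly 320 000 entries of 22 chars each, plus 4-byte references in the String[] on a 32-bit JVM (the reference size is my assumption, it is not stated above); the class and constant names are made up for illustration.

```java
final class MemoryEstimate {
    static final long ENTRIES = 320_000;    // approximate entry count from the question
    static final int LENGTH = 22;           // chars per entry
    static final int ARRAY_HEADER = 12;     // bytes, per the sizes listed above
    static final int STRING_OVERHEAD = 32;  // bytes per String, excluding its chars
    static final int REFERENCE = 4;         // assumed: 32-bit JVM reference size

    // One byte[] holding everything: 12 + length bytes.
    static long byteArray() {
        return ARRAY_HEADER + ENTRIES * LENGTH;
    }

    // One char[] holding everything: 12 + length * 2 bytes.
    static long charArray() {
        return ARRAY_HEADER + ENTRIES * LENGTH * 2;
    }

    // A String[]: the reference array plus one String object per entry.
    static long stringArray() {
        return ARRAY_HEADER + ENTRIES * REFERENCE
                + ENTRIES * (STRING_OVERHEAD + LENGTH * 2);
    }
}
```

This gives 7 040 012 bytes for the byte[], 14 080 012 bytes for the char[] and 25 600 012 bytes for the String[], which matches the rounded figures in the list.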

A Hashtable is not magical – it helps on insertion, but in principle it’s just a very long array of Strings where the hashCode modulo the capacity is used as an index; the data is stored in the next free position after that index and searched linearly if it’s not found there on lookup. But for a Hashtable, I’d need the String itself plus a substring of the first 12 chars for lookup. I don’t want that (or am I missing something here?), sorry folks…

1 Answer

  1. Editorial Team
    Added an answer on June 9, 2026 at 4:32 pm

    I coded a solution myself, but it’s a little different from the question I posted because I could use information I didn’t publish (I’ll do better next time, sorry).

    I’m only posting this answer because the problem is solved. I won’t accept one of the other answers because they didn’t really help with the memory constraints (and were a little short for my taste). They still got an upvote each; no hard feelings, and thanks for taking the time!

    I managed to push all of the info into two longs (with the key residing completely in the first one). The first 12 chars are an ISIN, which can be compressed into a long because it uses only digits and capital letters, always starts with two capital letters, and ends with a check digit that can be reconstructed from the other chars. The product of all possible values leaves a little more than 3 bits to spare.
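The exact packing is not shown here, so the following is only one possible scheme, not necessarily the one actually used. It treats the two leading letters as base-26 digits and the next nine characters as base-36 digits (0-9, then A-Z), drops the check digit, and recomputes it on unpacking with the Luhn-style ISIN checksum. The class and method names are hypothetical, and `pack` does not validate the check digit it is given.

```java
final class IsinCodec {
    // Maps '0'-'9' to 0-9 and 'A'-'Z' to 10-35.
    static int alnum(char c) {
        if (c >= '0' && c <= '9') return c - '0';
        if (c >= 'A' && c <= 'Z') return c - 'A' + 10;
        throw new IllegalArgumentException("not a digit or capital letter: " + c);
    }

    // Packs the first 11 chars; 26^2 * 36^9 combinations fit in a signed long.
    static long pack(String isin) {
        if (isin.length() != 12) throw new IllegalArgumentException("need 12 chars");
        long v = 0;
        for (int i = 0; i < 2; i++) {
            char c = isin.charAt(i);
            if (c < 'A' || c > 'Z') throw new IllegalArgumentException("bad prefix: " + c);
            v = v * 26 + (c - 'A');
        }
        for (int i = 2; i < 11; i++) {
            v = v * 36 + alnum(isin.charAt(i));
        }
        return v;
    }

    // Restores all 12 chars, recomputing the check digit.
    static String unpack(long v) {
        char[] out = new char[12];
        for (int i = 10; i >= 2; i--) {
            int d = (int) (v % 36);
            out[i] = (char) (d < 10 ? '0' + d : 'A' + d - 10);
            v /= 36;
        }
        out[1] = (char) ('A' + v % 26);
        out[0] = (char) ('A' + v / 26);
        out[11] = (char) ('0' + checkDigit(out));
        return new String(out);
    }

    // Luhn checksum over the first 11 chars, letters expanded to two digits.
    static int checkDigit(char[] c) {
        int sum = 0;
        boolean doubled = true; // the rightmost payload digit is doubled
        for (int i = 10; i >= 0; i--) {
            int n = alnum(c[i]);
            int[] digits = n >= 10 ? new int[] {n % 10, n / 10} : new int[] {n};
            for (int d : digits) {
                if (doubled) { d *= 2; if (d > 9) d -= 9; }
                sum += d;
                doubled = !doubled;
            }
        }
        return (10 - sum % 10) % 10;
    }
}
```

Because digits map below letters and the most significant character is packed first, the numeric order of the packed longs is the same as the lexicographic order of the ISINs, so a sorted long[] can be searched directly.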

    I store all entries from my source file in a long[] (packed ISIN first, other stuff in the second long) and sort them by the first long of each pair.

    When I do a query by a key, I transform it to a long, do a binary search (which I may change to an interpolation search) and return the matching index. The different parts of the value are retrievable by said index – I get the second long from the array, unpack it and return the requested data.
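A rough sketch of such a table, assuming the entries are interleaved as key, value, key, value in a single long[]. The layout, the class name and the method names are my assumptions, and the in-place quicksort simply stands in for whatever sort was actually used. The constructor sorts the array it is given rather than copying it.

```java
final class PairTable {
    private final long[] pairs; // [key0, value0, key1, value1, ...]
    private final int count;    // number of (key, value) entries

    PairTable(long[] interleaved) {
        if (interleaved.length % 2 != 0) {
            throw new IllegalArgumentException("need key/value pairs");
        }
        pairs = interleaved;
        count = interleaved.length / 2;
        sort(0, count - 1);
    }

    /** Returns the entry index for key, or -1 if it is not present. */
    int indexOf(long key) {
        int lo = 0, hi = count - 1;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            long k = pairs[2 * mid];
            if (k < key) lo = mid + 1;
            else if (k > key) hi = mid - 1;
            else return mid;
        }
        return -1;
    }

    /** Returns the second long of the entry at the given index. */
    long valueAt(int index) {
        return pairs[2 * index + 1];
    }

    // In-place quicksort of the pairs by their first long.
    private void sort(int lo, int hi) {
        while (lo < hi) {
            long pivot = pairs[2 * ((lo + hi) >>> 1)];
            int i = lo, j = hi;
            while (i <= j) {
                while (pairs[2 * i] < pivot) i++;
                while (pairs[2 * j] > pivot) j--;
                if (i <= j) { swap(i, j); i++; j--; }
            }
            // Recurse into the smaller part, loop on the larger one.
            if (j - lo < hi - i) { sort(lo, j); lo = i; }
            else { sort(i, hi); hi = j; }
        }
    }

    private void swap(int i, int j) {
        long k = pairs[2 * i], v = pairs[2 * i + 1];
        pairs[2 * i] = pairs[2 * j];
        pairs[2 * i + 1] = pairs[2 * j + 1];
        pairs[2 * j] = k;
        pairs[2 * j + 1] = v;
    }
}
```

A lookup is then `int i = table.indexOf(packedKey)` followed by `table.valueAt(i)` when `i` is not -1. At two longs per entry, 320 000 entries take roughly 5 MB.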

    The result: RAM usage dropped from ~110 MB to < 50 MB including Jetty (btw, I used a Hashtable before), and lookups are lightning fast.

© 2021 The Archive Base. All Rights Reserved