Editorial Team
Asked: June 17, 2026

… that is the question. I have been working on an algorithm which takes an array of vectors as input, and part of the algorithm repeatedly picks pairs of vectors and evaluates a function of these two vectors, which doesn’t change over time. Looking at ways to optimize the algorithm, I thought this would be a good case for memoization: instead of recomputing the same function value over and over again, cache it lazily and hit the cache.

Before jumping to code, here is the gist of my question: the benefits I get from memoization depend on the number of vectors, which I think is inversely related to the number of repeated calls, and in some circumstances memoization completely degrades performance. So is my situation ill-suited to memoization? Am I doing something wrong, and are there smarter ways to optimize for my situation?

Here is a simplified test script, which is fairly close to the real thing:

open System
open System.Diagnostics
open System.Collections.Generic

let size = 10 // observations
let dim = 10 // features per observation
let runs = 10000000 // number of function calls

let rng = new Random()
let clock = new Stopwatch()

let data =
    [| for i in 1 .. size ->
        [ for j in 1 .. dim -> rng.NextDouble() ] |]    
let testPairs = [| for i in 1 .. runs -> rng.Next(size), rng.Next(size) |]

let f v1 v2 = List.fold2 (fun acc x y -> acc + (x-y) * (x-y)) 0.0 v1 v2

printfn "Raw"
clock.Restart()
testPairs |> Array.averageBy (fun (i, j) -> f data.[i] data.[j]) |> printfn "Check: %f"
printfn "Raw: %i" clock.ElapsedMilliseconds

I create an array of random vectors (data), a random collection of index pairs (testPairs), and run f on each of the pairs.

Here is the memoized version:

let memoized =
    let cache = new Dictionary<(int*int),float>(HashIdentity.Structural)
    fun key ->
        match cache.TryGetValue(key) with
        | true, v  -> v
        | false, _ ->
            let v = f data.[fst key] data.[snd key]
            cache.Add(key, v)
            v

printfn "Memoized"
clock.Restart()
testPairs |> Array.averageBy (fun (i, j) -> memoized (i, j)) |> printfn "Check: %f"
printfn "Memoized: %i" clock.ElapsedMilliseconds

Here is what I am observing:
* when size is small (10), the memoized version runs about twice as fast as the raw version,
* when size is large (1000), the memoized version takes 15x more time than the raw version,
* when f is costly, memoization improves things.

My interpretation is that when the size is small, we have more repeat computations, and the cache pays off.

What surprised me was the huge performance hit for larger sizes, and I am not certain what is causing it. I know I could improve the dictionary access a bit, with a struct key for instance – but I didn’t expect the “naive” version to behave so poorly.

So – is there something obviously wrong with what I am doing? Is memoization the wrong approach for my situation, and if yes, is there a better approach?

1 Answer

  1. Editorial Team
    Added an answer on June 17, 2026 at 12:00 am

    I think memoization is a useful technique, but it is not a silver bullet. It is very useful in dynamic programming where it reduces the (theoretical) complexity of the algorithm. As an optimization, it can (as you would probably expect) have varying results.
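To illustrate the dynamic programming case, here is a minimal sketch using Fibonacci numbers as a stand-in example (it is not taken from the question, and the names fibNaive and fibMemo are made up for illustration). The naive version recomputes the same subproblems over and over, while the memoized one computes each value once.

```fsharp
open System.Collections.Generic

// Naive recursion: the same subproblems are recomputed many times,
// so the number of calls grows exponentially with n.
let rec fibNaive (n: int) : int64 =
    if n < 2 then int64 n else fibNaive (n - 1) + fibNaive (n - 2)

// Memoized recursion: each n is computed once and then read from the cache,
// so the number of computations grows linearly with n.
let fibMemo : int -> int64 =
    let cache = Dictionary<int, int64>()
    let rec fib n =
        match cache.TryGetValue(n) with
        | true, v -> v
        | false, _ ->
            let v = if n < 2 then int64 n else fib (n - 1) + fib (n - 2)
            cache.[n] <- v
            v
    fib

printfn "%d" (fibMemo 90)
```

Here the cache changes the complexity class, which is a different situation from using it as a constant-factor optimization.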

    In your case, the cache is certainly more useful when the number of observations is smaller (and when f is a more expensive computation). You can add simple statistics to your memoization:

    let stats = ref (0, 0) // (cache misses, cache hits)
    let memoized =
        let cache = new Dictionary<(int*int),float>(HashIdentity.Structural)
        fun key ->
            let (mis, hit) = stats.Value
            match cache.TryGetValue(key) with
            | true, v  ->
                stats.Value <- (mis, hit + 1) // Increment hit count
                v
            | false, _ ->
                stats.Value <- (mis + 1, hit) // Increment miss count
                let v = f data.[fst key] data.[snd key]
                cache.Add(key, v)
                v
    
    • For a small size, the numbers I get are something like (100, 999900), so there is a huge benefit from memoization – the function f is computed 100x and then each result is reused 9999x.

    • For a large size, I get something like (632331, 1367669), so f is called many times and each result is reused just twice. In that case, the overhead of allocation and lookup in the (big) hash table is much bigger.
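One way to see why the hit rate drops as size grows is to simulate only the access pattern, without the vectors. The sketch below is an illustration rather than part of the original test script: simulate and expectedMisses are hypothetical names, the seed is arbitrary, and runs is set to 1,000,000 purely as an assumption to keep it quick. It counts distinct index pairs (the cache misses) and compares them with the standard expectation for uniform random draws, keys * (1 - (1 - 1/keys)^draws).

```fsharp
open System
open System.Collections.Generic

// Expected number of distinct keys (= cache misses) after `draws` uniform
// draws from `keys` equally likely keys.
let expectedMisses (keys: float) (draws: float) =
    keys * (1.0 - Math.Pow(1.0 - 1.0 / keys, draws))

// Simulate the access pattern only: draw `runs` random index pairs from
// [0, size) and count how many are new (misses) versus repeated (hits).
let simulate (size: int) (runs: int) =
    let rng = Random(42) // fixed seed, an arbitrary choice for repeatability
    let seen = HashSet<struct (int * int)>()
    let mutable misses = 0
    for _ in 1 .. runs do
        let i = rng.Next(size)
        let j = rng.Next(size)
        // HashSet.Add returns true only when the pair was not seen before
        if seen.Add(struct (i, j)) then misses <- misses + 1
    misses, runs - misses

for size in [ 10; 1000 ] do
    let runs = 1000000 // assumption: smaller than the question's 10000000
    let misses, hits = simulate size runs
    let expected = expectedMisses (float (size * size)) (float runs)
    printfn "size %d: misses %d, hits %d, expected misses %.0f" size misses hits expected
```

With size = 10 there are only 100 possible pairs, so almost every call is a hit. With size = 1000 there are 1,000,000 possible pairs, so a large share of the calls are first-time computations that also pay for a dictionary insert.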

    As a minor optimization, you can pre-allocate the Dictionary and write new Dictionary<_, _>(10000,HashIdentity.Structural), but that does not seem to help much in this case.
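For reference, a pre-sized cache could look roughly like the sketch below. It also uses a struct tuple as the key, which is the kind of struct key mentioned in the question. The helper name memoizePairs and the stand-in function distance are assumptions made for illustration, and whether this is measurably faster on your data is something you would have to benchmark.

```fsharp
open System.Collections.Generic

// Hypothetical helper: memoize a function of two indices in a dictionary
// that is pre-sized to `capacity` and keyed by a struct tuple, so the key
// itself is a value type rather than a heap-allocated tuple.
let memoizePairs (capacity: int) (compute: int -> int -> float) =
    let cache = Dictionary<struct (int * int), float>(capacity)
    fun i j ->
        let key = struct (i, j)
        match cache.TryGetValue(key) with
        | true, v -> v
        | false, _ ->
            let v = compute i j
            cache.Add(key, v)
            v

// Stand-in for `f data.[i] data.[j]` so that the sketch is self-contained.
let distance i j = float ((i - j) * (i - j))

let memoized = memoizePairs 10000 distance
printfn "%f" (memoized 3 7)
```

Pre-sizing only avoids the repeated resizing of the table as it grows; it does not change how many misses there are.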

    To make this optimization efficient, I think you would need to know more about the memoized function. In your example, the inputs are quite regular, so there is probably no point in memoization, but if you know that the function is called more often with some argument values, you can perhaps memoize only for these common arguments.
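A minimal sketch of that idea, assuming you can write a cheap predicate that recognizes the common arguments. The names memoizeWhen, isHot and product are hypothetical, and the rule "both indices below 10" is only an example of such a predicate.

```fsharp
open System.Collections.Generic

// Hypothetical helper: cache results only for keys accepted by `shouldCache`;
// every other key goes straight to `compute` and never touches the dictionary.
let memoizeWhen<'k, 'v when 'k: equality> (shouldCache: 'k -> bool) (compute: 'k -> 'v) : 'k -> 'v =
    let cache = Dictionary<'k, 'v>(HashIdentity.Structural)
    fun key ->
        if not (shouldCache key) then
            compute key
        else
            match cache.TryGetValue(key) with
            | true, v -> v
            | false, _ ->
                let v = compute key
                cache.Add(key, v)
                v

// Stand-in for the real function, so that the sketch is self-contained.
let product (i: int, j: int) = float (i * j)

// Assumption for illustration: pairs with both indices below 10 are the common ones.
let isHot (i: int, j: int) = i < 10 && j < 10

let memoized = memoizeWhen isHot product
printfn "%f %f" (memoized (2, 3)) (memoized (500, 7))
```

This keeps the table small and skips the lookup cost for the arguments that are unlikely to repeat.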
