Sign Up

Sign Up to our social questions and Answers Engine to ask questions, answer people’s questions, and connect with other people.

Have an account? Sign In

Have an account? Sign In Now

Sign In

Login to our social questions & Answers Engine to ask questions answer people’s questions & connect with other people.

Sign Up Here

Forgot Password?

Don't have account, Sign Up Here

Forgot Password

Lost your password? Please enter your email address. You will receive a link and will create a new password via email.

Have an account? Sign In Now

You must login to ask a question.

Forgot Password?

Need An Account, Sign Up Here

Please briefly explain why you feel this question should be reported.

Please briefly explain why you feel this answer should be reported.

Please briefly explain why you feel this user should be reported.

Sign InSign Up

The Archive Base

The Archive Base Logo The Archive Base Logo

The Archive Base Navigation

  • SEARCH
  • Home
  • About Us
  • Blog
  • Contact Us
Search
Ask A Question

Mobile menu

Close
Ask a Question
  • Home
  • Add group
  • Groups page
  • Feed
  • User Profile
  • Communities
  • Questions
    • New Questions
    • Trending Questions
    • Must read Questions
    • Hot Questions
  • Polls
  • Tags
  • Badges
  • Buy Points
  • Users
  • Help
  • Buy Theme
  • SEARCH
Home/ Questions/Q 901877
In Process

The Archive Base Latest Questions

Editorial Team
  • 0
Editorial Team
Asked: May 15, 20262026-05-15T15:37:43+00:00 2026-05-15T15:37:43+00:00

I have a Haskell program which processes a text file and builds a Map

  • 0

I have a Haskell program which processes a text file and builds a Map (with several million elements). The whole thing can run for 2-3 minutes. I found that tweaking the -H and -A options makes a big difference in running time.

There is documentation about this functionality of the RTS, but it’s a hard read for me since I don’t know the algorithms and terms from GC theory. I’m looking for a less technical explanation, preferably specific to Haskell/GHC. Are there any references about choosing sensible values for these options?

EDIT:
That’s the code, it builds a trie for a given list of words.

buildTrie :: [B.ByteString] -> MyDFA 
buildTrie l = fst3 $ foldl' step (emptyDFA, B.empty, 1) $ sort $ map B.reverse l where
    step :: (MyDFA , B.ByteString, Int) -> B.ByteString -> (MyDFA , B.ByteString, Int)
    step (dfa, lastWord, newIndex) newWord = (insertNewStates, newWord, newIndex + B.length newSuffix) where            
        (pref, lastSuffix, newSuffix) = splitPrefix lastWord newWord
        branchPoint = transStar dfa pref

        --new state labels for the newSuffix path
        newStates = [newIndex .. newIndex + B.length newSuffix - 1]
        --insert newStates
        insertNewStates = (foldl' (flip insertTransition) dfa $ zip3 (branchPoint:init newStates) (B.unpack newSuffix) newStates)
  • 1 1 Answer
  • 0 Views
  • 0 Followers
  • 0
Share
  • Facebook
  • Report

Leave an answer
Cancel reply

You must login to add an answer.

Forgot Password?

Need An Account, Sign Up Here

1 Answer

  • Voted
  • Oldest
  • Recent
  • Random
  1. Editorial Team
    Editorial Team
    2026-05-15T15:37:44+00:00Added an answer on May 15, 2026 at 3:37 pm

    Generally speaking, garbage collection is a space/time tradeoff. Give the GC more space, and it will take less time. There are (many) other factors in play, cache in particular, but the space/time tradeoff is the most important one.

    The tradeoff works like this: the program allocates memory until it reaches some limit (decided by the GC’s automatic tuning parameters, or explicitly via RTS options). When the limit is reached, the GC traces all the data that the program is currently using, and reclaims all the memory used by data that is no longer required. The longer you can delay this process, the more data will have become unreachable (“dead”) in the meantime, so the GC avoids tracing that data. The only way to delay a GC is to make more memory available to allocate into; hence more memory equals fewer GCs, equals lower GC overhead. Roughly speaking, GHC’s -H option lets you set a lower bound on the amount of memory used by the GC, so can lower the GC overhead.

    GHC uses a generational GC, which is an optimisation to the basic scheme, in which the heap is divided into two or more generations. Objects are allocated into the “young” generation, and the ones that live long enough get promoted into the “old” generation (in a 2-generation setup). The young generation is collected much more frequently than the old generation, the idea being that “most objects die young”, so young-generation collections are cheap (they don’t trace much data), but they reclaim a lot of memory. Roughly speaking, the -A option sets the size of the young generation – that is, the amount of memory that will be allocated before the young generation is collected.

    The default for -A is 512k: it’s a good idea to keep the young generation smaller than the L2 cache, and performance generally drops if you exceed the L2 cache size. But working in the opposite direction is the GC space/time tradeoff: using a very large young generation size might outweigh the cache benefits by reducing the amount of work the GC has to do. This doesn’t always happen, it depends on the dynamics of the application, which makes it hard for the GC to automatically tune itself. The -H option also increases the young generation size, so can also have an adverse effect on cache usage.

    The bottom line is: play around with the options and see what works. If you have plenty of memory to spare, you might well be able to get a performance increase by using either -A or -H, but not necessarily.

    • 0
    • Reply
    • Share
      Share
      • Share on Facebook
      • Share on Twitter
      • Share on LinkedIn
      • Share on WhatsApp
      • Report

Sidebar

Related Questions

I have a Haskell program which uses Data.Map.! at several places. After executing the
I'm writing a Haskell program which generates an XML file. Apparently it is considered
I am implementing a haskell program which compares each line of a file with
I have a huge haskell file which compiles and runs without any problem. I
I have a fully functional redact program written in haskell.it replaces all the words
I have a haskell file test.hs. In this file I have written a function
I have a Haskell typeclass question. I can't munge the syntax to get this
Haskell is a pure functional language, which means Haskell functions have no side affects.
I have some troubles with Haskell's type system. Situation: Following program is taking list
I have developed a haskell application which is tested with WinHugs interpreter working fine

Explore

  • Home
  • Add group
  • Groups page
  • Communities
  • Questions
    • New Questions
    • Trending Questions
    • Must read Questions
    • Hot Questions
  • Polls
  • Tags
  • Badges
  • Users
  • Help
  • SEARCH

Footer

© 2021 The Archive Base. All Rights Reserved
With Love by The Archive Base

Insert/edit link

Enter the destination URL

Or link to existing content

    No search term specified. Showing recent items. Search or use up and down arrow keys to select an item.