Editorial Team
Asked: June 13, 2026

Short version: how to most efficiently represent and add two random variables given by lists of their realizations?

Mildly longer version:
For a work project, I need to add several random variables, each of which is given by a list of values. For example, the realizations of random variable A are {1,2,3} and the realizations of B are {5,6,7}. What I need is the distribution of A+B, i.e. {1+5,1+6,1+7,2+5,2+6,2+7,3+5,3+6,3+7}. I need to do this kind of addition several times (call the number of additions COUNT, where COUNT might reach 720) for different random variables (C, D, …).

The problem: if I use the naive algorithm of summing each realization of A with each realization of B, the complexity is exponential in COUNT. For the case where each r.v. is given by three values, the number of calculations for COUNT=720 is 3^720 ≈ 3.4 × 10^343, which would take until the end of our days to calculate :) Not to mention that in real life, the length of each r.v. is going to be 5000+.
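To make the blow-up concrete, here is a minimal sketch of the naive approach, assuming every listed realization is equally likely and the variables are independent. The function name `naive_sum` is only illustrative.

```python
from itertools import product


def naive_sum(*variables):
    """Return every possible sum, taking one realization from each variable.

    The result has len(v1) * len(v2) * ... entries, which is what makes
    this approach exponential in the number of variables.
    """
    return sorted(sum(combo) for combo in product(*variables))


A = [1, 2, 3]
B = [5, 6, 7]
print(naive_sum(A, B))      # [6, 7, 7, 8, 8, 8, 9, 9, 10]

# Number of sums for 720 three-valued variables: a 344-digit integer.
print(len(str(3 ** 720)))   # 344
```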

Solutions:
1/ The first solution uses the fact that I am OK with rounding, i.e. with integer-valued realizations. This way I can represent each r.v. as a vector that holds, at the index corresponding to a realization, the number of times that realization occurs (1 when it occurs once). For a vector of realizations indexed from 0 to 10, the vector representing A would be [0,1,1,1,0,0,0,0,0,0,0] and the one for B would be [0,0,0,0,0,1,1,1,0,0,0]. I then create A+B by going through these vectors and doing the same thing as above (sum each realization of A with each realization of B and encode the result in the same vector structure; quadratic complexity in the vector length). The upside of this approach is that the complexity is bounded. The problem is that in real applications the realizations of A will lie in the interval [-50000, 50000] with a granularity of 1. After adding two random variables, the span of A+B grows to [-100K, 100K], and after 720 additions the span of SUM(A, B, …) reaches [-36M, 36M]. Even quadratic complexity (compared to exponential complexity) on arrays this large will take forever.

2/ To get shorter arrays, one could use a hashmap, which would most likely reduce the number of operations (array accesses) involved in A+B, on the assumption that some non-trivial portion of the theoretical span [-50K, 50K] will never be a realization. However, as more and more random variables are summed, the number of realizations increases exponentially while the span increases only linearly, so the density of values within the span increases over time. This would kill the hashmap’s benefits.
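A hashmap version might look like the sketch below, using `collections.Counter` as the map from value to multiplicity. The function name `add_sparse` is illustrative.

```python
from collections import Counter


def add_sparse(dist_a, dist_b):
    """Sum two distributions stored as {value: multiplicity} maps."""
    out = Counter()
    for va, ca in dist_a.items():
        for vb, cb in dist_b.items():
            out[va + vb] += ca * cb   # combinations landing on the same sum merge
    return out


A = Counter([1, 2, 3])
B = Counter([5, 6, 7])
print(sorted(add_sparse(A, B).items()))
# [(6, 1), (7, 2), (8, 3), (9, 2), (10, 1)]
```

Nine combinations collapse into five keys here, which illustrates the point about density: the number of keys can never exceed the width of the span, however many combinations there are.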

So the question is: how can I solve this problem efficiently? The solution is needed for calculating a VaR in electricity trading, where all distributions are given empirically and resemble no ordinary distribution. Formulas are therefore of no use; we can only simulate.


Using math was considered as the first option as half of our dept. are mathematicians. However, the distributions that we’re going to add are badly behaved and the COUNT=720 is an extreme. More likely, we are going to use COUNT=24 for a daily VaR. Taking into account the bad behaviour of distributions to add, for COUNT=24 the central limit theorem would not hold too closely (the distro of SUM(A1, A2, …, A24) would not be close to normal). As we’re calculating possible risks, we’d like to get a number as precise as possible.

The intended use is this: you have hourly cashflows from some operation. The distribution of cashflows for one hour is the r.v. A. For the next hour, it’s r.v. B, etc. The question is: what is the largest loss in 99 percent of cases? So you model the cashflows for each of those 24 hours and add them as random variables to get the distribution of the total cashflow over the whole day. Then you take the 0.01 quantile.
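Once the distribution of the daily total is available as a value-to-multiplicity map, taking the 0.01 quantile is a single cumulative pass. The sketch below assumes all underlying outcomes are equally likely. The function name `quantile` and the cashflow numbers are hypothetical.

```python
from collections import Counter


def quantile(dist, q):
    """Smallest value v such that at least a fraction q of the outcomes is <= v.

    `dist` maps value -> multiplicity.
    """
    total = sum(dist.values())
    cumulative = 0
    for value in sorted(dist):
        cumulative += dist[value]
        if cumulative >= q * total:
            return value
    raise ValueError("empty distribution")


# Hypothetical daily cashflow distribution with 100 equally likely outcomes.
daily = Counter({-100: 2, -10: 8, 0: 50, 20: 40})
print(quantile(daily, 0.01))   # -100
print(quantile(daily, 0.5))    # 0
```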

1 Answer

  • Voted
  • Oldest
  • Recent
  • Random
  1. Editorial Team
    Added an answer on June 13, 2026 at 6:37 pm

    There are basically two methods: an approximate one and an exact one.

    The approximate method models the sum of random variables by taking many samples. Given random variables A and B, we randomly sample from each r.v. 50K times, add the sampled values pairwise (SSE can help a lot here), and obtain an approximate distribution of A+B. This is how mathematicians would do it in Mathematica.
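A minimal sketch of the sampling method, assuming the variables are independent and every listed realization is equally likely. The function names, the `seed` parameter, and the order-statistic quantile estimate are illustrative choices, not part of the original description.

```python
import random


def sample_sum(variables, n_samples=50_000, seed=None):
    """Draw n_samples realizations of the sum, sampling each variable independently."""
    rng = random.Random(seed)
    return [sum(rng.choice(v) for v in variables) for _ in range(n_samples)]


def empirical_quantile(samples, q):
    """q-quantile of the samples, using a simple order-statistic estimate."""
    ordered = sorted(samples)
    index = min(len(ordered) - 1, int(q * len(ordered)))
    return ordered[index]


A = [-5, -3, -2]
B = [0, 1, 2]
C = [7, 8, 9]
samples = sample_sum([A, B, C], n_samples=50_000, seed=0)
print(empirical_quantile(samples, 0.01))
```

The result is only an estimate: the accuracy of a tail quantile depends on how many samples actually fall into the tail.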

    The exact method uses something Dan Puzey proposed, namely summing only a small portion of each r.v.’s density. Let’s say we have random variables with the following “densities” (where each value is equally likely, for simplicity’s sake):

    A = {-5,-3,-2}
    B = {+0,+1,+2}
    C = {+7,+8,+9}
    

    The sum of A+B+C is going to be

    {2,3,3,4,4,4,4,5,5,5,5,5,6,6,6,6,6,6,7,7,7,7,7,8,8,8,9}
    

    and if I want to know the whole distribution precisely, I have no choice but to sum each element of A with each element of B, and then each element of that sum with each element of C. However, if I only want the 99% VaR of this sum, i.e. its 1st percentile, I only have to sum the smallest elements of A, B and C.
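The enumeration of A+B+C can be checked with a few lines. The rank computation (rounding 1% of 27 up to at least 1) is an assumption about how the percentile is read off a 27-element list.

```python
from itertools import product
from math import ceil

A = [-5, -3, -2]
B = [0, 1, 2]
C = [7, 8, 9]

# All 27 sums, one element from each of A, B, C.
sums = sorted(a + b + c for a, b, c in product(A, B, C))
print(sums)

# Rank of the 1% quantile among the 27 equally likely sums (at least 1).
rank = max(1, ceil(0.01 * len(sums)))
print(rank, sums[rank - 1])   # 1 2
```

With only 27 combinations the 1st percentile is simply the smallest sum, min(A) + min(B) + min(C).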

    More precisely, I will take the nA, nB, nC smallest elements from each distribution. To determine nA, nB, nC, first set them all to 1. Then increase nA by one if A[nA] = min(A[nA], B[nB], C[nC]) (this relies on A, B, C being sorted), and likewise for nB and nC. This gives the nA, nB, nC smallest elements of A, B, C, which I then sum together (each with each) and take the X-th smallest sum, where X is 1% of the total number of combinations, i.e. 1% of 3*3*3 for A, B, C. This also tells me when to stop increasing nA, nB, nC: stop when nA*nB*nC > X.

    However, this repeats the same redundancy: I am calculating the whole distribution of A+B+C to the left of the 1st percentile. Even so, this will be MUCH shorter than calculating the whole distribution of A+B+C. But I believe there should be a simple iterative algorithm that gives the VaR number exactly in O(a*b), where a is the number of added r.v.s and b is the maximum number of elements in the density of each r.v.
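One way to compute only the left tail exactly is sketched below. Note that this is a different technique from the nA/nB/nC rule above, and it is not the hoped-for O(a*b) algorithm: after each addition it keeps only the k smallest partial sums. This is safe because a partial sum outside the k smallest always has at least k totals at or below it. The function names are illustrative, and all combinations are assumed equally likely.

```python
import heapq
from math import ceil, prod


def kth_smallest_sum(variables, k):
    """k-th smallest sum (1-based), taking one element from each variable.

    Only the k smallest partial sums are kept after each addition.
    """
    partial = [0]
    for values in variables:
        candidates = (p + v for p in partial for v in values)
        partial = heapq.nsmallest(k, candidates)   # sorted ascending
    return partial[k - 1]


def tail_quantile(variables, q=0.01):
    """q-quantile of the sum, computed from the left tail only."""
    total = prod(len(v) for v in variables)
    k = max(1, ceil(q * total))
    return kth_smallest_sum(variables, k)


A = [-5, -3, -2]
B = [0, 1, 2]
C = [7, 8, 9]
print(tail_quantile([A, B, C], 0.01))   # 2
print(kth_smallest_sum([A, B, C], 8))   # 5
```

Because k is a fixed fraction of the total number of combinations, it still grows exponentially with the number of variables, so this only helps for small cases or very small quantiles.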

    I will be glad for any comments on whether I am correct.
