The Archive Base

Editorial Team
Asked: May 21, 2026 (2026-05-21T01:58:49+00:00)

I am building an application based on distributed linear algebra using Trilinos. The main issue is that memory consumption is much higher than expected.

I have built a simple test case that builds an Epetra_VbrMatrix with 15 million doubles grouped as 5 million blocks of 3 doubles, which should be about 115MB.

After building the matrix on 2 processors, with half the data on each, I see a memory consumption of 500MB on each processor, which is about 7.5 times the size of the data. That looks unreasonable to me: besides the values, the matrix should only need some integer arrays for locating the nonzero blocks.

I asked on the trilinos-users mailing list, where I was told the memory usage looks too high, but I hope to get some more help here.

I tested both on my laptop (Ubuntu, gcc 4.4.5, Trilinos 10.0) and on a cluster (PGI compiler, Trilinos 10.4.0); the result is about the same.

My test code is in a gist at https://gist.github.com/848310, where I also recorded the memory consumption at different stages of my test run with 2 MPI processes on my laptop.

Any suggestions would be really helpful. Even just building the test, running it and reporting the memory consumption would be great.
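For anyone who wants to report comparable numbers, here is a minimal sketch of one way to read a process's resident memory on Linux. It is not necessarily how the gist measures memory: the helper name `resident_kb` is made up for illustration, and it assumes a Linux `/proc` filesystem, returning -1 anywhere the `VmRSS` field cannot be read.

```cpp
#include <fstream>
#include <sstream>
#include <string>

// Hypothetical helper: returns the resident set size of the calling process
// in kB, as reported by the VmRSS line of /proc/self/status.
// Returns -1 if the file or the field is not available (e.g. not on Linux).
long resident_kb() {
    std::ifstream status("/proc/self/status");
    std::string line;
    while (std::getline(status, line)) {
        // The line of interest looks like "VmRSS:     123456 kB".
        if (line.compare(0, 6, "VmRSS:") == 0) {
            std::istringstream fields(line.substr(6));
            long kb = -1;
            fields >> kb;
            return kb;
        }
    }
    return -1;
}
```

Calling `resident_kb()` on each MPI rank before and after the matrix is built, and printing the difference, would give a per-process figure that can be compared with the 500MB reported above.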

1 Answer

Editorial Team
Added an answer on May 21, 2026 at 1:58 am (2026-05-21T01:58:50+00:00)

    Answer by Alan Williams from the trilinos-users list. In short, VbrMatrix is not suitable for such small blocks, because the storage overhead is larger than the data itself:

    The VbrMatrix storage format definitely incurs some storage overhead, as compared to the simple number of double-precision values being stored.

    In your program, you are storing 5,000,000 × 1 × 3 == 15 million doubles. With 8 bytes per double, that is 120 million bytes.

    The matrix class Epetra_VbrMatrix (which is the base class for Epetra_FEVbrMatrix) internally stores an Epetra_CrsGraph object, which represents the sparsity structure. This requires a couple of integers per block-row, and 1 integer per block-nonzero. (Your case has 5 million block-rows with 1 block-nonzero per row, so at least 15 million integers in total.)

    Additionally, the Epetra_VbrMatrix class stores an Epetra_SerialDenseMatrix object for each block-nonzero. This adds a couple of integers, plus a bool and a pointer, for each block-nonzero. In your case, since your block-nonzeros are small (1×3 doubles), this is a substantial overhead. The VbrMatrix format has proportionally less overhead the bigger your block-nonzeros are. But in your case, with 1×3 blocks, the VbrMatrix is indeed occupying several times more memory than is required for the 15 million doubles.
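The accounting in the quoted reply can be turned into a rough lower-bound estimate. The sketch below is only an illustration of that arithmetic. `BlockBookkeeping` is a hypothetical stand-in holding just the fields the reply names (two integers, a bool and a pointer), not the real layout of Epetra_SerialDenseMatrix, and "a couple of integers per block-row" is assumed to mean exactly two.

```cpp
#include <cstdint>

// Hypothetical stand-in for the per-block bookkeeping named in the reply:
// "a couple of integers, plus a bool and a pointer".
// This is NOT the actual Epetra_SerialDenseMatrix layout.
struct BlockBookkeeping {
    int rows;
    int cols;
    bool owns_values;
    double* values;
};

// Assumption: "a couple of integers per block-row" is taken as exactly 2.
constexpr std::uint64_t kIntsPerBlockRow = 2;

struct Estimate {
    std::uint64_t value_bytes;        // the doubles themselves
    std::uint64_t graph_bytes;        // integers describing the sparsity structure
    std::uint64_t bookkeeping_bytes;  // per-block object fields
    std::uint64_t total() const {
        return value_bytes + graph_bytes + bookkeeping_bytes;
    }
};

// Lower-bound estimate for the whole matrix (all processes together).
Estimate estimate_vbr_bytes(std::uint64_t block_rows,
                            std::uint64_t nonzero_blocks,
                            std::uint64_t doubles_per_block) {
    Estimate e;
    e.value_bytes = nonzero_blocks * doubles_per_block * sizeof(double);
    e.graph_bytes = (kIntsPerBlockRow * block_rows + nonzero_blocks) * sizeof(int);
    e.bookkeeping_bytes = nonzero_blocks * sizeof(BlockBookkeeping);
    return e;
}
```

With `estimate_vbr_bytes(5000000, 5000000, 3)` on a typical 64-bit Linux build (8-byte doubles, 4-byte ints, 8-byte pointers, so the struct pads to 24 bytes), this comes to about 120 million bytes of values, 60 million bytes of graph integers and 120 million bytes of bookkeeping. That already means the fields named in the reply cost as much as the data itself, and the real figure is likely higher, since this count ignores any further members of the Epetra classes and the allocator overhead of 5 million small heap allocations.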
