The Archive Base

Editorial Team
Asked: May 24, 2026

I’m investigating as to whether there is a framework/library that will help me implement

I’m investigating whether there is a framework or library that will help me implement a distributed computing system.

I have a master that has a large amount of data split up into files of a few hundred megabytes. The files would be chunked up into ~1MB pieces and distributed to workers for processing. Once initialized, the processing on each worker is dependent on state information obtained from the previous chunk, so workers must stay alive throughout the entire process, and the master needs to be able to send the right chunks to the right workers. One other thing to note is that this system is only a piece of a larger processing chain.

I did a little bit of looking into MPI (specifically Open MPI), but I’m not sure if it is the right fit. It seems to be geared toward sending small messages (a few bytes), though I did find some charts showing that its throughput increases with larger files (up to 1/5 MB).

I’m concerned that there might not be a way to maintain the state unless it is constantly sent back and forth in messages. Looking at the structure of some MPI examples, it looked like the master (rank 0) and the workers (ranks 1-n) were part of the same piece of code, and their actions were determined by conditionals. Can I have the workers stay alive (maintaining state) and wait for more messages to arrive?

Now that I’m writing this I’m thinking it would work. The rank 1…n section would just be a loop with a blocking receive followed by the processing code. The state would be maintained in that loop until a “no more data” message was received at which point it would send back the results. I might be beginning to grasp the MPI structure here…

My other question about MPI is how to actually run the code. Remember that this system is part of a larger system, so it needs to be called from some other code. The examples I’ve seen make use of mpirun, with which you can specify the number of processors or a hosts file. Can I get the same behavior by calling my MPI function from other code?

So my question is: is MPI the right framework here? Is there something better suited to this task, or am I going to be doing this from scratch?

1 Answer

  1. Editorial Team
    Added an answer on May 24, 2026 at 1:32 am

    MPI seems like a reasonable option for your task. It follows the SPMD (single program, multiple data) model, meaning the same program executes simultaneously on a possibly distributed or even heterogeneous system. Making the process with rank 0 the master and the others the workers is a convention, not a requirement; you can choose other patterns.

    If you need to keep state in your application, you can use a long-lived MPI application in which the master process sends commands to the worker processes over time. You should probably also consider saving that state to disk to make the system more robust to failures.
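The long-lived worker pattern described above could be structured roughly as follows. This is a sketch and not MPI code: Python threads and `queue.Queue` stand in for MPI ranks and blocking send/receive, and `worker_loop`, `run_master`, `STOP`, and the toy checksum state are hypothetical names chosen for illustration. It also assumes one data stream per worker, so each worker sees its own chunks in order.

```python
import queue
import threading

STOP = None  # hypothetical "no more data" sentinel


def worker_loop(recv, send):
    """One long-lived worker: block for a chunk, update state, repeat."""
    state = 0  # hypothetical state carried from one chunk to the next
    while True:
        chunk = recv()  # blocks until the master sends something
        if chunk is STOP:
            break
        # Toy processing: the new state depends on all chunks seen so far.
        state = (state * 31 + sum(chunk)) % 1_000_003
    send(state)  # report the result once, after the last chunk


def run_master(streams, chunk_size=4):
    """Cut each stream into chunks and feed stream i to worker i, in order."""
    inboxes = [queue.Queue() for _ in streams]
    results = queue.Queue()
    threads = []
    for rank, inbox in enumerate(inboxes):
        t = threading.Thread(
            target=worker_loop,
            args=(inbox.get, lambda s, r=rank: results.put((r, s))),
        )
        t.start()
        threads.append(t)

    # Sticky routing: every chunk of stream i goes to worker i.
    for rank, data in enumerate(streams):
        for start in range(0, len(data), chunk_size):
            inboxes[rank].put(data[start:start + chunk_size])

    for inbox in inboxes:
        inbox.put(STOP)  # tell each worker there is no more data
    for t in threads:
        t.join()
    return dict(results.get() for _ in streams)
```

In a real MPI program, `recv` and `send` would be replaced by the blocking receive and send calls of whichever MPI binding is used, and the routing rule would decide the destination rank instead of a queue index.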

    An MPI program is normally launched with mpirun. For example, you write a program program.c and compile it with mpicc -o program program.c. You then run mpirun -np 20 ./program <params> to start 20 processes. You get 20 independent processes, each with its own rank; what they do from there is up to your application. How these 20 processes are distributed among nodes/processors is controlled by things like a hostfile; see the documentation for details.

    If you want your code to be reusable, i.e. runnable from another MPI program, you should at least learn what an MPI communicator is and how to create and use one. There are articles online; search for “Creating MPI library”.

    If the code using your library is not itself an MPI program, that is not a big problem: an MPI program is not limited to MPI for all of its communication; it only needs to use MPI for the communication inside its own logic. You can also launch any program with mpirun; unless it makes calls to the MPI library, it won’t notice that it is being run under MPI.
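One way non-MPI code could start the MPI part is to launch mpirun as a child process and wait for it to finish. The sketch below assumes that approach, which the answer does not spell out; `build_mpirun_command` and `run_mpi_job` are hypothetical helper names, and the `--hostfile` option is assumed to be the one accepted by Open MPI's mpirun.

```python
import subprocess


def build_mpirun_command(program, n_procs, params=(), hostfile=None):
    """Return the argument list for launching `program` under mpirun."""
    cmd = ["mpirun", "-np", str(n_procs)]
    if hostfile is not None:
        # Assumption: Open MPI's mpirun, which accepts --hostfile <path>.
        cmd += ["--hostfile", hostfile]
    cmd.append(program)
    cmd.extend(str(p) for p in params)
    return cmd


def run_mpi_job(program, n_procs, params=(), hostfile=None):
    """Launch the MPI job, wait for it to exit, and return its stdout."""
    cmd = build_mpirun_command(program, n_procs, params, hostfile)
    # check=True raises CalledProcessError if mpirun exits with an error.
    completed = subprocess.run(cmd, check=True, capture_output=True, text=True)
    return completed.stdout
```

For the example from the answer, `run_mpi_job("./program", 20, params)` would run the equivalent of mpirun -np 20 ./program <params>. Results would then have to come back through stdout, files, or some other channel chosen by the larger system.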
