The Archive Base

Editorial Team
Asked: May 27, 2026

I am a very new programmer, and I have some trouble with the examples


I am a very new programmer, and I have some trouble with the examples from Intel. I think it would be helpful if I could see how the most basic possible loop is implemented in TBB.

for (n=0 ; n < songinfo.frames; ++n) {  

         sli[n]=songin[n*2];
         sri[n]=songin[n*2+1];

}

Here is a loop I am using to de-interleave audio data. Would this loop benefit from TBB? How would you implement it?

1 Answer

  1. Editorial Team
    Added an answer on May 27, 2026 at 10:44 pm

    First of all, for the following code I assume your arrays are of type mytype*; otherwise the code needs some modifications. Furthermore, I assume that your ranges don't overlap; otherwise parallelization attempts won't work correctly (at least not without more work).

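    Since you asked for it in TBB:

    First you need to initialize the library somewhere (typically in your main). For the code, assume I put a using namespace tbb somewhere. Note that an explicit task_scheduler_init is only needed with old TBB versions: since TBB 2.2 the scheduler initializes itself automatically, and oneTBB no longer provides the class.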

    int main(int argc, char *argv[]){
       task_scheduler_init init;
       ...
    }
    

    Then you will need a functor which captures your arrays and executes the body of the for loop:

    struct apply_func {
        const mytype* songin; //whatever type you are operating on
        mytype* sli;
        mytype* sri;
        apply_func(const mytype* sin, mytype* sl, mytype* sr):songin(sin), sli(sl), sri(sr)
        {}
        //parallel_for requires the call operator to be const
        void operator()(const blocked_range<size_t>& range) const {
          for(size_t n = range.begin(); n != range.end(); ++n){
            sli[n]=songin[n*2];
            sri[n]=songin[n*2+1];
          }
        }
    };
    

    Now you can use parallel_for to parallelize this loop:

    size_t grainsize = 1000; //or whatever you decide on (testing required for best performance);
    apply_func func(songin, sli, sri);
    parallel_for(blocked_range<size_t>(0, songinfo.frames, grainsize), func);
    

    That should do it (if I remember correctly; I haven't looked at TBB in a while, so there might be small mistakes).
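The grainsize argument above bounds how finely the range gets divided. As a rough sketch of the idea (plain C++, not TBB code), the helper below keeps halving a range until each piece is no larger than grainsize. The names split_range and Chunk are hypothetical, and the real scheduler, with its default partitioner, may stop splitting earlier, so actual chunks can be larger than this suggests.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical illustration type: a half-open index range [first, second).
using Chunk = std::pair<std::size_t, std::size_t>;

// Recursively halves [begin, end) until a piece holds at most grainsize
// elements, then records that piece in out.
void split_range(std::size_t begin, std::size_t end, std::size_t grainsize,
                 std::vector<Chunk>& out) {
    assert(grainsize > 0 && begin <= end);
    if (end - begin <= grainsize) {  // small enough: one task would handle it
        out.push_back(Chunk(begin, end));
        return;
    }
    const std::size_t middle = begin + (end - begin) / 2;
    split_range(begin, middle, grainsize, out);
    split_range(middle, end, grainsize, out);
}

// Convenience overload returning the chunks in index order.
std::vector<Chunk> split_range(std::size_t begin, std::size_t end,
                               std::size_t grainsize) {
    std::vector<Chunk> out;
    split_range(begin, end, grainsize, out);
    return out;
}
```

For example, split_range(0, 10, 3) yields the four chunks [0,2), [2,5), [5,7) and [7,10). A larger grainsize means fewer, bigger chunks and therefore less scheduling overhead per element.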
    If you use C++11, you can simplify the code by using a lambda:

    size_t grainsize = 1000; //or whatever you decide on (testing required for best performance);
    parallel_for(blocked_range<size_t>(0, songinfo.frames, grainsize), 
                 [&](const blocked_range<size_t>& range){ //the parameter must be named to be used below
                    for(size_t n = range.begin(); n != range.end(); ++n){
                      sli[n]=songin[n*2];
                      sri[n]=songin[n*2+1];
                    }
                 });
    

    That being said, TBB is not exactly what I would recommend for a new programmer. I would really suggest parallelizing only code which is trivial to parallelize until you have a very firm grip on threading. For this I would suggest using OpenMP, which is quite a bit simpler to start with than TBB, while still being powerful enough to parallelize a lot of stuff (it depends on the compiler supporting it, though). For your loop it would look like the following:

    #pragma omp parallel for
    for(size_t n = 0; n < songinfo.frames; ++n) {
      sli[n]=songin[n*2];
      sri[n]=songin[n*2+1];
    }
    

    Then you have to tell your compiler to compile and link with OpenMP (-fopenmp for gcc, /openmp for Visual C++). As you can see, it is quite a bit simpler to use than TBB (for such easy use cases; more complex scenarios are a different matter), and it has the added benefit that the code still builds, as a serial loop, on platforms that support neither OpenMP nor TBB, since unknown #pragmas are ignored by the compiler. Personally I'm using OpenMP in favor of TBB for some projects, since I couldn't use its open source license and buying TBB was a bit too steep for those projects.
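As a self-contained sketch of the OpenMP variant, the function below wraps the loop so it can be compiled and tried on its own. The name deinterleave_omp, the float sample type and the use of std::vector are assumptions for illustration, not something taken from the question.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Splits interleaved stereo samples (L R L R ...) into two channel buffers.
// Assumes songin holds an even number of samples.
void deinterleave_omp(const std::vector<float>& songin,
                      std::vector<float>& sli,
                      std::vector<float>& sri) {
    assert(songin.size() % 2 == 0);
    const std::size_t count = songin.size() / 2;
    sli.resize(count);
    sri.resize(count);

    const float* in = songin.data();
    float* left = sli.data();
    float* right = sri.data();

    // A signed index keeps the loop acceptable to OpenMP 2.0 compilers too.
    const std::ptrdiff_t frames = static_cast<std::ptrdiff_t>(count);

    // If OpenMP is not enabled, the pragma is ignored and the loop runs serially.
    #pragma omp parallel for
    for (std::ptrdiff_t n = 0; n < frames; ++n) {
        left[n] = in[2 * n];
        right[n] = in[2 * n + 1];
    }
}
```

Each iteration writes to its own elements of the two output buffers, which is what makes the loop safe to split across threads without any locking.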

    Now that we have the question of how to parallelize the loop out of the way, let's get to the question of whether it's worth it. This really can't be answered easily, since it completely depends on how many elements you process and what kind of platform your program is expected to run on. Your problem is very bandwidth heavy, so I wouldn't count on too much of an increase in performance.

    • If you are only processing 1000 elements, the parallel version of the loop is very likely to be slower than the single threaded version due to overhead.
    • If your data is not in the cache (because it doesn't fit) and your system is very bandwidth starved, you might not see much of a benefit (although it's likely that you will see some benefit, just don't be surprised if the speedup is on the order of 1.x even if you use a lot of processors).
    • If your system is ccNUMA (likely for multi-socket systems), your performance might decrease regardless of the number of elements, due to additional transfer costs.
    • The compiler might miss optimizations regarding pointer aliasing (since the loop body is moved to a different function). Using __restrict (supported by gcc; Visual C++ also has a __restrict keyword) might help with that problem.
    • …
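Two of the points above, the overhead on small inputs and pointer aliasing, can be addressed directly in the OpenMP version. The sketch below uses the if clause of the parallel construct to stay serial below a cutoff and marks the pointers with __restrict. The function name, the float sample type and the cutoff value are assumptions; a sensible cutoff has to be measured on the target machine.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical cutoff: below this many frames the loop runs on one thread.
const std::ptrdiff_t kParallelThreshold = 100000;

// __restrict promises the compiler that the three buffers do not overlap,
// which is the same non-overlap assumption the parallel loop relies on.
void deinterleave_guarded(const float* __restrict songin,
                          float* __restrict sli,
                          float* __restrict sri,
                          std::ptrdiff_t frames) {
    assert(frames >= 0);
    // The if clause makes the region run serially for small inputs, avoiding
    // thread start-up overhead. Without OpenMP the pragma is simply ignored.
    #pragma omp parallel for if(frames >= kParallelThreshold)
    for (std::ptrdiff_t n = 0; n < frames; ++n) {
        sli[n] = songin[2 * n];
        sri[n] = songin[2 * n + 1];
    }
}
```

The result is the same either way; only the decision about whether to spread the work over threads changes with the input size.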

    Personally I think the situation where you are most likely to see a significant performance increase is if your system has a single multi-core CPU for which the dataset fits into the L3 cache (but not into the individual L2 caches). For bigger datasets your performance will probably increase, but not by much (and correctly using prefetching might get similar gains). Of course, this is pure speculation.
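Since all of this is speculation, the practical way to settle it is to measure on your own data and machine. Below is a minimal timing sketch using only the standard library. The helper best_time_ms and the serial baseline deinterleave_serial are hypothetical names, and the float sample type is an assumption.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

// Runs work() repeats times and returns the fastest run in milliseconds.
// Taking the minimum reduces noise from other processes.
template <typename Work>
double best_time_ms(Work work, int repeats) {
    assert(repeats > 0);
    double best = -1.0;
    for (int i = 0; i < repeats; ++i) {
        const auto start = std::chrono::steady_clock::now();
        work();
        const auto stop = std::chrono::steady_clock::now();
        const double ms =
            std::chrono::duration<double, std::milli>(stop - start).count();
        if (best < 0.0 || ms < best) best = ms;
    }
    return best;
}

// Single-threaded baseline to compare a parallel version against.
void deinterleave_serial(const std::vector<float>& songin,
                         std::vector<float>& sli,
                         std::vector<float>& sri) {
    const std::size_t frames = songin.size() / 2;
    sli.resize(frames);
    sri.resize(frames);
    for (std::size_t n = 0; n < frames; ++n) {
        sli[n] = songin[2 * n];
        sri[n] = songin[2 * n + 1];
    }
}
```

A call such as best_time_ms([&] { deinterleave_serial(in, l, r); }, 5) gives the baseline; timing the parallel variant the same way, with realistic buffer sizes, shows whether the extra threads pay off.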
