The Archive Base

Editorial Team
Asked: May 19, 2026

The question may seem vague, but let me explain it.

Suppose we have a function f(x, y, z, …) and we need to find its value at the point (x1, y1, z1, …).

The most trivial approach is to just replace (x, y, z, …) with (x1, y1, z1, …).

Now suppose that the function takes a long time to evaluate and I want to parallelize the evaluation. Obviously this will depend on the nature of the function, too.

So my question is: what constraints do I have to look for when thinking about how to parallelize f(x, y, z, …)?

If possible, please share links to study.

1 Answer

Editorial Team, added an answer on May 19, 2026 at 3:39 am

    Asking the question in such a general way does not permit very specific advice to be given.

    I’d begin the analysis by looking for ways to evaluate or rewrite the function using groups of variables that interact closely, creating intermediate expressions that can be used to make the final evaluation. You may find a way to do this involving a hierarchy of subexpressions that leads from the variables themselves to the final function.
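    To make the idea of a hierarchy of subexpressions concrete, here is a minimal sketch (my own illustration, not taken from any particular library) that combines the same values once as a serial chain and once as a balanced tree, and reports the depth of each evaluation. The names serial_reduce and tree_reduce are hypothetical, and the operation is assumed to be associative.

```python
from operator import add

def serial_reduce(values, op=add):
    """Left-to-right chain: n-1 operations and a depth of n-1."""
    acc = values[0]
    depth = 0
    for v in values[1:]:
        acc = op(acc, v)
        depth += 1
    return acc, depth

def tree_reduce(values, op=add):
    """Balanced tree: still n-1 operations, but depth ceil(log2(n))."""
    level = list(values)
    depth = 0
    while len(level) > 1:
        # Every pair on this level is independent of the others, so a
        # parallel machine could compute them all in one time step.
        nxt = [op(level[i], level[i + 1]) for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:
            nxt.append(level[-1])  # odd element is carried up unchanged
        level = nxt
        depth += 1
    return level[0], depth

if __name__ == "__main__":
    data = list(range(1, 17))
    print(serial_reduce(data))  # value and depth of the serial chain
    print(tree_reduce(data))    # same value, much smaller depth
```

    The depth is only a proxy for parallel running time; it ignores communication and scheduling costs.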

    In general the shorter and wider such an evaluation tree is, the greater the degree of parallelism. There are two cautionary notes to keep in mind that detract from “more parallelism is better.”

    For one thing, a highly parallel approach may actually involve more total computation than your original “serial” approach. In fact some loss of efficiency in this regard is to be expected, since a serial approach can take advantage of all prior subexpression evaluations and maximize their reuse.
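    One way to see the extra work is to compare Horner's rule with Estrin's scheme for a univariate polynomial. Estrin's scheme is my choice of illustration here, not something the discussion above depends on. The sketch counts multiplications: Horner uses the fewest but is a strictly serial chain, while Estrin spends a few more (the repeated squarings of x) to get a tree of logarithmic depth.

```python
def horner(coeffs, x):
    """coeffs[i] multiplies x**i. Returns (value, multiplications)."""
    acc = coeffs[-1]
    mults = 0
    for c in reversed(coeffs[:-1]):
        acc = acc * x + c  # each step needs the previous one: serial chain
        mults += 1
    return acc, mults

def estrin(coeffs, x):
    """Estrin's scheme. Returns (value, multiplications)."""
    level = list(coeffs)
    power = x
    mults = 0
    while len(level) > 1:
        if len(level) % 2:
            level.append(0)  # pad so that every entry has a partner
        # c0 + c1*p, c2 + c3*p, ... are independent of one another
        level = [level[i] + level[i + 1] * power
                 for i in range(0, len(level), 2)]
        mults += len(level)
        if len(level) > 1:
            power = power * power  # extra work that Horner never does
            mults += 1
    return level[0], mults

if __name__ == "__main__":
    coeffs = [1, 2, 3, 4, 5, 6, 7, 8]  # 1 + 2x + ... + 8x^7
    print(horner(coeffs, 2))
    print(estrin(coeffs, 2))
```

    For this degree-7 example Horner needs 7 multiplications in 7 dependent steps, whereas Estrin needs 9 multiplications arranged in 3 levels.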

    For another thing, the parallel evaluation will often have worse rounding/accuracy behavior than a serial evaluation chosen to give good or optimal error estimates.
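    The underlying reason is that floating-point addition is not associative, so the grouping imposed by a parallel tree changes the rounded result. The following sketch uses a deliberately extreme input of my own choosing; serial_sum and tree_sum are hypothetical helper names, and math.fsum serves as the correctly rounded reference.

```python
import math

def serial_sum(values):
    """Plain left-to-right floating-point sum."""
    total = 0.0
    for v in values:
        total += v
    return total

def tree_sum(values):
    """Sum with the grouping a balanced parallel tree would impose."""
    if len(values) == 1:
        return values[0]
    mid = len(values) // 2
    return tree_sum(values[:mid]) + tree_sum(values[mid:])

if __name__ == "__main__":
    values = [1e16, 1.0, -1e16, 1.0]
    print(serial_sum(values))  # one grouping
    print(tree_sum(values))    # another grouping, different rounding
    print(math.fsum(values))   # correctly rounded reference
```

    Neither grouping is right for this input, and on other inputs the tree grouping can be the more accurate one. The point is only that the shape of the evaluation tree, which parallelism tends to dictate, is also what determines the rounding error.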

    A lot of work has been done on evaluations that involve matrices, where there is usually a lot of symmetry to how the function value depends on its arguments. So it helps to be familiar with numerical linear algebra and parallel algorithms that have been developed there.

    Another area where a lot is known is the evaluation of multivariate polynomial and rational functions.

    When the function is transcendental, one might hope for some transformation or refactoring that makes the dependence more tractable (algebraic).

    Not directly relevant to your question are algorithms that amortize the cost of computing function values across a number of arguments. For example in computing solutions to ordinary differential equations, there may be “multi-step” methods that share the cost of evaluating derivatives at intermediate points by reusing those values several times.
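    As a small illustration of that kind of reuse, here is a sketch of the two-step Adams-Bashforth method, which I picked as a simple representative of multi-step methods. Each step needs the derivative at the current and the previous point, but only the current one is newly computed. The function name adams_bashforth2 and the single Euler step used to get started are my own choices.

```python
import math

def adams_bashforth2(f, t0, y0, h, steps):
    """Integrate y' = f(t, y) over `steps` steps of size h (steps >= 1).

    Returns (approximate y at t0 + steps*h, number of calls to f).
    """
    calls = 0
    f_prev = f(t0, y0)
    calls += 1
    # Bootstrap: a two-step method needs two starting values, so the
    # first step is a plain Euler step (an assumption of this sketch).
    y = y0 + h * f_prev
    t = t0 + h
    for _ in range(steps - 1):
        f_curr = f(t, y)  # the only new derivative evaluation per step
        calls += 1
        y = y + h * (1.5 * f_curr - 0.5 * f_prev)  # reuses f_prev
        t += h
        f_prev = f_curr
    return y, calls

if __name__ == "__main__":
    y, calls = adams_bashforth2(lambda t, y: -y, 0.0, 1.0, 0.01, 100)
    print(y, math.exp(-1.0), calls)
```

    A second-order one-step method would typically evaluate the derivative twice per step; here the second value comes for free from the previous step.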

    Your concern with speeding up the evaluation of the function suggests that you plan to perform more than one evaluation. So you might think about ways to take advantage of prior evaluations, or to perform evaluations at related arguments, in a way that contributes to your search for parallelism.

    Added: Some links and discussion of search strategy

    Most authors use the phrase “parallel function evaluation” to
    mean evaluating the same function at multiple argument points.

    See for example:

    [Coarse Grained Parallel Function Evaluation — Rulon and Youssef]
    http://cdsweb.cern.ch/record/401028/files/p837.pdf
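    For contrast with the fine-grained case, this coarse-grained sense of the phrase can be sketched in a few lines with Python's standard concurrent.futures module. The function f below is a hypothetical stand-in for an expensive function, and evaluate_many is a name I made up.

```python
from concurrent.futures import ThreadPoolExecutor

def f(x, y, z):
    """Hypothetical stand-in for an expensive function."""
    return x * x + y * y + z * z

def evaluate_many(func, points, max_workers=4):
    """Evaluate func at every point; results keep the input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Each evaluation is an independent task; the function itself
        # is still evaluated serially inside each task.
        return list(pool.map(lambda p: func(*p), points))

if __name__ == "__main__":
    print(evaluate_many(f, [(1, 2, 3), (0, 0, 0), (1, 1, 1)]))
```

    In CPython, threads will not speed up a CPU-bound pure-Python function; a process pool (with a picklable top-level function instead of the lambda) would be the usual substitute. Either way, nothing inside a single evaluation of f is parallelized, which is why this material is not what the question asks about.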

    A search strategy to find the kind of material Gaurav Kalra asks
    about should try to avoid those. For example, we might include
    “fine-grained” in our search terms.

    It’s also effective to focus on specific kinds of functions, e.g.
    “polynomial evaluation” rather than “function evaluation”.

    Here for example we have a treatment of some well-known techniques
    for “fast” evaluations applied to design for GPU-based computation:

    [How to obtain efficient GPU kernels — Cruz, Layton, and Barba]
    http://arxiv.org/PS_cache/arxiv/pdf/1009/1009.3457v1.pdf

    (from their Abstract) “Here, we have tackled fast summation
    algorithms (fast multipole method and fast Gauss transform),
    and applied algorithmic redesign for attaining performance on
    GPUs. The progression of performance improvements attained
    illustrates the exercise of formulating algorithms for the
    massively parallel architecture of the GPU.”

    Another search term that might (or might not) be worth excluding
    is “pipelined”. Material using this term invariably discusses the
    sort of parallelism that can be used when multiple function
    evaluations are to be done: early stages of the computation can
    be done in parallel with later stages, but on different inputs.
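    To show what pipelining means in this setting, here is a minimal sketch using one thread per stage and standard-library queues. The function run_pipeline and the two toy stages are hypothetical. While a later stage works on one input, an earlier stage can already work on the next.

```python
import queue
import threading

def run_pipeline(stages, inputs):
    """Push each input through all stages, one worker thread per stage."""
    done = object()  # sentinel marking the end of the input stream
    qs = [queue.Queue() for _ in range(len(stages) + 1)]

    def worker(stage, q_in, q_out):
        while True:
            item = q_in.get()
            if item is done:
                q_out.put(done)  # pass the sentinel downstream and stop
                return
            q_out.put(stage(item))

    threads = [threading.Thread(target=worker, args=(s, qs[i], qs[i + 1]))
               for i, s in enumerate(stages)]
    for t in threads:
        t.start()
    for x in inputs:
        qs[0].put(x)
    qs[0].put(done)

    results = []
    while True:
        item = qs[-1].get()
        if item is done:
            break
        results.append(item)
    for t in threads:
        t.join()
    return results

if __name__ == "__main__":
    print(run_pipeline([lambda x: x + 1, lambda x: x * 2], [1, 2, 3]))
```

    A single evaluation still passes through every stage in sequence, so its latency is not reduced; only the throughput over many inputs improves.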

    Here’s a paper that discusses n-fold speedup for n-variate
    polynomial evaluation over finite fields GF(p). This might be
    of direct interest for cryptographic applications, but the
    approach via modified Horner’s method may be interesting for
    its potential for generalization:

    [Comparison of Bit and Word Level Algorithms for Evaluating
    Unstructured Functions over Finite Rings — Sunar and Cyganski]
    http://www.iacr.org/archive/ches2005/018.pdf

    “We present a modification to Horner’s algorithm for evaluating
    arbitrary n-variate functions defined over finite rings and fields.
    … If the domain is a finite field GF(p) the complexity of
    multivariate Horner polynomial evaluation is improved from O(p^n)
    to O((p^n)/(2n)). We prove the optimality of the presented algorithm.”
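    For orientation, here is a sketch of the classic nested Horner scheme for a bivariate polynomial over GF(p). It is not the modified algorithm from the paper, only the baseline that such modifications start from. The names horner_mod and horner2_mod and the dense coefficient layout are my own assumptions.

```python
def horner_mod(coeffs, x, p):
    """Evaluate sum(coeffs[i] * x**i) modulo p by Horner's rule."""
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * x + c) % p
    return acc

def horner2_mod(coeffs, x, y, p):
    """coeffs[i][j] is the coefficient of x**i * y**j; result is mod p."""
    # The inner Horner passes (one per row, in the variable y) do not
    # depend on each other, so they could run in parallel.
    rows = [horner_mod(row, y, p) for row in coeffs]
    # The outer pass in x combines the row values.
    return horner_mod(rows, x, p)

if __name__ == "__main__":
    # 1 + 2y + 3x + 4xy evaluated at x = 2, y = 3 over GF(7)
    print(horner2_mod([[1, 2], [3, 4]], 2, 3, 7))
```

    The same nesting extends to n variables by recursing on one variable at a time.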

    Multivariate rational functions can be considered simply as the
    ratio of two such polynomial functions. The special case of
    univariate rational functions, which can be particularly
    effective in approximating elementary transcendental functions
    and others, can be evaluated via finite (resp. truncated)
    continued fractions, whose convergents (partial numerators
    and denominators) can be defined recursively.
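    The recursive definition can be sketched as follows for a general continued fraction b0 + a1/(b1 + a2/(b2 + ...)), using the standard three-term recurrence A_k = b_k*A_(k-1) + a_k*A_(k-2), and likewise for B_k. The function name convergents and the argument layout are my own.

```python
from fractions import Fraction

def convergents(b, a):
    """Convergents A_k/B_k of b[0] + a[0]/(b[1] + a[1]/(b[2] + ...)).

    b holds b0..bn and a holds a1..an (so len(a) == len(b) - 1).
    """
    A_prev, A = 1, b[0]  # A_{-1} = 1, A_0 = b0
    B_prev, B = 0, 1     # B_{-1} = 0, B_0 = 1
    out = [Fraction(A, B)]
    for a_k, b_k in zip(a, b[1:]):
        # Each step depends on the two previous ones: a serial chain.
        A_prev, A = A, b_k * A + a_k * A_prev
        B_prev, B = B, b_k * B + a_k * B_prev
        out.append(Fraction(A, B))
    return out

if __name__ == "__main__":
    # sqrt(2) = 1 + 1/(2 + 1/(2 + 1/(2 + ...)))
    print(convergents([1, 2, 2, 2], [1, 1, 1]))
```

    Written this way the evaluation looks inherently serial.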

    The topic of continued fraction evaluations allows us to segue
    to a final link that connects that topic with some familiar
    parallelism of numerical linear algebra:

    [LU Factorization and Parallel Evaluation of Continued Fractions
    — Ömer Egecioglu]
    http://www.cs.ucsb.edu/~omer/DOWNLOADABLE/lu-cf98.pdf

    “The first n convergents of a general continued fraction
    (CF) can be computed optimally in logarithmic parallel
    time using O(n/log(n)) processors.”
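    A simple way to see why logarithmic depth is plausible: the convergent recurrence is a product of 2x2 matrices, and matrix multiplication is associative, so the product can be formed as a balanced tree. The sketch below uses that standard matrix formulation and computes only the last convergent. It is not the LU-based algorithm of the paper, and the names matmul2, tree_product and cf_value are hypothetical.

```python
from fractions import Fraction

def matmul2(m, n):
    """Product of two 2x2 matrices given as nested tuples."""
    (a, b), (c, d) = m
    (e, f), (g, h) = n
    return ((a * e + b * g, a * f + b * h),
            (c * e + d * g, c * f + d * h))

def tree_product(mats):
    """Multiply matrices in order, grouped as a balanced tree."""
    if len(mats) == 1:
        return mats[0]
    mid = len(mats) // 2
    # The two halves are independent and could be computed in parallel.
    return matmul2(tree_product(mats[:mid]), tree_product(mats[mid:]))

def cf_value(b, a):
    """Last convergent of b[0] + a[0]/(b[1] + a[1]/(b[2] + ...))."""
    mats = [((b[0], 1), (1, 0))]
    mats += [((b_k, 1), (a_k, 0)) for a_k, b_k in zip(a, b[1:])]
    (A, _), (B, _) = tree_product(mats)
    return Fraction(A, B)

if __name__ == "__main__":
    print(cf_value([1, 2, 2, 2], [1, 1, 1]))
```

    Obtaining all n convergents rather than just the last one is a parallel prefix computation over the same matrices, which is closer to the setting of the quoted result.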
