The Archive Base

Editorial Team
Asked: May 26, 2026

To put the question another way: if one were to try to reimplement OpenGL or DirectX (or an analogue) using GPGPU (CUDA, OpenCL), where and why would it be slower than the stock implementations on NVIDIA and AMD cards?

I can see how vertex/fragment/geometry/tessellation shaders could be made nice and fast using GPGPU, but what about things like generating the list of fragments to be rendered, clipping, texture sampling, and so on?

I’m asking purely for academic interest.

1 Answer

  1. Editorial Team
    Added an answer on May 26, 2026 at 12:25 pm

    Modern GPUs still have a lot of fixed-function hardware that is hidden from the compute APIs. This includes the blending stages, triangle rasterization, and many on-chip queues. The shaders, of course, all map well to CUDA/OpenCL; after all, shaders and the compute languages use the same part of the GPU, the general-purpose shader cores. Think of those units as a bunch of very wide SIMD CPUs (for instance, a GTX 580 has 16 cores, each with a 32-wide SIMD unit).
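To make the blending stage concrete, here is a minimal CPU-side sketch in Python of one common blend setup (source alpha for the color, one-minus-source-alpha for the destination). The function name `blend_over` and the sample colors are illustrative assumptions, not taken from any API. It also shows that this blend is order-dependent, which is one reason a compute reimplementation has to preserve primitive order per pixel, something the graphics pipeline guarantees for you.

```python
def blend_over(src, dst):
    """Blend src over dst. Colors are (r, g, b, a), non-premultiplied, in [0, 1].

    Color uses src_alpha / one_minus_src_alpha; alpha uses one / one_minus_src_alpha.
    This is an illustrative setup, not the only blend mode the hardware supports.
    """
    sr, sg, sb, sa = src
    dr, dg, db, da = dst
    inv = 1.0 - sa
    return (sr * sa + dr * inv,
            sg * sa + dg * inv,
            sb * sa + db * inv,
            sa + da * inv)


# Hypothetical fragments landing on the same pixel.
background = (0.0, 0.0, 0.0, 1.0)   # opaque black
red_half = (1.0, 0.0, 0.0, 0.5)     # 50% transparent red
blue_half = (0.0, 0.0, 1.0, 0.5)    # 50% transparent blue

red_then_blue = blend_over(blue_half, blend_over(red_half, background))
blue_then_red = blend_over(red_half, blend_over(blue_half, background))

print(red_then_blue)  # (0.25, 0.0, 0.5, 1.0)
print(blue_then_red)  # (0.5, 0.0, 0.25, 1.0)
```

The two results differ, so fragments covering the same pixel cannot simply be blended in whatever order the threads happen to finish.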

    You still get access to the texture units via shaders, though, so there’s no need to implement texture sampling in “compute”. If you did, performance would most likely suffer, because you would lose the texture caches, which are optimized for spatially local access.
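As an illustration of what "spatially local" can mean, the sketch below compares a plain row-major texel layout with a Morton (Z-order) layout. This is an assumption for illustration only: actual GPU texture layouts are vendor-specific and not described in this answer, and Morton order is just a common textbook stand-in. The helper names are hypothetical.

```python
def part1by1(n):
    """Spread the low 16 bits of n so a zero bit sits between each original bit."""
    n &= 0x0000FFFF
    n = (n | (n << 8)) & 0x00FF00FF
    n = (n | (n << 4)) & 0x0F0F0F0F
    n = (n | (n << 2)) & 0x33333333
    n = (n | (n << 1)) & 0x55555555
    return n


def morton_index(x, y):
    """Interleave the bits of x and y (x in the even bit positions)."""
    return part1by1(x) | (part1by1(y) << 1)


def linear_index(x, y, width):
    """Plain row-major texel index."""
    return y * width + x


def footprint_spread(index_fn, x, y):
    """Distance between the lowest and highest index of a 2x2 texel footprint."""
    idx = [index_fn(x + dx, y + dy) for dy in (0, 1) for dx in (0, 1)]
    return max(idx) - min(idx)


width = 1024  # hypothetical texture width
row_major = footprint_spread(lambda x, y: linear_index(x, y, width), 4, 6)
z_order = footprint_spread(morton_index, 4, 6)
print(row_major, z_order)  # 1025 3
```

For this aligned 2x2 footprint, the four texels a bilinear fetch would touch are over a thousand elements apart in row-major order but adjacent in Z-order. Footprints that straddle a block boundary spread further, so this shows the tendency rather than a guarantee.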

    Don’t underestimate the amount of work required for rasterization. This is a major problem: even if you throw the whole GPU at it, you get roughly 25% of the performance of the raster hardware (see: High-Performance Software Rasterization on GPUs). That figure includes the blending costs, which are usually also handled by fixed-function units.
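For a sense of the per-pixel work involved, here is a minimal sketch of the classic edge-function coverage test, written as plain Python on the CPU. It is not the algorithm from the cited paper and not how the hardware is built. It omits fill rules, clipping, sub-pixel precision, and attribute interpolation, and the function names are illustrative.

```python
import math


def edge(a, b, p):
    """Signed area term: positive when p lies to the left of the directed edge a->b."""
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])


def rasterize_triangle(v0, v1, v2, width, height):
    """Return the pixels whose centers fall inside a counter-clockwise triangle.

    Simplification: pixels exactly on an edge count as covered (no top-left rule).
    """
    # Bounding box of the triangle, clamped to the framebuffer.
    min_x = max(math.floor(min(v0[0], v1[0], v2[0])), 0)
    max_x = min(math.ceil(max(v0[0], v1[0], v2[0])), width - 1)
    min_y = max(math.floor(min(v0[1], v1[1], v2[1])), 0)
    max_y = min(math.ceil(max(v0[1], v1[1], v2[1])), height - 1)

    covered = []
    for y in range(min_y, max_y + 1):
        for x in range(min_x, max_x + 1):
            p = (x + 0.5, y + 0.5)  # sample at the pixel center
            if edge(v1, v2, p) >= 0 and edge(v2, v0, p) >= 0 and edge(v0, v1, p) >= 0:
                covered.append((x, y))
    return covered


pixels = rasterize_triangle((0, 0), (4, 0), (0, 4), width=4, height=4)
print(len(pixels))  # 10
```

On a GPU the two loops would be spread across threads, but the work of binning triangles to tiles, testing coverage, and keeping results ordered per pixel still has to be done in software.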

    Tessellation also has a fixed-function part that is difficult to emulate efficiently, because it amplifies the input by up to 1:4096, and you surely don’t want to reserve that much memory up front.
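A quick back-of-the-envelope calculation shows why reserving for the worst case is unattractive. Only the 1:4096 ratio comes from the text above. The patch count and the 32 bytes per output element are hypothetical numbers chosen for illustration.

```python
def worst_case_bytes(num_patches, amplification=4096, bytes_per_output=32):
    """Memory needed if every patch were amplified by the maximum ratio."""
    return num_patches * amplification * bytes_per_output


def to_mib(num_bytes):
    """Convert bytes to mebibytes."""
    return num_bytes / (1024 * 1024)


patches = 10_000  # hypothetical number of input patches in one draw
total = worst_case_bytes(patches)
print(f"{to_mib(total):.1f} MiB")  # 1250.0 MiB
```

Under these assumptions a single draw would need over a gigabyte reserved up front, even though most patches would use only a fraction of it.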

    Next, you take a significant performance penalty because you don’t have access to framebuffer compression; again, there is dedicated hardware for this, which is “hidden” from you in compute-only mode. Finally, because you don’t have any on-chip queues, it will be difficult to reach the same utilization as the “graphics pipeline” (for instance, it can easily buffer output from vertex shaders depending on shader load, whereas you can’t switch shaders that flexibly).

© 2021 The Archive Base. All Rights Reserved