Sign Up

Sign Up to our social questions and Answers Engine to ask questions, answer people’s questions, and connect with other people.

Have an account? Sign In

Have an account? Sign In Now

Sign In

Login to our social questions & Answers Engine to ask questions answer people’s questions & connect with other people.

Sign Up Here

Forgot Password?

Don't have account, Sign Up Here

Forgot Password

Lost your password? Please enter your email address. You will receive a link and will create a new password via email.

Have an account? Sign In Now

You must login to ask a question.

Forgot Password?

Need An Account, Sign Up Here

Please briefly explain why you feel this question should be reported.

Please briefly explain why you feel this answer should be reported.

Please briefly explain why you feel this user should be reported.

Sign InSign Up

The Archive Base

The Archive Base Logo The Archive Base Logo

The Archive Base Navigation

  • Home
  • SEARCH
  • About Us
  • Blog
  • Contact Us
Search
Ask A Question

Mobile menu

Close
Ask a Question
  • Home
  • Add group
  • Groups page
  • Feed
  • User Profile
  • Communities
  • Questions
    • New Questions
    • Trending Questions
    • Must read Questions
    • Hot Questions
  • Polls
  • Tags
  • Badges
  • Buy Points
  • Users
  • Help
  • Buy Theme
  • SEARCH
Home/ Questions/Q 8446227
In Process

The Archive Base Latest Questions

Editorial Team
  • 0
Editorial Team
Asked: June 10, 20262026-06-10T09:52:26+00:00 2026-06-10T09:52:26+00:00

I am trying to write an OpenGL wrapper that will allow me to use

  • 0

I am trying to write an OpenGL wrapper that will allow me to use all of my existing graphics code (written for OpenGL) and will route the OpenGL calls to Direct3D equivalents. This has worked surprisingly well so far, except performance is turning out to be quite a problem.

Now, I admit I am most likely using D3D in a way it was never designed. I am updating a single vertex buffer thousands of times per render loop. Every time I draw a “sprite” I send 4 vertices to the GPU with texture coordinates, etc and when the number of “sprites” on the screen at one time gets to around 1k to 1.5k, then the FPS of my app drops to below 10fps.

Using the VS2012 Performance Analysis (which is awesome, btw), I can see that the ID3D11DeviceContext->Draw method is taking up the bulk of the time:
Screenshot Here

Is there some setting I’m not using correctly while setting up my vertex buffer, or during the draw method? Is it really, really bad to be using the same vertex buffer for all of my sprites? If so, what other options do I have that wouldn’t drastically alter the architecture of my existing graphics code base (which are built around the OpenGL paradigm…send EVERYTHING to the GPU every frame!)

The biggest FPS killer in my game is when I’m displaying a lot of text on the screen. Each character is a textured quad, and each one requires a separate update to the vertex buffer and a separate call to Draw. If D3D or hardware doesn’t like many calls to Draw, then how else can you draw a lot of text to the screen at one time?

Let me know if there is any more code you’d like to see to help me diagnose this problem.

Thanks!

Here’s the hardware I’m running on:

  • Core i7 @ 3.5GHz
  • 16 gigs of RAM
  • GeForce GTX 560 Ti

And here’s the software I’m running:

  • Windows 8 Release Preview
  • VS 2012
  • DirectX 11

Here is the draw method:

void OpenGL::Draw(const std::vector<OpenGLVertex>& vertices)
{
   auto matrix = *_matrices.top();
   _constantBufferData.view = DirectX::XMMatrixTranspose(matrix);
   _context->UpdateSubresource(_constantBuffer, 0, NULL, &_constantBufferData, 0, 0);

   _context->IASetInputLayout(_inputLayout);
   _context->VSSetShader(_vertexShader, nullptr, 0);
   _context->VSSetConstantBuffers(0, 1, &_constantBuffer);

   D3D11_PRIMITIVE_TOPOLOGY topology = D3D11_PRIMITIVE_TOPOLOGY_TRIANGLESTRIP;
   ID3D11ShaderResourceView* texture = _textures[_currentTextureId];

   // Set shader texture resource in the pixel shader.
   _context->PSSetShader(_pixelShaderTexture, nullptr, 0);
   _context->PSSetShaderResources(0, 1, &texture);

   D3D11_MAPPED_SUBRESOURCE mappedResource;
   D3D11_MAP mapType = D3D11_MAP::D3D11_MAP_WRITE_DISCARD;
   auto hr = _context->Map(_vertexBuffer, 0, mapType, 0, &mappedResource);
   if (SUCCEEDED(hr))
   {
      OpenGLVertex *pData = reinterpret_cast<OpenGLVertex *>(mappedResource.pData);
      memcpy(&(pData[_currentVertex]), &vertices[0], sizeof(OpenGLVertex) * vertices.size());
      _context->Unmap(_vertexBuffer, 0);
   }

   UINT stride = sizeof(OpenGLVertex);
   UINT offset = 0;
   _context->IASetVertexBuffers(0, 1, &_vertexBuffer, &stride, &offset);
   _context->IASetPrimitiveTopology(topology);
   _context->Draw(vertices.size(), _currentVertex);
   _currentVertex += (int)vertices.size();
}

And here is the method that creates the vertex buffer:

void OpenGL::CreateVertexBuffer()
{
   D3D11_BUFFER_DESC bd;
   ZeroMemory(&bd, sizeof(bd));
   bd.Usage = D3D11_USAGE_DYNAMIC;
   bd.ByteWidth = _maxVertices * sizeof(OpenGLVertex);
   bd.BindFlags = D3D11_BIND_VERTEX_BUFFER;
   bd.CPUAccessFlags = D3D11_CPU_ACCESS_FLAG::D3D11_CPU_ACCESS_WRITE;
   bd.MiscFlags = 0;
   bd.StructureByteStride = 0;
   D3D11_SUBRESOURCE_DATA initData;
   ZeroMemory(&initData, sizeof(initData));
   _device->CreateBuffer(&bd, NULL, &_vertexBuffer);
}

Here is my vertex shader code:

cbuffer ModelViewProjectionConstantBuffer : register(b0)
{
    matrix model;
    matrix view;
    matrix projection;
};

struct VertexShaderInput
{
    float3 pos : POSITION;
    float4 color : COLOR0;
    float2 tex : TEXCOORD0;
};

struct VertexShaderOutput
{
    float4 pos : SV_POSITION;
    float4 color : COLOR0;
    float2 tex : TEXCOORD0;
};

VertexShaderOutput main(VertexShaderInput input)
{
    VertexShaderOutput output;
    float4 pos = float4(input.pos, 1.0f);

    // Transform the vertex position into projected space.
    pos = mul(pos, model);
    pos = mul(pos, view);
    pos = mul(pos, projection);
    output.pos = pos;

    // Pass through the color without modification.
    output.color = input.color;
    output.tex = input.tex;

    return output;
}
  • 1 1 Answer
  • 0 Views
  • 0 Followers
  • 0
Share
  • Facebook
  • Report

Leave an answer
Cancel reply

You must login to add an answer.

Forgot Password?

Need An Account, Sign Up Here

1 Answer

  • Voted
  • Oldest
  • Recent
  • Random
  1. Editorial Team
    Editorial Team
    2026-06-10T09:52:27+00:00Added an answer on June 10, 2026 at 9:52 am

    What you need to do is batch vertexes as aggressively as possible, then draw in large chunks. I’ve had very good luck retrofitting this into old immediate-mode OpenGL games. Unfortunately, it’s kind of a pain to do.

    The simplest conceptual solution is to use some sort of device state (which you’re probably tracking already) to create a unique stamp for a particular set of vertexes. Something like blend modes and bound textures is a good set. If you can find a fast hashing algorithm to run on the struct that’s in, you can store it pretty efficiently.

    Next, you need to do the vertex caching. There are two ways to handle that, both with advantages. The most aggressive, most complicated, and in the case of many sets of vertexes with similar properties, most efficient is to make a struct of device states, allocate a large (say 4KB) buffer, and proceed to store vertexes with matching states in that array. You can then dump the entire array into a vertex buffer at the end of the frame, and draw chunks of the buffer (to recreate original order). Keeping track of all the buffer and state and order is difficult, however.

    The simpler method, which can provide a good bit of caching under good circumstances, is to cache vertexes in a large buffer until device state changes. At that point, prior to actually changing state, dump the array into a vertex buffer and draw. Then reset the array index, commit state changes, and go again.

    If your application has large numbers of similar vertexes, which is very possible working with sprites (texture coordinates and colors may change, but good sprites will use a single texture atlas and few blending modes), even the second method can give some performance boosts.

    The trick here is to build up a cache in system memory, preferably a large chunk of pre-allocated memory, then dump it to video memory just prior to drawing. This allows you to perform far fewer writes to video memory and draw calls, which tend to be expensive (especially together). As you’ve seen, the number of calls you make gets to be slow, and batching stands a good chance of helping with that. The trick is to not allocate memory each frame if you can help it, batch large enough chunks to be worthwhile, and maintain correct device state and order for each draw.

    • 0
    • Reply
    • Share
      Share
      • Share on Facebook
      • Share on Twitter
      • Share on LinkedIn
      • Share on WhatsApp
      • Report

Sidebar

Related Questions

Trying to write app for service technicians that will display open service calls within
I m trying write code that after reset set up rrpmax as 3000. It
Trying to write a couple of functions that will encrypt or decrypt a file
Trying to write a code at the moment that basically tests to see if
I am trying to write an optimized code that renders a 3D scene using
I'm trying to use the transform feedback functionality of OpenGL. I've written a minimalistic
Trying to write .ply parser to use .ply models in OpenGL. Trying to begin
I'm trying to write an OpenGL program to manipulate the camera. However the code
I'm trying to use Visual Studio Express 2010 to write an openGL program, so
I'm trying to write a simple app that will do a screen capture of

Explore

  • Home
  • Add group
  • Groups page
  • Communities
  • Questions
    • New Questions
    • Trending Questions
    • Must read Questions
    • Hot Questions
  • Polls
  • Tags
  • Badges
  • Users
  • Help
  • SEARCH

Footer

© 2021 The Archive Base. All Rights Reserved
With Love by The Archive Base

Insert/edit link

Enter the destination URL

Or link to existing content

    No search term specified. Showing recent items. Search or use up and down arrow keys to select an item.