I am writing an image processing program and am trying to implement GPGPU with OpenCL using OpenCLTemplate. After working through your tutorials I have figured out how to get my code to run, but I am having issues with larger images. I worked around this by splitting the image into chunks and running each chunk through my code, but instead of splitting into a fixed number of chunks I want the program to determine how much memory is needed and split the image into however many pieces are required.
The problem I have run into is that I don’t know exactly how much memory is being sent to the GPU or how to figure it out. Below is the code I am using. Could you explain how the memory is being handled here, or give some advice on where to look?
I’ve looked through the OpenCLTemplate documentation to no avail and no longer know where to look.
CLCalc.Program.Compile(openCLInvert);
CLCalc.Program.Kernel kernel = new CLCalc.Program.Kernel("Filter");
CLCalc.Program.Variable CLData = new CLCalc.Program.Variable(Data);
float[] imgProcessed = new float[Data.Length];
CLCalc.Program.Variable CLFiltered = new CLCalc.Program.Variable(imgProcessed);
CLCalc.Program.Variable[] args = new CLCalc.Program.Variable[] { CLData, CLFiltered };
int[] test = new int[] { imageData.Width, imageData.Height };
float size = 0;
for (int x = 0; x <= 1; x++)
{
    size += args[x].Size;
}
kernel.Execute(args, test);
CLCalc.Program.Sync();
As the code above shows, I can find the size of each argument, but I still don’t know what the total memory usage is.
The amount of device memory used by your application is the size of the Data variable plus the size of the imgProcessed variable, so the total is simply the sum of those two buffer sizes in bytes.
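As a minimal sketch of that sum, assuming Data is a float[] like imgProcessed (the post does not show its declaration). The class and method names are made up for illustration and are not part of OpenCLTemplate:

```csharp
using System;

static class DeviceMemoryEstimate
{
    // Sum of the byte sizes of the input and output float arrays,
    // i.e. an estimate of what the two device buffers occupy.
    // Assumption: both arrays hold 4-byte floats.
    public static long TotalBytes(float[] data, float[] processed)
    {
        return (long)data.Length * sizeof(float)
             + (long)processed.Length * sizeof(float);
    }
}
```

Calling DeviceMemoryEstimate.TotalBytes(Data, imgProcessed) before creating the variables gives the estimate. For example, with one float per pixel a 4096 x 4096 image comes to 2 x 16,777,216 x 4 = 134,217,728 bytes (128 MiB).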
This is because all of your data is allocated on the device when you make calls like new CLCalc.Program.Variable(Data).
In this case, the entire Data array is being written to your device memory.
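Since both buffers sit on the device in full, the number of chunks can be worked out from a memory budget instead of being fixed. The following is only a rough sketch under a few assumptions: one float per pixel in each buffer, chunks made of whole image rows, and a budgetBytes value you pick yourself (for instance some fraction of what the device reports for CL_DEVICE_MAX_MEM_ALLOC_SIZE or CL_DEVICE_GLOBAL_MEM_SIZE; how you query that is not shown here). ChunkPlanner and its methods are hypothetical names, not OpenCLTemplate API:

```csharp
using System;

static class ChunkPlanner
{
    // How many image rows fit in one chunk when every pixel needs one float
    // in the input buffer and one float in the output buffer.
    public static int RowsPerChunk(int width, int height, long budgetBytes)
    {
        if (width <= 0 || height <= 0)
            throw new ArgumentException("Image dimensions must be positive.");

        long bytesPerRow = 2L * width * sizeof(float); // input row + output row
        if (bytesPerRow > budgetBytes)
            throw new ArgumentException("Budget is too small for a single row.");

        long rows = budgetBytes / bytesPerRow;
        return (int)Math.Min(rows, (long)height);
    }

    // Number of chunks needed to cover the whole image (ceiling division).
    public static int ChunkCount(int height, int rowsPerChunk)
    {
        return (height + rowsPerChunk - 1) / rowsPerChunk;
    }
}
```

With a 4096 x 4096 image and a 64 MiB budget, one row pair takes 32,768 bytes, so 2,048 rows fit per chunk and the image splits into 2 chunks. If the filter reads neighbouring pixels, each chunk would also need a few overlapping rows at its edges, which this sketch does not account for.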