It seems apparent that each core of the GPU could allow for handling of a request, rather than one main processor (the system’s CPU) handling all requests. On the surface, it seems like it is possible, perhaps with Templates in GPU + Redis database in GPU GDDR5?
Is it possible and worthwhile?
How would the GPU access disks, databases, etc.?
Requests are usually short sharp processing snippets. You’d have to load each request off main memory, into GPU memory, do a computation and fire it back again. There’s an overhead when transferring data from main memory to GPU memory. Therefore, it’s only worth doing a GPU computation if the calculation is long enough and the problem is ammenable to parallel processing on a GPU.
In essence, GPUs are good at stream processing. Not for lots of small requests.