I’m processing images using TPL Dataflow. I receive a processing request, read an image from a stream, apply several transformations, then write the resulting image to another stream:
Request -> Stream -> Image -> Image ... -> Stream
For that I use the blocks:
BufferBlock<Request>
TransformBlock<Request,Stream>
TransformBlock<Stream,Image>
TransformBlock<Image,Image>
TransformBlock<Image,Image>
...
writerBlock = new ActionBlock<Image>
The problem is that the initial Request contains data needed to create the resulting Stream, along with some additional info I need at that point. Do I have to pass the original Request (or some other context object) down the line to the writerBlock across all the other blocks, like this:
TransformBlock<Request,Tuple<Request,Stream>>
TransformBlock<Tuple<Request,Stream>,Tuple<Request,Image>>
TransformBlock<Tuple<Request,Image>,Tuple<Request,Image>>
...
(which is ugly), or is there a way to link the first block to the last one (or, generalizing, to the ones that need the additional data)?
Yes, you pretty much need to do what you described, passing the additional data from every block to the next one.
But using a couple of helper methods, you can make this much simpler:
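A minimal sketch of two such helpers, assuming the context travels as the second item of a Tuple. The class name DataflowContext and the method names CreateExtendedSource and CreateExtendedTransform are illustrative only and are not part of TPL Dataflow:

```csharp
using System;
using System.Threading.Tasks.Dataflow;

// Hypothetical helper class; the names are illustrative, not library API.
public static class DataflowContext
{
    // First stage: transforms the input and starts carrying the untouched
    // input alongside the result, as Item2 of the tuple.
    public static TransformBlock<TInput, Tuple<TOutput, TInput>>
        CreateExtendedSource<TInput, TOutput>(Func<TInput, TOutput> transform)
    {
        return new TransformBlock<TInput, Tuple<TOutput, TInput>>(
            input => Tuple.Create(transform(input), input));
    }

    // Later stages: transform only the "current" value (Item1) and forward
    // the carried context (Item2) unchanged.
    public static TransformBlock<Tuple<TInput, TContext>, Tuple<TOutput, TContext>>
        CreateExtendedTransform<TInput, TOutput, TContext>(Func<TInput, TOutput> transform)
    {
        return new TransformBlock<Tuple<TInput, TContext>, Tuple<TOutput, TContext>>(
            tuple => Tuple.Create(transform(tuple.Item1), tuple.Item2));
    }
}
```

The context type of CreateExtendedTransform cannot be inferred from the lambda alone, so callers would have to spell out all three type arguments.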
The signatures look daunting, but they are actually not that bad.
Also, you might want to add overloads that pass options to the created block, or overloads that take async delegates.
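One possible shape for such an overload, as a sketch only: it accepts an async delegate and forwards ExecutionDataflowBlockOptions to the TransformBlock constructor. The class name DataflowContextAsync and the method name CreateExtendedTransformAsync are made up for this illustration:

```csharp
using System;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

// Hypothetical helper class; the names are illustrative, not library API.
public static class DataflowContextAsync
{
    // Awaits the async transform on the "current" value (Item1) and forwards
    // the carried context (Item2) unchanged. The options are handed straight
    // to the underlying TransformBlock.
    public static TransformBlock<Tuple<TInput, TContext>, Tuple<TOutput, TContext>>
        CreateExtendedTransformAsync<TInput, TOutput, TContext>(
            Func<TInput, Task<TOutput>> transform,
            ExecutionDataflowBlockOptions options)
    {
        return new TransformBlock<Tuple<TInput, TContext>, Tuple<TOutput, TContext>>(
            async tuple => Tuple.Create(await transform(tuple.Item1), tuple.Item2),
            options);
    }
}
```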
For example, if you wanted to perform some operations on a number using separate blocks, while passing the original number along the way, you could do something like:
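A sketch of such a pipeline, under the same assumptions as before: the helper names are hypothetical and are repeated here only so the snippet compiles on its own. The specific operations (halving, then formatting) are chosen purely for illustration:

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Threading.Tasks.Dataflow;

public static class PipelineDemo
{
    // Hypothetical helper: first stage, starts carrying the original input.
    static TransformBlock<TInput, Tuple<TOutput, TInput>>
        CreateExtendedSource<TInput, TOutput>(Func<TInput, TOutput> transform)
    {
        return new TransformBlock<TInput, Tuple<TOutput, TInput>>(
            input => Tuple.Create(transform(input), input));
    }

    // Hypothetical helper: later stage, forwards the carried context unchanged.
    static TransformBlock<Tuple<TInput, TContext>, Tuple<TOutput, TContext>>
        CreateExtendedTransform<TInput, TOutput, TContext>(Func<TInput, TOutput> transform)
    {
        return new TransformBlock<Tuple<TInput, TContext>, Tuple<TOutput, TContext>>(
            tuple => Tuple.Create(transform(tuple.Item1), tuple.Item2));
    }

    public static List<string> Run(int count)
    {
        var lines = new List<string>();
        var link = new DataflowLinkOptions { PropagateCompletion = true };

        var source = new BufferBlock<int>();
        // int -> double; the lambda sees only the current int.
        var halved = CreateExtendedSource<int, double>(i => i / 2.0);
        // double -> string; the original int rides along untouched.
        var formatted = CreateExtendedTransform<double, string, int>(
            d => d.ToString("0.0", CultureInfo.InvariantCulture));
        // Created with the normal constructor, so it sees both values.
        var writer = new ActionBlock<Tuple<string, int>>(
            t => lines.Add(t.Item2 + " / 2 = " + t.Item1));

        source.LinkTo(halved, link);
        halved.LinkTo(formatted, link);
        formatted.LinkTo(writer, link);

        for (int i = 0; i < count; i++)
            source.Post(i);

        // Completion propagates down the links; wait for the last block.
        source.Complete();
        writer.Completion.Wait();
        return lines;
    }

    public static void Main()
    {
        foreach (var line in Run(5))
            Console.WriteLine(line);
    }
}
```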
As you can see, your lambdas (except for the last one) deal only with the “current” value (int, double or string, depending on the stage of the pipeline); the “original” value (always int) is passed along automatically. At any point, you can use a block created using the normal constructor to access both values (like the final ActionBlock in the example).
(That BufferBlock isn’t actually necessary, but I added it to more closely match your design.)