I have a batch job that reads data from bulk files, processes it, and inserts it into a DB.
I’m using Spring’s partitioning features with the default partition handler.
<bean class="org.spr...TaskExecutorPartitionHandler">
    <property name="taskExecutor" ref="taskExecutor"/>
    <property name="step" ref="readFromFile"/>
    <property name="gridSize" value="10"/>
</bean>
What is the significance of gridSize here? I have configured it to be equal to the concurrency of the taskExecutor.
gridSize specifies the number of data blocks to create, to be processed by (usually) the same number of workers. Think of it as the number of mapped data blocks in a map/reduce.
Using a StepExecutionSplitter, given the data, the PartitionHandler “partitions” (splits) the data into gridSize parts and sends each part to an independent worker, which is a thread in your case.
For example, say you have 10 rows in the DB that need to be processed. If you set gridSize to 5 and use straightforward partition logic, you’d end up with 10 / 5 = 2 rows per thread, so 5 threads work concurrently on 2 rows each.
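To make the arithmetic concrete, here is a minimal sketch of "straightforward partition logic" in plain Java, with no Spring dependency. The class name RangePartitionSketch and the {minId, maxId} range layout are hypothetical, chosen only for illustration. In a real job this logic would typically live in a Partitioner implementation, whose partition(int gridSize) method returns a Map<String, ExecutionContext> rather than the int[] ranges used here.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only: mimics what a simple range-based
// partitioner might compute, without using any Spring Batch classes.
public class RangePartitionSketch {

    // Splits row ids 1..totalRows into at most gridSize contiguous ranges.
    // Each map value is {minId, maxId}, both inclusive.
    static Map<String, int[]> partition(int totalRows, int gridSize) {
        if (gridSize <= 0) {
            throw new IllegalArgumentException("gridSize must be positive");
        }
        Map<String, int[]> partitions = new LinkedHashMap<>();
        // Ceiling division so that every row lands in some partition.
        int rowsPerPartition = (totalRows + gridSize - 1) / gridSize;
        int partitionIndex = 0;
        for (int minId = 1; minId <= totalRows; minId += rowsPerPartition) {
            int maxId = Math.min(minId + rowsPerPartition - 1, totalRows);
            partitions.put("partition" + partitionIndex++, new int[] {minId, maxId});
        }
        return partitions;
    }

    public static void main(String[] args) {
        // 10 rows with gridSize = 5 gives 5 partitions of 2 rows each.
        partition(10, 5).forEach((name, range) ->
                System.out.println(name + ": rows " + range[0] + ".." + range[1]));
    }
}
```

With 10 rows and a gridSize of 5, this yields five ranges (1..2, 3..4, 5..6, 7..8, 9..10), one per worker thread. If the task executor allows fewer concurrent threads than there are partitions, the extra partitions would simply wait their turn, which is why matching gridSize to the executor's concurrency, as in the question, is a reasonable starting point.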