Parallelization based on distinct entries in column
This example is referenced in my other post about parallelization. Please have a look at the other post to get a better understanding of the context for this example.
How to control parallelization in your model
This example assumes that you have implemented the model described here. You can also download the model here.
Define start node for parallelization
To start parallelization in the Projection node “startParallelization” (1.), open the Mapping dialog (2.), select the data source (3.), and open its properties (4.). In the properties, set the “Partition Local Execution” flag and choose “CreatedBy” as the “Partition Column” (5.):
start parallelization in node “startParallelization” based on distinct values in column “CreatedBy”
Define stop node for parallelization
To stop parallelization, open the details of the Union node “stopParallelization”, navigate again to the properties of the data source, and set the “Partition Local Execution” flag:
stop parallelization in node “stopParallelization”
With these settings, every node between “startParallelization” and “stopParallelization” is executed in parallel, once for each distinct value in column “CreatedBy” of the data source table of node “startParallelization”. For simplicity, only one node was inserted between the start and stop nodes of the parallelization block, but you can have multiple nodes if required.
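To make the effect of the two flags more tangible, here is a rough Python sketch of the same idea: split the rows by distinct “CreatedBy” value, run the intermediate logic once per partition in parallel, and union the results at the end. This is only an analogy and not how the engine implements it. The sample rows, the “Amount” column, and the function names `partition_by`, `intermediate_node` and `run_parallel_block` are assumptions made up for illustration.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

# Hypothetical sample rows standing in for the data source of "startParallelization".
ROWS = [
    {"CreatedBy": "ALICE", "Amount": 10},
    {"CreatedBy": "BOB", "Amount": 5},
    {"CreatedBy": "ALICE", "Amount": 7},
    {"CreatedBy": "CAROL", "Amount": 3},
    {"CreatedBy": "BOB", "Amount": 1},
]


def partition_by(rows, column):
    """Start of the block: build one partition per distinct value in `column`."""
    partitions = defaultdict(list)
    for row in rows:
        partitions[row[column]].append(row)
    return partitions


def intermediate_node(rows):
    """Stand-in for the node(s) between start and stop; here it sums Amount."""
    total = sum(r["Amount"] for r in rows)
    return [{"CreatedBy": rows[0]["CreatedBy"], "Total": total}]


def run_parallel_block(rows, column="CreatedBy"):
    partitions = partition_by(rows, column)
    # Each partition is processed independently, one task per distinct value.
    with ThreadPoolExecutor() as pool:
        results = list(pool.map(intermediate_node, partitions.values()))
    # End of the block: union the per-partition results into one result set.
    return [row for part in results for row in part]


if __name__ == "__main__":
    for row in sorted(run_parallel_block(ROWS), key=lambda r: r["CreatedBy"]):
        print(row)
```

In this sketch the number of parallel tasks equals the number of distinct values in the partition column, which mirrors the behavior described above.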
Click here to navigate back to the context in which this model is used. You will also find examples there.