Using the Hierarchy Flattening Transform
The hierarchy flattening transform can analyze a parent-child relationship and provide a description of the hierarchy in a flattened format (vertically flattened/ horizontally flattened). Each row in the output will contain 1 parent-child relationship.
For example consider the following reporting structure in an organization:
The input data is in a table EMP_MGR. The data is corresponding to the hierarchy structure in the above diagram.The below figure shows the data in the table EMP_MGR.
This data is taken as the input to the Hierarchy Flattening Transform. In this example we will see the horizontal flattening by Hierarchy Flattening Transform. This will mean that each root to node relationship will be “flattened” and presented as a row, eg: MGR_1 : EMP_2, MGR_1 : MGR_2, MGR_1 : MGR_2:EMP_4 etc. The figure below shows the Dataflow containing the hierarchy transform.
We will configure the Hierarchy Flattening Transform such that the MGR_ID as the parent column and EMP_ID as the Child column. Since this example deals with only 2 columns (representing Parent and Child) we will not be populating the Parent attribute List and the Child Attribute List. If we had additional columns in the source table , that are associated with the Parent column, we could have added them to the Parent attribute List. Similarly , if we had additional columns in the source table , that are associated with the Child column, we could have added them to the Child attribute List. Configure the Input schema columns only. After that , if you click on “Validate” button, the right hand columns are created automatically. The number of columns (…upto LEVEL 3 in this snapshot) depend on the maximum depth that is selected.The figure below shows the configuration of the hierarchy flattening transform.
Maximum depth is taken as 3 in the initial run. The root node (MGR_1) has zero level depth, MGR_2 has depth 1 , EMP_4 has depth 2 and so on. So having a maximum depth =3 will mean that we cannot fully represent EMP_6 which has a depth =4. This is illustrated when we run the Job and look at the output.
We see that the last row is incomplete.
If we reset Maximum Depth= 4 and run the job again then we get the output as shown below . Here the last row shows the root to node relationship completely for EMP_6.
If the input data is erroneous, that is there exists cyclic relationship, then a run-time error is produced.