Data Mining using APriori Algorithm in XI Part III
The Support Calculation process is the second phase of the APriori algorithm, which is done after the Candidate Generation process ( Data Mining using APriori Algorithm in XI Part II ).
This process counts the Support for every candidate that is generated in the Candidate Generation process.
So, this process needs two input files. One is the output file of Candidate Generation process and other is the initial input file. The first file has the following XSD structure,
The second input file has the following XSD structure,
The support is calculated by the number of times the candidate set has occurred in the itemset. So the output file has the candidates and their corresponding support counts.
The output file has the following XSD structure,
In the message mapping, the source message has two message types.
Next we will look at Message Mapping. In this, both the input files are given as input to the User-Defined function, which will generate the candidate sets and their support counts separately.
In this mapping, we have to change the context of ITEMSET to FC_Input_MT. This is shown in the following picture.
Next is the User-Defined function, which will generate the support counts for every candidate.
The User-Defined Function
The Integration Process
In this integration process, the Fork node has two branches. In every branch, there is one Receive node. The Receive nodes in two branches are responsible for getting the input files. The Fork node gets terminated when the two input files are fetched. Then the mapping is done by the Transformation and the output is sent through the Send node.
Finally, do all the necessary configuration settings for this Support Calculation process.
The Input File
The Output File