Jordans’ Data Cleanse Transformation Basics
By sheer coincidence I bumped into my old mate Jordan the other day. As some of you may remember, I know two Jordans with the same surname, both friends of mine. One is a six-foot-five Rastafarian male, the other a five-foot-three blonde female. The first Jordan and I did the MCSE together. The second Jordan helped me with my first installation of NT Server over the phone while at the supermarket.
This video by Tahir Hussain Babar, aka Bob, of the SAP HANA Academy looks at the Cleanse Transform within Smart Data Quality in SPS09. The context used to illustrate this function is taking a list of first names and enriching the output to establish the names’ genders.
Bob starts building a new flowgraph within the projects he started in the previous videos. He explains that this will be a basic example and that more advanced examples will follow.
Bob selects his canvas and assigns it to his user before selecting his table of employees as his data source. The next step is to add a Data Provisioning Transform which in this case will be the Cleanse Transform. He adds the output from the Employee table data source to the input of his Cleanse Transform.
Bob explains the three general properties tabs: Input Fields, Output Fields and Settings. These need linking to the right columns for the function to produce the correct output. For example, you may have a column called first name that needs matching to the relevant entry in Smart Data Quality’s list of semantics for that object, which may be called person.
Once the Input Fields have been mapped, you need to set the cleansing that you want to perform on your Output Fields, which in this case will be gender. By default this is set to false, so it won’t be output and will need setting to true. This will automatically update the Mappings tab. Remember that the Settings are designed to fine-tune your cleansing options.
Bob completes the Transform by adding a Template table and configuring the Authoring Schema and Catalog Object as shown.
Once you have saved and activated, if no error messages are returned the task has been completed successfully. Before executing the Transform, Bob opens it in a SQL editor to check that the table structure is sound.
As you can see, each name’s closeness to a specific gender is highlighted unless the name is not in the dictionary, in which case it is unassigned. The dictionary can be changed according to your regional locale.
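To make the idea concrete, here is a minimal Python sketch of a dictionary-based gender lookup. It is only an illustration of the behaviour described above, not how the Cleanse Transform is actually implemented: the dictionary entries, the label names (MALE_STRONG, AMBIGUOUS, UNASSIGNED and so on) and the function names are all assumptions made up for this example.

```python
from typing import Dict, List

# Hypothetical mini-dictionary mapping a first name to a gender label.
# Both the entries and the label names are illustrative assumptions,
# not the real Smart Data Quality dictionary content.
NAME_DICTIONARY: Dict[str, str] = {
    "john": "MALE_STRONG",
    "mary": "FEMALE_STRONG",
    "jordan": "AMBIGUOUS",
    "tahir": "MALE_STRONG",
}


def cleanse_gender(first_name: str) -> str:
    """Return the gender label for a name, or UNASSIGNED if it is unknown."""
    key = first_name.strip().lower()  # normalise before the lookup
    return NAME_DICTIONARY.get(key, "UNASSIGNED")


def cleanse_rows(first_names: List[str]) -> List[Dict[str, str]]:
    """Enrich each input name with a GENDER column, like an output table."""
    return [{"FIRST_NAME": n, "GENDER": cleanse_gender(n)} for n in first_names]


if __name__ == "__main__":
    for row in cleanse_rows(["John", "Jordan", "Xyzzy"]):
        print(row)
```

A name that is missing from the dictionary falls through to UNASSIGNED, and swapping in a different dictionary is the rough equivalent of changing the regional locale.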
Finally, Bob updates the data with his birth name, Tahir, truncates the table and runs the task again. The dictionary and associated software are confident that this is a male name.
There is no doubt that this is a great tool, but I bet you’re all wondering which Jordan I bumped into. Jordan is now 56 and looks 25 years younger, having been a vegan and martial artist since being a teenager. I couldn’t believe what a robust specimen Jordan remains, definitely a strong representative of … gender type.