Text Data Processing Transform – Going from Unstructured to Structured
This walkthrough was done using Data Services 4.0 SP1.
Text data processing takes unstructured data and extracts it into a structured form, such as a table, so that it is easier to report on. This example takes e-mails and formats them into a table.
First, create a batch job and a data flow.
Create a source from the flat files in the Local Object Library.
Then select the unstructured text type.
Select the e-mails as the file input (this is the unstructured data).
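To make the idea of an unstructured text source concrete, here is a minimal Python sketch of what this step amounts to conceptually: every file becomes one row holding the file name and the full text. This is not Data Services code. The function name read_unstructured and the "*.txt" pattern are assumptions made for illustration; the FileName and Data column names follow the columns used later in this walkthrough.

```python
from pathlib import Path


def read_unstructured(folder, pattern="*.txt"):
    """Return one row per file, with the file name and its full text.

    Hypothetical helper for illustration only; it mimics the idea of an
    unstructured text source, not the Data Services implementation.
    """
    rows = []
    for path in sorted(Path(folder).glob(pattern)):
        rows.append({
            "FileName": path.name,                      # which e-mail the text came from
            "Data": path.read_text(encoding="utf-8"),   # the raw, unstructured content
        })
    return rows
```

Calling read_unstructured on a folder of e-mail text files gives a list of rows, one per e-mail, ready to be passed to an extraction step.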
Then, from the Transforms tab, select Entity_Extraction > Base_EntityExtraction.
Then, in the input schema pane, drag the Data column onto the Text column.
In the Options tab, select ENGLISH as the language.
In the Output tab, select the checkboxes next to the fields you want to include in the output.
Then drag FileName from Schema In to Schema Out.
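As a rough illustration of what the extraction step produces, the Python sketch below turns each e-mail into several rows, one per entity found, and carries FileName along so every entity can be traced back to its e-mail. It is only a stand-in: the regular expressions, the entity types EMAIL and DATE, and the output column names Type, Entity and Offset are all assumptions for this example. The Base_EntityExtraction transform does its own linguistic analysis and does not work this way internally.

```python
import re

# Hypothetical patterns for illustration; a real extraction engine relies on
# linguistic analysis and dictionaries rather than two regular expressions.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "DATE": re.compile(r"\b\d{4}-\d{2}-\d{2}\b"),
}


def extract_entities(rows):
    """Turn rows of {FileName, Data} into one row per entity found."""
    entities = []
    for row in rows:
        for entity_type, pattern in PATTERNS.items():
            for match in pattern.finditer(row["Data"]):
                entities.append({
                    "FileName": row["FileName"],  # passed through from the input
                    "Type": entity_type,
                    "Entity": match.group(0),
                    "Offset": match.start(),      # position of the entity in the text
                })
    return entities
```

For example, a row whose Data is "Contact jane.doe@example.com on 2012-03-05." yields two entity rows, one of type EMAIL and one of type DATE, both with the same FileName.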
Then add a target table for the output (this is the table you can report from).
After executing the batch job, you can view the results in the table.
Now you can use this table to report on unstructured data.
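To show why the table form is useful for reporting, here is a small Python sketch that loads extracted rows into an in-memory SQLite table and counts entities by type. The table name email_entities, its column names, and the function load_and_report are hypothetical; in practice you would point your reporting tool at whichever target table the job loaded.

```python
import sqlite3


def load_and_report(entity_rows):
    """Load (file_name, entity_type, entity) tuples and count entities per type.

    Hypothetical example; the table and column names are illustrative only.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE email_entities (file_name TEXT, entity_type TEXT, entity TEXT)"
    )
    conn.executemany("INSERT INTO email_entities VALUES (?, ?, ?)", entity_rows)
    # Once the text is in rows and columns, an ordinary SQL query is enough.
    report = conn.execute(
        "SELECT entity_type, COUNT(*) FROM email_entities "
        "GROUP BY entity_type ORDER BY entity_type"
    ).fetchall()
    conn.close()
    return report
```

The result is a list of (entity type, count) pairs, which is the kind of summary that is hard to get from the raw e-mails but simple once they are in a table.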