This component helps transform a numerical variable. Just select your variable and choose from a number of common transformations or enter your own R code.
The plot in the center shows the variable's density without transformation. Plots located around it show the density of commonly used transformations as well as an optional transformation with custom R code. The red frame indicates which transformation was chosen by the user and will be output in a new column.
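To make the comparison concrete, here is a minimal base-R sketch of the idea behind the plots. It is not the component's own code: the example data are simulated, the list of transformations (log, square root, reciprocal) is an assumption about what "common transformations" covers, and the custom expression is a hypothetical stand-in for user-entered R code. Skewness is computed by hand so the sketch runs without extra packages.

```r
# Simulated, right-skewed positive variable (illustrative data only).
set.seed(42)
x <- rlnorm(500, meanlog = 1, sdlog = 0.8)

# Candidate transformations. This list is an assumption for illustration,
# not necessarily the set offered by the component.
transformations <- list(
  none       = function(v) v,
  log        = function(v) log(v),
  sqrt       = function(v) sqrt(v),
  reciprocal = function(v) 1 / v
)

# Moment-based sample skewness, written out in base R.
skewness <- function(v) {
  d <- v - mean(v)
  mean(d^3) / mean(d^2)^1.5
}

# Apply every transformation, then estimate density and skewness for each.
transformed <- lapply(transformations, function(f) f(x))
densities   <- lapply(transformed, density)
skew        <- sapply(transformed, skewness)
print(round(skew, 2))

# A custom transformation given as R code in a string (hypothetical example),
# evaluated with x bound to the selected variable.
custom_code <- "log10(x + 1)"
x_custom <- eval(parse(text = custom_code), list(x = x))

# Draw the densities side by side when run interactively.
if (interactive()) {
  op <- par(mfrow = c(2, 2))
  for (name in names(densities)) plot(densities[[name]], main = name)
  par(op)
}
```

A skewness value close to zero suggests a roughly symmetric density, which is one simple way to judge which transformation to keep.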
Disclaimer
Please note that this component is not an official release by SAP and that it is provided as-is without any guarantee or support. Please test the component to ensure it works for your purposes.
Prerequisites
R libraries e1071 and gplots must be installed.
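One way to check the prerequisite is the short sketch below, run in the R installation that the component uses. It only reports which of the two packages are missing and does not install anything; the variable names are illustrative.

```r
# Packages named in the prerequisites.
required <- c("e1071", "gplots")

# TRUE for each package that can be loaded from the current library paths.
installed <- vapply(required, requireNamespace, logical(1), quietly = TRUE)
not_installed <- required[!installed]

if (length(not_installed) > 0) {
  # Report what is missing; install.packages() can then be run manually.
  message("Missing packages: ", paste(not_installed, collapse = ", "))
  message("Install with: install.packages(c(",
          paste0('"', not_installed, '"', collapse = ", "), "))")
} else {
  message("All required packages are installed.")
}
```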
Limitations
Please let me know should you encounter any limitations.
Usage
These parameters can be set by the user.
Output column added by this component
How to Implement
The component can be downloaded as a .spar file from GitHub. Then deploy it as described here. You just need to import it through the option "Import/Model Component", which you will find by clicking the plus sign at the bottom of the list of available algorithms.
Example
You can use such a transformation to improve the fit of a linear regression, for instance. The dataset adverts.csv is often used in teaching R. It lists a few companies, their 1983 TV advertising budget in millions of dollars (spend) and the retained impressions per week in millions (milimp). You can use the spend to estimate the retained impressions with a linear regression. Using the logarithm of the spend as the predictor, instead of the actual spend, improves the quality of the model.
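The comparison can be sketched in plain R as follows. Because adverts.csv is not bundled here, the sketch uses simulated stand-in data with the same column names (spend, milimp); the values are made up and deliberately generated from a logarithmic relationship, so the outcome illustrates the mechanics rather than reproducing the real result. With the actual file, the data frame would instead be loaded with read.csv("adverts.csv"), assuming the file is in the working directory.

```r
set.seed(1)

# Simulated stand-in for adverts.csv. These are NOT the real values.
# With the real file: adverts <- read.csv("adverts.csv")
n <- 50
adverts <- data.frame(spend = rlnorm(n, meanlog = 3.5, sdlog = 1))
adverts$milimp <- 10 + 15 * log(adverts$spend) + rnorm(n, sd = 3)

# Model 1: retained impressions explained by the raw spend.
fit_raw <- lm(milimp ~ spend, data = adverts)

# Model 2: retained impressions explained by the log-transformed spend.
fit_log <- lm(milimp ~ log(spend), data = adverts)

# Compare the share of variance explained by each model.
r2_raw <- summary(fit_raw)$r.squared
r2_log <- summary(fit_log)$r.squared
print(c(raw = r2_raw, log = r2_log))
```

R squared is only one possible measure of model quality; residual plots of both fits are worth checking as well.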