Introduction:
When working in ETL and data engineering in
SAP Data Warehouse Data Builder there are several reasons you should want to remove empty strings, for example when constructing
dimensions views usually should avoid items without description since they could duplicate dimension members, take as an example next case:
SKU ID |
Description |
Material |
Color |
112233 |
Connector |
Tin |
blue |
112234 |
|
Tin |
blue |
this scenario is very common in master data coming from other non SAP systems, it could affect seriously your measures and data models, and that's why we will see how to easily remove them.
Prerequisites:
- You have a SAP Data Warehouse Cloud Tenant
- You have your own Space
- You have a table with some fields containing empty strings
Implementing the Filter artifact with the TRIM function:
1. Get into
SAP Data Warehouse Data Builder and open/create a view containing your table with some fields containing empty strings.
2. Select your table and active "Distinct Values: ON" from the displayed menu
3. Add a
Filter artifact
Filter artifact
4. Click or select the
Filter artifact and select the corresponding column to enable it in the
Expression editor:
Column selection
5. Inside the e
xpression editor exclude all null values using:
yourcolumnname IS NOT NULL syntax condition, eventually after removing all spaces from your strings you will effectively remove empty strings using a simple condition:
LTRIM(RTRIM(yourcolumnname ) ) != ' '
6. Validate your expression using the button
Validate and check the expected outcome
Excluding nulls and empties
Conclusion:
Since
SAP Data Warehouse Cloud allows ingesting data practically from every type of source(Systems, DBMS, Files, etc) using this simple method you can exclude null values and empty strings to bring reliability to your data models.
Thank you for reading