
A strict naming schema for all DS objects (projects, jobs, workflows, dataflows, datastores, file formats, custom functions) is essential when working in a multi-user environment. The central repository has no folder concept, hierarchy, or grouping functionality; the only way to distinguish objects belonging to one grouping from those in another is by name. The most effective approach to naming objects is based on prefixing.

Note: To display the full names of DS objects in the Designer workspace, increase the icon name length: select Tools –> Options from the Designer menu, expand Designer and select General, and set the “Number of characters in workspace icon name” box to 64.

General note: Versioning should not be handled by naming conventions. So, never include a version number in an object name. Use the central repository concept for maintaining successive versions of any object.

1.  Reusable objects

  • Project: <project_name> (e.g. BI4B)
  • Job: <project_name>_<job_name> (e.g. BI4B_D)
  • Workflow contained in one job only:
      <project_name>_<job_name>_[XT|TF|LD|AG…] (e.g. BI4B_D_XT)
      <project_name>_<job_name>_[XT|TF|LD|AG…]_<workflow_name> (e.g. BI4B_D_LD_Facts)
      <project_name>_<job_name>_<workflow_name> (e.g. BI4B_D_Init)
  • Workflow that is reused:
      <project_name>_COMMN_[XT|TF|LD|AG…]_<workflow_name> (e.g. BI4B_COMMN_Init)
      COMMN_[XT|TF|LD|AG…]_<workflow_name> (e.g. COMMN_ErrorHandling)
  • Dataflow contained in one job only: <project_name>_<job_name>_[XT|TF|LD|AG…]_<dataflow_name> (e.g. BI4B_D_LD_Opportunities)
  • Dataflow that is reused:
      <project_name>_COMMN_[XT|TF|LD|AG…]_<dataflow_name> (e.g. BI4B_COMMN_LD_JobCycles)
      COMMN_[XT|TF|LD|AG…]_<dataflow_name> (e.g. COMMN_LD_Jobs)
  • Embedded Dataflow: <project_name>_<job_name>_[XT|TF|LD|AG…]_<dataflow_name>_EMB (e.g. BI4B_D_LD_Employees_EMB)
  • ABAP Dataflow: <project_name>_<job_name>_XT_<dataflow_name>_ABAP (e.g. BI4B_D_XT_KNA1_ABAP)
  • Custom Function contained in one job only: <project_name>_<function_name> (e.g. BI4B_getDate)
  • Custom Function that is reused: COMMN_<function_name> (e.g. COMMN_dateKey)
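
These conventions are regular enough to check mechanically. Below is a minimal sketch in Python (not a DS feature: the PATTERNS dictionary and the check_name helper are illustrative assumptions, and the project code is deliberately kept lenient at 3-5 characters because the examples use the 4-character code BI4B):

import re

# Stage codes from the conventions above: XT (extract), TF (transform),
# LD (load), AG (aggregate).
PATTERNS = {
    "job":      re.compile(r"^[A-Z0-9]{3,5}_[A-Za-z0-9]+$"),
    # Covers stage-coded workflows; workflows like BI4B_D_Init without a
    # stage code would need an extra pattern.
    "workflow": re.compile(r"^[A-Z0-9]{3,5}_[A-Za-z0-9]+_(XT|TF|LD|AG)(_[A-Za-z0-9]+)?$"),
    "dataflow": re.compile(r"^[A-Z0-9]{3,5}_[A-Za-z0-9]+_(XT|TF|LD|AG)_[A-Za-z0-9_]+$"),
    "embedded": re.compile(r"^[A-Z0-9]{3,5}_[A-Za-z0-9]+_(XT|TF|LD|AG)_[A-Za-z0-9_]+_EMB$"),
    "abap":     re.compile(r"^[A-Z0-9]{3,5}_[A-Za-z0-9]+_XT_[A-Za-z0-9_]+_ABAP$"),
}

def check_name(kind: str, name: str) -> bool:
    """Return True if the name follows the convention for its object type."""
    return PATTERNS[kind].fullmatch(name) is not None

# Examples taken from the list above
assert check_name("dataflow", "BI4B_D_LD_Opportunities")
assert check_name("abap", "BI4B_D_XT_KNA1_ABAP")
assert not check_name("workflow", "LD_Facts")  # missing project/job prefix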

1.1. Projects: <project_name>

Give every DS project a 5-character short name. The name has to be short, because it will be used as a prefix for the name of all reusable objects defined within the project.

E.g.: P2345

1.2. Jobs: <project_name>_<job_name>

Give every job a 5-character short name. Use <project_name>_ as a prefix for the job name. The name has to be short, because it will be used as a prefix for the name of all workflows and dataflows defined within that job.

E.g.: P2345_J2345

1.3. Workflows: <project_name>_<job_name>_[XT|TF|LD|AG…][_<workflow_name>]

Name every workflow with <project_name>_<job_name>_ as a prefix. Use COMMN_ as a prefix for shared workflows that are used across projects, and <project_name>_COMMN_ for workflows reused in multiple jobs within a given project.

Workflows are often used to group dataflows for serial or parallel execution. In a typical ETL job, dataflows are executed in “stages”: a first set of dataflows has to be executed (in parallel) before the next set can be started, and so on. A data warehouse loading job may extract data from the sources, load it into staging, optionally transform it from staging-in to staging-out, then load it into the core EDW and aggregate it into the semantic layer.

Distinguish between job stages by extending the prefix with a 2-character code:

  • XT: extract from source to staging
  • TF: transform from staging-in to staging-out
  • LD: load from staging into the core EDW layer
  • AG: load (physically aggregate) from core to semantic layer

The workflow name will be used as a prefix for the name of all embedded workflows and dataflows.

E.g.: P2345_J2345_XT

Within a workflow, objects (scripts, sub-workflows, dataflows) must either all be defined in parallel or all in sequence, and they will be executed as such. There is no limit to the number of objects within a workflow. When the number of objects exceeds the number of available processors, DS will internally control the execution order of embedded parallel objects; only when there are fewer objects than available processors will they really be executed in parallel.

Complex hierarchical structures can be defined by nesting workflows. There is no limit to the number of nesting levels, either. With nested workflows, use a name (_Facts for facts extraction or load, _Dims for dimension processing…) combined with an outline numbering scheme (1, 11, 111, 112, 12, 2…).

E.g.: P2345_J2345_LD_Dims21
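
As a rough illustration of that scheme (the nest_name helper below is hypothetical, not a DS function), the nested workflow name is simply the group name with the outline number appended:

def nest_name(project: str, job: str, stage: str, group: str, outline: str) -> str:
    """Hypothetical helper: compose a nested workflow name from a group name
    (Dims, Facts…) and its outline number (1, 11, 12, 2, 21…)."""
    return "{}_{}_{}_{}{}".format(project, job, stage, group, outline)

# The second dimension-processing workflow and its first child:
print(nest_name("P2345", "J2345", "LD", "Dims", "2"))   # P2345_J2345_LD_Dims2
print(nest_name("P2345", "J2345", "LD", "Dims", "21"))  # P2345_J2345_LD_Dims21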

Some workflows may not contain dataflows at all; they only contain non-reusable objects. In that case, just name the workflow according to its function.

E.g. for a workflow embedding an initialization script: P2345_J2345_Initialise

1.4. Dataflows

DS supports three types of dataflows. The dataflow names must be unique across the different types. To distinguish the embedded and ABAP dataflows from the regular ones, use a suffix in their name.

  • Regular dataflows: <project_name>_<job_name>_[XT|TF|LD|AG…]_<dataflow_name>

According to design and development best practices, there should be only a single target table in a dataflow. Name the dataflow after that target table.

Use <project_name>_<job_name>_ as a prefix. Use COMMN_ as a prefix for shared dataflows that are used across projects, and <project_name>_COMMN_ for dataflows reused in multiple jobs within a given project. Distinguish between dataflow locations (extract, transform, load, aggregate…) by extending the prefix with the same 2-character stage code (XT, TF, LD, AG…) as the embedding workflow.

E.g.: P2345_J2345_XT_S_TABLE1, P2345_J2345_LD_TargetTable

  • Embedded dataflows: <project_name>_<job_name>_[XT|TF|LD|AG…]_<dataflow_name>_EMB

Name every embedded dataflow with <project_name>_<job_name>_ as a prefix; use _EMB as a suffix for the dataflow name. Distinguish between dataflow locations (extract, transform, load and aggregate) by extending the prefix with the same 2-character stage code (XT, TF, LD and AG).

E.g.: P2345_J2345_LD_TargetTable_EMB

  

  • ABAP dataflows: <project_name>_<job_name>_XT_<dataflow_name>_ABAP

An ABAP dataflow is always used as a source in a regular dataflow. Reuse the name of that regular dataflow for the ABAP dataflow and add _ABAP as a suffix to make it unique.

E.g.: P2345_J2345_XT_S_TABLE1_ABAP

1.5. Custom Functions: <project_name>_<function_name>

Give every custom function a descriptive name. Use <project_name>_ as a prefix. Use COMMN_ as a prefix for shared custom functions that are used across projects.

E.g.: P2345_TrimBlanksExt

2.  Datastores: [SAP|BWS|BWT|HANA…]_<datastore_name>

As datastores are often used in multiple projects, they do not follow the same naming conventions as other reusable objects.

Name a datastore in line with its physical name, and make the prefix depend on the object’s type:

  • Database (SQL Server): SQL_ (e.g. SQL_OC4A1)
  • Database (Oracle): ORA_ (e.g. ORA_ITSD)
  • Database (Teradata): TD_ (e.g. TD_Staging)
  • Database (DB2): DB2_ (e.g. DB2_MDM)
  • Database (MySQL): MySQL_ (e.g. MySQL_GEB)
  • Database (Sybase ASE): ASE_ (e.g. ASE_CRN)
  • Database (Sybase IQ): IQ_ (e.g. IQ_CMDWH)
  • SAP Applications: SAP_ (e.g. SAP_MDR)
  • SAP BW as a source: BWS_ (e.g. BWS_Achme)
  • SAP BW as a target: BWT_ (e.g. BWT_Hana)
  • SAP HANA: HANA_ (e.g. HANA_DB)
  • Adapter: AS_ (e.g. AS_Nexus)
  • Web Services: WS_ (e.g. WS_Weather)

Note 1: Pay attention when choosing datastore names. A datastore name cannot be changed once the object has been created.

Note 2: Landscape-related information should not be encoded in datastore names. Never include a physical system indicator (DEV, T, QA…) in a datastore name. Landscape information should be handled with datastore configurations: create one datastore, then create a datastore configuration for every tier (development, test, QA, production…) in the landscape.
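
A prefix rule like this is also easy to enforce mechanically. Here is a minimal sketch (the prefix tuple simply restates the list above; the valid_datastore_name helper is an illustrative assumption, not a DS feature):

# Allowed datastore prefixes, restating the list above
DATASTORE_PREFIXES = (
    "SQL_", "ORA_", "TD_", "DB2_", "MySQL_", "ASE_", "IQ_",  # databases
    "SAP_", "BWS_", "BWT_", "HANA_",                         # SAP systems
    "AS_", "WS_",                                            # adapters, web services
)

def valid_datastore_name(name: str) -> bool:
    """Return True if the datastore name starts with one of the agreed prefixes."""
    return name.startswith(DATASTORE_PREFIXES)

print(valid_datastore_name("ORA_ITSD"))  # True
print(valid_datastore_name("ITSD_DEV"))  # False, and it embeds a landscape tier as well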

 

3.  File formats: <project_name>_<file_format_name>

Reuse the file name as the format name for a project-specific file. Use <project_name>_ as a prefix.

E.g.: P2345_ISO_Cntr_Codes

Note: Pay attention when choosing file format names. A file format name cannot be changed once the object has been created.

4.  Non-reusable objects

Because non-reusable objects are only defined within the context of a workflow or a dataflow, no strict naming standards are necessary; their names only serve documentation purposes.

Use meaningful names for workflow objects (Script, Condition, While Loop, Try, Catch).

Do not change the transform names unless you want to stress the specific purpose of a Query transform, e.g. Join, Order, OuterJoin…


11 Comments


  1. Jay Roble

    Nice post. I think object naming standards are very important & make support much easier.

    I was wondering why you didn’t follow the Data Services Designer Guide section 4.2.2, Naming conventions for objects in jobs, using prefixes (JOB_, RTJOB_, DF_, WF_, etc.)?

    We use the prefix to help quickly ID what type of object it is, which helps in the error logs. In addition to those we also use PJ_ = project, DS_ = DataStore, FF_ = file format, etc.

    We also include a “DS Application Area” (development/support area) indicating who owns the objects:

    PJ_SALES_PROJ1
    JOB_SALES_PROJ1_LOAD_T100

    etc.

    DS_SALES_SAP_ECC = Sales Area SAP ECC data source.

    DS_SALES_SAP_BW = Sales Area BW data source.

    DS_SALES_ORA_PROMO = Sales Area ORACLE Promotion Data Base.

    Thanks again, we could use more of these best practices.

    1. Dirk Venken Post author

      I don’t, because I find prefixes annoying and irritating 🙂 , especially when I am scrolling through a long list of dataflow names. Let’s do a little test: how long does it take you to find out whether the two strings NBVCXWMLKJHGFDSQAZERTYUIOP and N8VCXWMLKJHGFDSQAZERTYUIOP are different? I bet no time at all! You see that right away. At least I do 🙂 . But what about NBVCXWMLKJHGFDSQAZERTYUIOP and NBVCXWMLKJHGFDSQAZERTYU1OP? It takes me a tiny bit longer. Because just like everybody using Latin, Cyrillic… script, I read from left to right. And the further to the right the difference is, the longer it takes me to find it.

      Also, I don’t need the prefix, because I can tell from an object’s icon, from its context, from its content… what type it is. Even if a dataflow and a workflow of mine would have exactly the same name, I am able to distinguish between them in the logs, because DS nicely prints the tracing level.

      And I don’t, in analogy with what I and (most probably) you do at database level. What do you call the database table where you keep track of client information? Customer? Customers? KNA1? (KN is an abbreviation of Kunde, the German word for Customer.) Most probably something like that. I for sure have never called it Table_Customer. Not even T_Customer. No prefixes in my db? No prefixes in my DS! It’s as simple as that. By the way, that’s why I don’t feel the need for suffixes either (though I might have called one of my very first tables, a long, long time ago, EMP_TABLE or something like that – I definitely don’t do that anymore).

      All that said, naming standards are a matter of personal preference. And circumstances. Take my datastores: I use a prefix according to their type. That’s because as an SAP consultant I hop from customer to customer, dealing with all sorts of data sources and targets, and the prefix helps me identify them quickly. But suppose I worked in a 100% MS shop where all my databases were SQL Server; then there would be no added value from the prefix. I am sure I would adopt a different naming scheme then, one that would perfectly serve its purpose, too.

  2. Néstor Martínez

    “There is no limit to the number of objects within a workflow.”

    That is true, theoretically. In practice, it is limited by the amount of RAM you can use for it.

    regards,

  3. Dirk Venken Post author

    For documentation and legibility purposes, we recently started naming our projects <project_name>_<project_description>, example: BI4B_BusinesIntelligenceForBelux. No other standards affected.

    1. Dirk Venken Post author

      No, I don’t. The best I have come up with is what you can see in my examples: I use a _D suffix for a daily job, _W for weekly, _M for monthly.

      You could extend this by adding different frequencies (_H for an hourly job, _H6 for a job that runs every six hours…) and extra suffixes specifying start time, job dependency and so on.

      1. Chaitanya Kasarkod

        Dirk,

        Do you follow any naming conventions for staging tables? I use the following and find it easy to navigate and find the tables I am looking for.

        STG_[SOURCE_SYSTEM_NAME]_[SCHEMA_NAME]_TABLENAME.


        Although this can cause pain when Oracle is your staging database, because of its 30-character limit on table names.

        1. Dirk Venken Post author

          For staging-in, I do something very similar: I don’t use a prefix (you should know by now I am a bit allergic to them) but I define a schema per source system/schema combination and use exactly the same table names as in the source database.

          For staging-out I typically adapt to the database naming standards in force, if any.

  4. Pedro Soares

    Hi, just wanted to hop into the discussion even if I get here 2.5 years later. Dirk’s approach is for me the latest trend, as it follows the same principle as current database naming conventions of not using prefixes. Even though I really dislike repeating the project name, it’s useful because of the missing folder concept. Good post 🙂 and remember, having a naming convention is always better than having none 😉

