Data Services Best Practices: Code Management
Multiple Profiles, not Multiple Objects
When naming objects in a multi-user development environment, a developer can be tempted to make names as distinctive as possible. While it is important to avoid name clashes in a team environment, some objects should be given generic names to avoid difficulties when merging development from multiple users or migrating to a different environment. A data store is one such object. For example, developers Joe and Jane work on the same source database. If Joe names his data store “DS_Joe_Source” and Jane names hers “DS_Jane_Source”, there will be two data store objects in the central repository pointing to exactly the same source database. As another example, if a developer names a data store DS_DEV_Source, the name has to be changed to avoid confusion when migrating to the test environment.
Data Services uses profiles to solve this kind of problem. An object can have multiple profiles attached to it. Let’s revisit the examples above using profiles. In the repository, only one data store object is created, DS_Source. DS_Source has multiple profiles defined on it, such as profile_DEV_Joe, profile_DEV_Jane, and profile_TEST. Applying a profile quickly switches the data store to that profile’s configuration.
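The one-object, many-profiles idea can be illustrated outside Data Services with a minimal Python sketch. The class names, fields, host names, and the apply_profile method below are hypothetical and are not part of any Data Services API; they only model a single generically named data store that carries several named connection profiles.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass(frozen=True)
class Profile:
    """Hypothetical connection settings for one developer or environment."""
    host: str
    database: str
    user: str


@dataclass
class Datastore:
    """One generically named object that carries several named profiles."""
    name: str
    profiles: Dict[str, Profile] = field(default_factory=dict)
    active: Optional[str] = None

    def add_profile(self, profile_name: str, profile: Profile) -> None:
        self.profiles[profile_name] = profile

    def apply_profile(self, profile_name: str) -> Profile:
        # Switching profiles changes the connection, never the object name.
        if profile_name not in self.profiles:
            raise KeyError(f"unknown profile: {profile_name}")
        self.active = profile_name
        return self.profiles[profile_name]


# Illustrative values only; the hosts and databases are made up.
ds_source = Datastore("DS_Source")
ds_source.add_profile("profile_DEV_Joe", Profile("dev-db.example", "src_dev", "joe"))
ds_source.add_profile("profile_DEV_Jane", Profile("dev-db.example", "src_dev", "jane"))
ds_source.add_profile("profile_TEST", Profile("test-db.example", "src_test", "etl_test"))

current = ds_source.apply_profile("profile_TEST")
```

Because every job refers to DS_Source by name, moving from development to test in this model only changes which profile is active, and nothing has to be renamed.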
Global Variables and Substitution Parameters
Global variables are variables defined at the job level. They can be used as placeholders for externally configurable parameters passed into the job, such as directory paths and default values. Their values can be read and modified anywhere in the job. However, they are not visible outside the job.
To define a value that can be shared by multiple jobs, one must use a substitution parameter. Substitution parameters are defined at the local repository level and thus are available to all jobs in the same repository. You should not modify the pre-defined Data Services parameters for your specific application, since this may affect all jobs that use those parameters.
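The difference in scope can be sketched in plain Python. This is only a model under stated assumptions: the Repository and Job classes, the variable names, and the values are hypothetical and do not correspond to any Data Services API.

```python
class Repository:
    """Hypothetical stand-in for a local repository."""

    def __init__(self, substitution_parameters):
        # Defined once per repository and visible to every job in it.
        self.substitution_parameters = dict(substitution_parameters)


class Job:
    """Hypothetical stand-in for a job with its own global variables."""

    def __init__(self, name, repository, global_variables):
        self.name = name
        self.repository = repository
        # Defined per job; other jobs cannot see or change these.
        self.global_variables = dict(global_variables)

    def substitution(self, key):
        # Every job in the repository reads the same shared value.
        return self.repository.substitution_parameters[key]


# Illustrative names and values only.
repo = Repository({"$$SourceDir": "/data/incoming"})
load_job = Job("Load_Job", repo, {"$G_BatchSize": 500})
audit_job = Job("Audit_Job", repo, {})

# Changing a global variable affects only the job that owns it.
load_job.global_variables["$G_BatchSize"] = 1000
```

In this model both jobs resolve the same source directory from the repository, while the batch size exists only inside Load_Job.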
Try to create a local repository for each developer. The overhead of each repository is minimal given the capacity of modern database systems and disk sizes. Define a connection to a central repository for each local one. Developers should not log into the central repository for any development work. Code should be developed and unit tested in the local repository and then checked into the central repository.
A separate local repository should be created for each testing or production Data Services environment. Moving code into such a repository should be coordinated by a Data Services administrator or team lead, ideally after a code review process to minimize the chance of breaking existing code.
Check Out Objects with Filtering
When checking out, use the option “with filtering”. This gives you control over which objects are included in the checkout. For example, if you have already configured datastores in your local repository, you probably don’t want them replaced with the wrong configuration. If you check out objects that you did not intend to, you can always undo the checkout on those objects.
Label Objects with Meaningful Text
Labeling is optional, but it is critical in code migration. Without a label, it would be very difficult to find the correct versions of all objects ready to be deployed into production, because they are not necessarily the most recent versions. The label for a release should include the project name, release number, and a date stamp.
While you are still developing code, it is up to you whether to add a label. However, it is good practice to label objects with a meaningful description when there is a major change. For example, if you have just added a new feature and unit tested it, you can add a label with something like “New feature xxx added on 10-04-2015”.
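A release label with those three parts can be generated consistently by a small helper. The sketch below is one possible convention, not a format prescribed by Data Services: the function name, the underscore separator, the "R" prefix, the YYYYMMDD date stamp, and the sample project name are all assumptions made for illustration.

```python
from datetime import date


def release_label(project: str, release: str, stamp: date) -> str:
    """Build a label from project name, release number, and date stamp."""
    if not project or not release:
        raise ValueError("project and release must be non-empty")
    # An ISO-ordered date stamp sorts chronologically and avoids
    # day/month ambiguity.
    return f"{project}_R{release}_{stamp:%Y%m%d}"


# Hypothetical project and release values.
label = release_label("Sales_DW", "2.1", date(2015, 10, 4))
print(label)  # Sales_DW_R2.1_20151004
```

Generating the label in one place keeps every object in a release tagged with exactly the same text, which makes it easier to retrieve the full set of versions by label later.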