I often see look up functions is being used when performing value mapping. I see there are some disadvantages using the look up functions over joins

1.Visibility.When you review a job to fix or change the mapping rule, it hard to identify where the lookup has been used. If lookup is done using joins it’s easy for the programmers to locate the mapping.

2. Handling duplicate data .When there’s duplicate in the look up table, it’s not safe to use look up function, when do so, it simply return one of the value.Say you are looking for a new material type from the look up table, what happens if it contains two different new material type for an old material type? It returns one of the new material type based on the return policy you specified (Max/Min).When a join is used, and if a duplicate is found as given in the above scenario then both the values will be returned and it can be identified by looking at the result set.

3. Picking more than one column from the look up table. The value return by the look up function can be mapped to only one column. But a join can return more than one column and can be mapped to more than one column in the same query transform

4. Slower performance. There’s a greater possibility that a join can be pushed down rather than a look up function used within a query transform

To report this post you need to login first.


You must be Logged on to comment or reply to a post.

  1. Mohammad Ansari

    Magesh Subramanian ,

    Informative Blog, however, I find that usage of lookup functions has its own importance and is more flexible when it comes to simplify your code in cases you have no. of tables to join and apply some complex business logic inside. My understanding is that whatever we can not achieve through Joins, we should make use of Lookup functions.

    I could not understand the point 3 you have mentioned. Could you please explain more. As far as I know lookup_ext function can return multiple columns and can be mapped to more than one columns in output.

    1. Joel Gilbert

      The LOOKUP_EXT function can indeed return multiple columns from the lookup table. There is a difference in the dialog box depending on how the function is called.

      When the function is used directly in a mapping (e.g. Function wizard), then multiple return columns can be assigned to variables.

      If the function is activated via the New Function Call option, then multiple output columns are directly available.

      I will attempt to post a comparative screen shot later.

  2. Joel Gilbert

    Cache options:

    Joins offer the option to cache the “lookup” table or not (Automatic setting in Joins simply looks back to the Source Table editor in the data flow).

    Lookup functions offer an additional caching option – DEMAND_LOAD_CACHE.

    This option caches values from the Lookup Table one at a time. Very handy for scenarios where there is iterative use of lookup values and caching the whole table is not practical (and percentage of the lookup table that actually will be used is small enough to fit in cache, of course).


Leave a Reply