Skip to Content
  • Ensure that most of the dataflows are optimized. Maximize the push-down operations to the database as much as possible. You can check the optimized SQL using below option inside a dataflow. SQL should start with INSERT INTO……SELECT statements…..

/wp-content/uploads/2014/03/1_411131.png

  • Split complex logics in a single dataflow into multiple dataflows if possible. This would be much easier to maintain in future as well as most of the dataflows can be pushed down.
  • If full pushdown is not possible in a dataflow then enable Bulk Loader on the target table. Double click the target table to enable to bulk loader as shown in below diagram. Bulk loader is much faster than using direct load.

/wp-content/uploads/2014/03/2_411133.png 

  • Right click the Datastore. Select Edit and then go to Advanced Option and then Edit it. Change the Ifthenelse Support to ‘Yes’. Note that by default this is set to ‘No’ in BODS. This will push down all the decode and ifthenelse functions used in the Job.

/wp-content/uploads/2014/03/3_411134.png

  • Index Creation on Key Columns: If you are joining more than one tables then ensure that Tables have indexes created on the columns used in where clause. This drastically improves the performance. Define primary keys while creating the target tables in DS. In most of the databases indexes are created automatically if you define the keys in your Query Transforms. Therefore, define primary keys in query transforms itself when you first create the target table. This way you can avoid manual index creation on a table.


  • Select Distinct: In BODS ‘Select Distinct’ is not pushed down. This can be pushed down only in case you are checking the ‘Select Distinct’ option just before the target table. So if you require to use select distinct then use it in the last query transform.


  • Order By and Group By are not pushed down in BODS. This can be pushed down only in case you have single Query Transform in a dataflow.


  • Avoid data type conversions as it prevents full push down. Validate the dataflow and ensure there are no warnings.


  • Parallel Execution of Dataflows or WorkFlows: Ensure that workflows and dataflows are not executing in sequence unnecessarily. Make it parallel execution wherever possible.


  • Avoid parallel execution of Query Transforms in a dataflow as it prevents full pushdown. If same set of data required from a source table then use another instance of the same Table as source.


  • Join Rank: Assign higher Join Rank value to the larger table. Open the Query Editor where tables are joined. In below diagram second table has millions of records so have assigned higher join rank. Max number has higher join rank. This improves performance.

/wp-content/uploads/2014/03/5_411135.png

  • Database links and linked datastores: Create database links if you are using more than one database for source and target tables (multiple datastores) or in case using different database servers. You can refer my another article on how to create the DB Link.  Click URL


  • Use of Joining in place of Lookup Functions: Use Lookup table as a source table and set as an outer join in dataflow instead of using lookup functions. This technique has advantage over the lookup functions as it pushes the execution of the join down to the underlying database. Also, it is much easier to maintain the dataflow.


Hope this will be useful.

To report this post you need to login first.

14 Comments

You must be Logged on to comment or reply to a post.

      1. Oleg Luchinskiy

        where useful option (Ifthenelse Support), but my DataStore created on DataBase type (Sybase IQ and SQL Server), they have no such option 🙁 . I will test new DataStore with ODBC type and this options appeared.Why it is not available when you select a particular type of database?

        (0) 
        1. Oleg Luchinskiy

          with ODBC DataStore works fine! thanx!!!

          Much needed option, but the type of DataStore not change ..

          To use this have a feature to create a new DataStore and use it

          thanx for article

          (0) 
  1. Ravi Kashyap

    I am working on the oracle database and my datastore are configured with the same ..

    I am not able to see the ifthenelse support option..

    Any Idea.. !

    (0) 
  2. Pat Baker

    Distinct, order by and group by will push down to most databases and with multiple query transforms. This is one of the areas performance from the DB can be utilised and can even be part of a generated ‘INSERT into SELECT …’.

    The use of indexes can also be very bad for query performance as the database looks to do nested loop instead of hash joins and should be looked at in a case by case basis along with an experienced DBA. Maintaining indexes during loads is also a major killer of performance and in many cases dropping and rebuilding can dramatically increase throughput.

    (0) 
  3. Ankit Kumar

    Very helpful !

    Can you suggest any optimization techniques for a scenario where we are loading huge data from 2 tables ( 227354374 and 46526852 records) by merging them and we’re concerned about the performance issue that it might create??

    (0) 
    1. Mohammad Ansari Post author

      Ankit Kumar

      Don’t use merge transform. Have a permanent target table and load data from both tables using separate dataflows in parallel. Make sure both dataflows are fully push down.

      Hope this helps.

      (0) 
    2. Pat Baker

      If the source data is partitioned or can be logically partitioned then matching the DOP to the number of partitions and the number of loaders can increase the throughput with the objects in the dataflow replicating by the DOP. Any functions in the dataflow will also need to be enabled for parallel execution. 

      Be careful using bulk loaders with parallel dataflow as they can lock the table being loaded, eg SQLSvr, and often using multiple loaders will give higher throughput.

      Disabling indexing and logging on the target db and partitioning the target table can also greatly influence load throughput.

      Some time should be spent analysing which are the limiting operation.

      Regards

          Pat

      (0) 
  4. DILIP KUMAR

    Good stuff

    one dought

    When do we go for lookup_ext and when do we go for join.as well as how join give performance than lookup

     

     

     

    (0) 

Leave a Reply