Skip to Content
Author's profile photo Bala Prabahar

Big Data – Chicago Inside Track 2011

When: July 15th

Details here: 




On July 15th, I would be discussing implementation details on how to partition a DB table. A majority of what I’ll discuss can be applied to any database that supports table level partitioning. I’ll use Oracle example however.

The diagram below shows the steps involved in partitioning large tables. 

Step 1: Identify Tables

Factors to be considered: Table Size, Performance Issues, Indexing Versus Partitioning, Global Versus Local indexes, Index only partitions, Archiving Strategy

Step 2: Discuss with Business

After identifying a list of tables to be partitioned, discuss with business users to come up with an idea on what columns would be good candidates for partitioning. 

Step 3: Determine Partition Key

Based on discussions with business users and analysis of tables, identify suitable Partition Key for each table.

Step 4: Run SQL Statement

In order to create partitions of equal size, run SQL statements to understand the data distribution for the partition key identified in Step 3.

Step 5: Determine the number of partitions

Depending on archiving strategy, business requirements, SAN Configuration, performance issues, determine either the number of records per partition and/or number of partitions you would like to create.

Step 6: Determine Upper Key Limit

Run a script to determine Upper Key Limits for each partition(Range Based Partitioning).

Step 7: Generate Create Table Statement (DB Specific)

This depends on DB you’re using. SE14 can be used; however there’re SAP specific constraints you need to consider.

Step 8: Load Data

There are several options to accomplish this step: Data Pump, CTAS etc.

Finally answer to what is common between two diagrams I posted in my blog titled “Let us discuss Big Data in Chicago Inside Track 2011!”:

First diagram shows a large database/table split into multiple smaller pieces. Second diagram shows US map containing states. And what is common between them: They’re not managed as one large object.  USA is divided into 50 states. Each state has its own government and each state is divided into multiple counties/cities etc. This scheme of governance is not something new to just USA but generally followed by all countries. Similarly if large tables are divided/partitioned, it would not only help improve the performance but also would be easier to manage (Operations such as drop partition instead of delete during archiving would be quicker). 

Assigned Tags

      Be the first to leave a comment
      You must be Logged on to comment or reply to a post.