When: July 15th
On July 15th, I would be discussing implementation details on how to partition a DB table. A majority of what I’ll discuss can be applied to any database that supports table level partitioning. I’ll use Oracle example however.
The diagram below shows the steps involved in partitioning large tables.
Step 1: Identify Tables
Factors to be considered: Table Size, Performance Issues, Indexing Versus Partitioning, Global Versus Local indexes, Index only partitions, Archiving Strategy
Step 2: Discuss with Business
After identifying a list of tables to be partitioned, discuss with business users to come up with an idea on what columns would be good candidates for partitioning.
Step 3: Determine Partition Key
Based on discussions with business users and analysis of tables, identify suitable Partition Key for each table.
Step 4: Run SQL Statement
In order to create partitions of equal size, run SQL statements to understand the data distribution for the partition key identified in Step 3.
Step 5: Determine the number of partitions
Depending on archiving strategy, business requirements, SAN Configuration, performance issues, determine either the number of records per partition and/or number of partitions you would like to create.
Step 6: Determine Upper Key Limit
Run a script to determine Upper Key Limits for each partition(Range Based Partitioning).
Step 7: Generate Create Table Statement (DB Specific)
This depends on DB you’re using. SE14 can be used; however there’re SAP specific constraints you need to consider.
Step 8: Load Data
There are several options to accomplish this step: Data Pump, CTAS etc.
Finally answer to what is common between two diagrams I posted in my blog titled “Let us discuss Big Data in Chicago Inside Track 2011!”:
First diagram shows a large database/table split into multiple smaller pieces. Second diagram shows US map containing states. And what is common between them: They’re not managed as one large object. USA is divided into 50 states. Each state has its own government and each state is divided into multiple counties/cities etc. This scheme of governance is not something new to just USA but generally followed by all countries. Similarly if large tables are divided/partitioned, it would not only help improve the performance but also would be easier to manage (Operations such as drop partition instead of delete during archiving would be quicker).