This blog continues from our blog on Identifying Bottlenecks, and is the second of three blogs in the Replication Server Performance Tuning series.

 

Counter Data Analysis

After Identifying Bottlenecks, the next step in the Replication Server Performance Tuning process is to analyze the collected counter data and draw up an effective tuning plan. If a counter value is unusually low or unusually high, it suggests that some settings or configurations in Replication Server require tuning.

During a normal run of Replication Server, counter data is collected in the form of monitoring data. With the help of this monitoring data, we can figure out what configurations or settings in Replication Server require tuning.

The primary function of counters is to identify the time distribution of components. For instance, counters tell you if a component is in a pending, busy, or wait state, if it is waiting upstream or downstream, or waiting for a memory resource. The eventual tuning plan is based around this discovery.

Analyzing Counter Data

An effective performance tuning plan is built around the time consumed by, or blocked on, a bottleneck component. Two common approaches to improving performance are to increase parallelism and to increase the apply rate. You also need to watch flow control. If the bottleneck component is frequently in a flow control state, check the flow control threshold, and increase it if it is set too low. For more information, see the Replication Server Administration Guide on SAP Help.

Different components have different counter data, and therefore, different configuration sets. From the performance tuning perspective, this implies that each component requires a specific tuning plan.

What is a component?

The replication path starts at the source (or primary) database, and ends at the target (or replicate) database. There are several internal Replication Server modules along the replication path. Each of these modules has a different set of functions, and takes on different responsibilities to facilitate the movement of data from primary to replicate. These modules are considered components, and are referred to by component names such as Capture, Distributor, SQT, SQM, DSI, and so on.

Counters and Time Duration Settings

Replication Server categorizes counters according to the following time durations:

  • Busy time: The time a component spends processing data. To reduce busy time, improve overall process efficiency (for example, through parallelism or reading from cache).
  • Wait upstream time: The time a component spends waiting on its upstream component. If this value is high, check the time distribution of the upstream component: whether it is busy, in a flow control state, or waiting for its own upstream.
  • Flow control time (wait downstream): Within the replication path, components can be either upstream or downstream. A component stops processing new data when its downstream is slow and the data cached in memory reaches the flow control threshold. The time spent in this flow control state is the flow control time. Increasing the flow control threshold reduces the overall flow control time.
  • Memory control time: The time a component spends waiting for memory. To avoid memory control, and thereby reduce memory control time, increase the memory_limit or reduce the flow control threshold.
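The categories above can be turned into a simple time distribution per component. The following Python sketch is illustrative only: the category keys, the sample values, and the function names are assumptions made for this example, not Replication Server output or API. It computes each category's share of the total time and returns the tuning hint for the largest share.

```python
from typing import Dict, Tuple

# Tuning hints paraphrased from this blog; the dictionary keys are
# hypothetical labels chosen for this sketch.
HINTS: Dict[str, str] = {
    "busy": "improve process efficiency (parallelism, read from cache)",
    "wait_upstream": "check the time distribution of the upstream component",
    "flow_control": "check the flow control threshold; increase it if too low",
    "memory_control": "increase memory_limit or reduce the flow control threshold",
}


def time_distribution(times_ms: Dict[str, float]) -> Dict[str, float]:
    """Return each category's share of the total time, as a percentage."""
    total = sum(times_ms.values())
    if total <= 0:
        return {name: 0.0 for name in times_ms}
    return {name: 100.0 * value / total for name, value in times_ms.items()}


def dominant_category(times_ms: Dict[str, float]) -> Tuple[str, str]:
    """Return the category with the largest share and its tuning hint."""
    shares = time_distribution(times_ms)
    name = max(shares, key=shares.get)
    return name, HINTS.get(name, "no hint defined")


# Hypothetical per-category totals (milliseconds) for one component.
sample = {
    "busy": 1200.0,
    "wait_upstream": 300.0,
    "flow_control": 4500.0,
    "memory_control": 0.0,
}
shares = time_distribution(sample)
category, hint = dominant_category(sample)
print(f"{category}: {shares[category]:.1f}% -> {hint}")
```

In this made-up sample the component spends most of its time in flow control, so the hint points at the flow control threshold rather than at the component's own processing.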

Using Collected Counter Data to Tune Replication Server

As described above, counters show how the time of each component is distributed across busy, wait upstream, flow control, and memory control states, and the tuning plan is based on this distribution. The following sections list the important time counters for each component.


EXEC Module

The following table describes important time counters in the Executor category:

Counter category | Counter | Description
Busy time | RepAgentRecvTime | The amount of time, in milliseconds, spent receiving network packets or language commands.
Busy time | RepAgentParseTime | The amount of time, in milliseconds, spent parsing commands.
Busy time | RepAgentNrmTime | The amount of time, in milliseconds, spent normalizing commands.
Busy time | RepAgentPackTime | The amount of time, in milliseconds, spent packing commands.
Busy time | PRSNRMParseTime | The amount of time, in milliseconds, spent parsing commands by PRS threads.
Busy time | PRSNRMNrmTime | The amount of time, in milliseconds, spent normalizing commands by Normalization (NRM) threads.
Busy time | PRSNRMPackTime | The amount of time, in milliseconds, spent packing commands by Normalization (NRM) threads.
Memory control time (wait time) | RAWaitMemTime | The amount of time, in milliseconds, the RepAgent thread spent waiting for memory usage to fall below the memory control threshold.
Flow control time (wait downstream) | RAWriteWaitsTime | The amount of time, in milliseconds, the RepAgent spent waiting for the SQM Writer thread to drain outstanding write requests until the number of outstanding bytes to be written falls below the threshold.
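To compare categories for the Executor, the individual counters in the table above can be summed per category. The sketch below is a hedged illustration: the counter names come from the table, but the sample values, the category labels, and the helper name totals_by_category are assumptions for this example, and how you export the counter values from Replication Server is not shown.

```python
from collections import defaultdict
from typing import Dict

# Counter-to-category mapping taken from the Executor table above.
# The category labels are hypothetical names used only in this sketch.
EXEC_COUNTER_CATEGORY: Dict[str, str] = {
    "RepAgentRecvTime": "busy",
    "RepAgentParseTime": "busy",
    "RepAgentNrmTime": "busy",
    "RepAgentPackTime": "busy",
    "PRSNRMParseTime": "busy",
    "PRSNRMNrmTime": "busy",
    "PRSNRMPackTime": "busy",
    "RAWaitMemTime": "memory_control",
    "RAWriteWaitsTime": "flow_control",
}


def totals_by_category(samples_ms: Dict[str, float]) -> Dict[str, float]:
    """Sum counter values (milliseconds) per time category."""
    totals: Dict[str, float] = defaultdict(float)
    for counter, value in samples_ms.items():
        category = EXEC_COUNTER_CATEGORY.get(counter)
        if category is None:
            continue  # skip counters that are not in the table
        totals[category] += value
    return dict(totals)


# Made-up sample values, not real monitoring data.
exec_samples = {
    "RepAgentRecvTime": 400.0,
    "RepAgentParseTime": 250.0,
    "RepAgentNrmTime": 150.0,
    "RepAgentPackTime": 100.0,
    "RAWaitMemTime": 50.0,
    "RAWriteWaitsTime": 2000.0,
    "SomeOtherCounter": 999.0,  # ignored: not a time counter from the table
}
exec_totals = totals_by_category(exec_samples)
print(exec_totals)
```

The same mapping approach applies to the other component tables in this blog; only the dictionary of counter names changes.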

 

Capture Module

The following table lists important time counters in the Capture (CI) category:

Counter category | Counter | Description
Busy time | PrsTotalTime | The amount of time, in milliseconds, a parser thread spent processing packages.
Memory control time | CAPMemWaitTime | The amount of time, in milliseconds, Capture spent waiting for memory usage to fall below the memory control threshold.
Flow control time (wait downstream) | WriteWaitsTime | The amount of time, in milliseconds, taken by the Message Delivery (MD) Module of the Distributor Thread (DIST) to wait for SQM writes.
Wait upstream time | RecvTime | The amount of time, in milliseconds, spent receiving packages from the Client Interface (CI) stream.

 

SQT

The following table lists important time counters in the Stable Queue Transaction Manager (SQT) category:

Counter category | Counter | Description
Busy time | SQTAddCacheTime | The time taken by a Stable Queue Transaction Manager thread (or the thread running the SQT library functions) to add messages to the SQT cache, measured in milliseconds.
Busy time | SQTDelCacheTime | The time taken by an SQT thread (or the thread running the SQT library functions) to delete messages from the SQT cache, measured in milliseconds.
Busy time | SQTParseTime | The amount of time spent by SQT parsing commands, measured in milliseconds.
Busy time | SQTXactHashSearchTime | The time taken by an SQT thread to search for a transaction ID in the hash table, measured in milliseconds.
Busy time | SQTXactProfileTime | The time taken by an SQT thread to profile transactions, measured in milliseconds.
Wait upstream time | SQTReadSQMTime | The time taken by an SQT thread (or the thread running the SQT library functions) to read messages from SQM, measured in milliseconds. It includes the wait for upstream time (if no data is available) and the time to read the data out.

 

DIST

The following table lists important time counters in the Distributor (DIST) category:

Counter category | Counter | Description
Wait upstream time | DISTReadTime (parallel_dist off) | The amount of time taken by the Distributor Thread to read a command from the SQT cache, measured in milliseconds.
Wait upstream time | DISTSQTTranWaitsTime (parallel_dist on) | The amount of time taken by the poll task of the Distributor Thread (DIST) to wait for an SQT transaction to be ready, measured in milliseconds.
Flow control time (wait downstream) | DISTMDWriteWaitsTime | The amount of time taken by the Message Delivery (MD) Module of the Distributor Thread (DIST) to wait for SQM writes, measured in milliseconds.
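Because the wait upstream counter for DIST depends on whether parallel_dist is on or off, an analysis script needs to pick the right counter before comparing values. A minimal Python sketch, assuming the counter values are already available as a dictionary (the function names and sample numbers are hypothetical):

```python
from typing import Dict


def dist_wait_upstream_counter(parallel_dist_on: bool) -> str:
    """Pick the DIST wait upstream counter named in the table above."""
    return "DISTSQTTranWaitsTime" if parallel_dist_on else "DISTReadTime"


def dist_wait_upstream_ms(counters_ms: Dict[str, float],
                          parallel_dist_on: bool) -> float:
    """Return the wait upstream time (ms), or 0.0 if the counter is absent."""
    return counters_ms.get(dist_wait_upstream_counter(parallel_dist_on), 0.0)


# Made-up sample values, not real monitoring data.
dist_counters = {"DISTReadTime": 120.0, "DISTMDWriteWaitsTime": 900.0}
wait_off = dist_wait_upstream_ms(dist_counters, parallel_dist_on=False)
wait_on = dist_wait_upstream_ms(dist_counters, parallel_dist_on=True)
print(wait_off, wait_on)
```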

 

DSI/S

The following table lists important time counters in the DSI – Scheduler (DSI/S) category:

Counter category | Counter | Description
Busy time | DSILoadCacheTime | Time spent by the DSI – Scheduler (DSI/S) in loading the SQT cache, measured in milliseconds.
Busy time | DSIThrdCmmtMsgTime | Time spent by the DSI/S in handling a “Thread Commit” message from its associated DSI/S threads, measured in milliseconds.
Busy time | DSIThrdSRlbkMsgTime | Time spent by the DSI/S in handling a “Thread Single Rollback” message from its associated DSI/S threads, measured in milliseconds.
Busy time | DSIThrdRlbkMsgTime | Time spent by the DSI/S in handling a “Thread Rollback” message from its associated DSI/S threads, measured in milliseconds.
Wait upstream time | DSISqmMsgQWait | Time spent by the DSI/S in handling a “SQM notify Message Read Wait Time” message from its associated DSI/S threads, measured in milliseconds.
Wait downstream time | DSIDSIeMsgQWait | Time spent by the DSI/S in handling a “DSIe Message Read Wait Time” message from its associated DSI/S threads, measured in milliseconds.
Memory control time | DSIWaitMemTime | Time spent by the DSI/S waiting for memory usage to fall below the memory control threshold specified for DSI, measured in milliseconds.
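Once the dominant time category is known for each component, the search for the bottleneck can follow the replication path: a component that mostly waits upstream points to its upstream neighbor, and a component that is mostly in flow control (wait downstream) points to its downstream neighbor. The Python sketch below illustrates that walk. The path order, the category labels, and the sample states are assumptions for this example only; your actual path depends on your configuration.

```python
from typing import Dict, List


def locate_bottleneck(path: List[str],
                      dominant: Dict[str, str],
                      start: str) -> str:
    """Follow wait upstream / flow control states along the path.

    path     -- component names ordered from upstream to downstream
    dominant -- dominant time category per component
    start    -- component where the investigation begins
    """
    index = path.index(start)
    visited = set()
    while index not in visited:
        visited.add(index)
        state = dominant[path[index]]
        if state == "wait_upstream" and index > 0:
            index -= 1  # the upstream neighbor is holding this one back
        elif state == "flow_control" and index < len(path) - 1:
            index += 1  # the downstream neighbor is holding this one back
        else:
            return path[index]  # busy, memory control, or end of the path
    # Neighbors point at each other: stop at the last component visited.
    return path[index]


# Hypothetical path and dominant states, not real monitoring data.
example_path = ["EXEC", "SQT", "DIST", "DSI/S"]
example_dominant = {
    "EXEC": "flow_control",
    "SQT": "flow_control",
    "DIST": "flow_control",
    "DSI/S": "busy",
}
bottleneck = locate_bottleneck(example_path, example_dominant, "EXEC")
print(bottleneck)
```

In this made-up example every component upstream of DSI/S is in flow control, so the walk ends at DSI/S, whose busy time is the place to start tuning.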

 

 

Up next!

The third and final blog in the Replication Server Performance Tuning series is: Modifying Memory-Related Configurations & Settings based on Counter Data Analysis.
