– First Station: Know your tools.
The first stop in this performance journey is to know the tools available for each individual assessment. Each one gives you precise help in a particular area and provides valuable information that you will later use to draw an initial conclusion and then make the desired change in the specific component. Once you are familiar with all the members of the workload analysis and tuning tool family, the ASV process approach can be started. The following standard tools included in the NetWeaver platform are the ones I have chosen as my Top 10 list:
1. Workload Monitor (ST03N)
2. Tune Buffers Monitor (ST02)
3. Statistical Records (STAD)
4. SQL Trace Analysis (ST05)
5. Workprocess Overview (SM50)
6. Operating System Monitor (ST06)
7. Table Call Statistics (ST10)
8. Database Activity (ST04)
9. Profile Maintenance (RZ10)
10. CCMS Monitoring (RZ20)
Let's start with a short description and a brief functionality overview of each one:
1. Workload Monitor – ST03/ST03N
In the picture above you can see the main screen of the Workload Monitor. In general, the Workload Overview (right panel) is the usual starting point for the root cause analysis of a bottleneck in the system. This part of the tool breaks the workload down by Task Type and shows, for each one, values such as the Number of Steps, Avg. Response Time, Avg. CPU Time, Avg. Database Time, Avg. Wait Time and more. The goal here is to spot a particular task type, look at its overall response time and check whether the value for a given period exceeds the allowed threshold (as a rule of thumb, for task type Dialog the Avg. Response Time should stay below 1000 ms). If it does, deeper investigation is needed to understand where the performance problem is located. Is it in the database? Is it in the CPU? Is the network affecting your response times? Sometimes you may also be asking other questions, such as:
– Is the program performing poorly?
– Do we need to add more memory?
– Do we need to add more work processes? If so, how many?
– Do we need to change a parameter? If so, which one?
These are only a few of the questions we need to address, and the Workload Monitor will help us answer them!
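The threshold check described above can be sketched in a few lines of Python. This is only an illustration, not something ST03N provides: the task type figures are invented, the field names are my own, and the only threshold taken from the text is the 1000 ms rule of thumb for Dialog.

```python
# Hypothetical averages in milliseconds, as one might copy them by hand
# from the Workload Overview. All numbers are made up for illustration.
workload = {
    "Dialog":     {"response": 1450.0, "cpu": 210.0, "db": 980.0, "wait": 60.0},
    "Background": {"response": 5200.0, "cpu": 2400.0, "db": 2100.0, "wait": 10.0},
}

# Only the Dialog threshold comes from the rule of thumb in the text.
thresholds = {"Dialog": 1000.0}


def dominant_component(row):
    """Return the time component (cpu, db, wait) with the largest average."""
    parts = {name: value for name, value in row.items() if name != "response"}
    return max(parts, key=parts.get)


def over_threshold(workload, thresholds):
    """List (task type, avg response, dominant component) above threshold."""
    findings = []
    for task_type, limit in thresholds.items():
        row = workload[task_type]
        if row["response"] > limit:
            findings.append((task_type, row["response"], dominant_component(row)))
    return findings


findings = over_threshold(workload, thresholds)
for task_type, response, component in findings:
    print(f"{task_type}: {response:.0f} ms avg response, largest share: {component}")
```

With these sample numbers the Dialog task type is flagged and the database time is the largest share, which would point the analysis toward the database first.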
2. Tune Buffers Monitor – ST02
A buffer is a memory segment in which data is temporarily stored. The buffer lets processes work with the information more quickly; the main goal is to avoid reading data from a slow medium such as a disk drive, because information already located in the buffer is accessed much faster. In NetWeaver, there are several different buffers. Each one stores a specific type of data, and the objective is to reduce the number of database accesses to a minimum. These buffers exist locally on every Application Server and are implemented as shared memory segments in one or more shared memory pools, depending on the operating system. These buffers are:
– Program buffer: This buffer stores the compiled executable version of the ABAP programs, also known as program loads.
– CUA buffer: This buffer stores menu data, buttons and related SAPGui functionality.
– Screen buffer: This buffer stores the screens that are already generated.
– Calendar buffer: This buffer stores the factory and user-defined holiday calendars.
– Generic key table buffer: This buffer stores table entries and can also store the entire table, which is then called full table buffering.
– Single record key buffer: This buffer stores only a single entry for a particular table with its corresponding fields.
– Export and Import buffer: This buffer is used to store data that needs to be available to several processes, using the ABAP statements EXPORT TO SHARED BUFFER and IMPORT FROM SHARED BUFFER in the ABAP program code.
Others are the name table buffers, which contain the field and table definitions that are active in the Data Dictionary. The name table is implemented in two different database tables: DDNTT for table definition entries and DDNTF for field description entries. The associated buffers are:
– Table definition buffer: Memory segment for table DDNTT.
– Field description buffer: Memory segment for table DDNTF.
– Short NTAB buffer: This is a summary of the Table definition and Field description buffers.
– Initial records buffer: This buffer stores the record layout, initialized depending on the field type.
With the help of this monitoring tool, you will be able to tune all memory buffer parameters individually. Every buffer has two parts, the Buffer Size and the Buffer Entries.
Buffer Size: This is the actual size of the memory segment. By using the correct profile parameter, you can change this value for every buffer. It is also divided into allocated space and free space.
Buffer Entries: The number of buffer entries controls how many objects can be stored in the buffer. You can have sufficient free space, but if you run out of directory entries, new objects will not be placed in the buffer and the free space will not be used.
The quality of a buffer is measured by the %Hit Ratio. This value indicates whether the information requested from the buffer, such as table entries, programs and screens, is found in the buffer itself or whether the system has to bring that data from the database because it was not found in the buffer. The %Hit Ratio changes over time. For instance, right after you start the system the %Hit Ratio will be below the recommended value until there is some activity in the system and the buffer starts to fill up with data. A well-performing buffer will have a %Hit Ratio of 95% and above (99%-100% is excellent). Keep in mind, though, that a value lower than 95% does not always mean that you have a problem. It is a pointer to where an analysis should start, and other factors can also lower the %Hit Ratio.
Another important piece is Buffer Swapping. This is a completely different story. When high swapping occurs in a buffer, performance is degraded.
When the information needed by a work process is read from the database and then put into a buffer that is already full, old information in the buffer needs to be removed (swapped out) to make room for the new information. Two factors play a role here, Buffer Size and Buffer Entries (also known as Directory Entries). If either of them runs out of space, swapping occurs. We also need to keep in mind that some swapping is normal and doesn't hurt the system. As a rule of thumb, you don't need to worry below 1000 swaps in a particular buffer. But always check the %Free Space and %Free Directory Entries as well. Good values are up to 85% used space.
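The rules of thumb for ST02 (hit ratio of 95% and above, fewer than 1000 swaps, no more than 85% used space) can be combined into a small check. The following Python sketch is an assumption-laden illustration: the function names and sample figures are hypothetical, the hit ratio is computed with the simple hits divided by requests formula, and applying the 85% limit to directory entries as well as to space is my own reading of the text.

```python
def hit_ratio(buffer_hits, total_requests):
    """Percentage of requests answered from the buffer (simple definition)."""
    if total_requests == 0:
        return 0.0
    return 100.0 * buffer_hits / total_requests


def assess_buffer(hits, requests, swaps, free_space_pct, free_dir_pct):
    """Apply the rules of thumb from the text and collect points to review."""
    notes = []
    ratio = hit_ratio(hits, requests)
    if ratio < 95.0:
        notes.append("hit ratio below 95%: starting point for analysis")
    if swaps >= 1000:
        notes.append("1000 or more swaps: check buffer size and entries")
    # "Up to 85% used" is the same as "at least 15% free".
    if free_space_pct < 15.0:
        notes.append("more than 85% of buffer space used")
    # Assumption: the same 85% limit is applied to directory entries.
    if free_dir_pct < 15.0:
        notes.append("more than 85% of directory entries used")
    return ratio, notes


# Invented sample values for two buffers.
healthy = assess_buffer(hits=9900, requests=10000, swaps=50,
                        free_space_pct=40.0, free_dir_pct=60.0)
stressed = assess_buffer(hits=9000, requests=10000, swaps=2500,
                         free_space_pct=5.0, free_dir_pct=30.0)
print(healthy)
print(stressed)
```

As the text points out, a note about the hit ratio is only a hint about where to look, not proof of a problem, for example shortly after a system restart.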
3. Statistical Records – STAD
The statistical records collect information individually for each transaction step, such as response times, database times, network times, wait times, front-end times and more, and store that data in a flat file at the operating system level known as the statistical file. This tool helps you understand in detail where the performance problem is located by analyzing the transaction activity step by step. It shows information such as how many database records were selected, updated or inserted and in which database tables (if activated), which program was executed, which screen name and screen number were called, and so forth. With the Statistical Records you will be able to see in which steps the problem appears for a transaction with a high average response time. You will then know how to address that specific performance issue.
4. SQL Trace Analysis – ST05
My next tool in the top 10 list is the great SQL Trace Analysis. This is like magical medicine. Did you ever see thousands of programs endlessly doing sequential reads on database tables, over and over? I'm almost sure the answer is yes! Well, with ST05, those long-running queries hitting the database and selecting millions and millions of useless records are a thing of the past. You can trace all the activity for a user and for every program. The output shows the SQL statement; how many records it selects and brings back from the database; the DECLARE, PREPARE, OPEN, REOPEN, CLOSE and FETCH operations recorded during the trace, which are of great help later on when performing the analysis; the execution plans; index advice; sorting of similar or duplicated statements; sorting per table; and much more.
I will tell you a story. Last month I was working on a project. The functional team, with the help of the development team, was enhancing some R/3 functionality in a customer system, and they were adding more information to the reports that some of the general managers worked with every day. Those managers ran their reports every morning, and in less than a minute the rich ALV output was shown on their laptop screens. Up to then, the management people were pretty happy. The development was ready, so one night IT decided to move the enhanced reporting program to production (it had been successfully tested in DEV and QA a few days earlier). The next morning I received more than 100 calls from management telling me that those reports were taking forever to complete, and they asked what had changed. My first thought was, "It's common, give it some time for the report to be completed, grab some coffee, have a little patience", but that was only in my thoughts. So, I decided to start working on the issue.
I picked up the favorite tool that always comes to my mind, SM50, the Work Process Overview (the best of all), and I saw it: 90,000 seconds of Sequential Read on table BKPF. I hit F5 almost 100 times, and the counter went to 90,001, 90,002, 90,003 ... 90,099, 90,100, and the Sequential Read on BKPF was still there. So the next step was to turn to my second best tool, ST05, the SQL Trace Analysis. I called one of the general managers and asked him to execute the report again (I cancelled the old running report first), and I activated the SQL Trace for that user ID. The report was running again, and when it got stuck reading BKPF sequentially, I let it run for another 5 minutes and then I stopped the trace. Now, here's where the story gets interesting. When I chose to display the contents of the SQL trace file, a popup message asked if I wanted to display more than the 20,000 entries from the trace file, and I thought to myself, "Something is really wrong here. 20,000 entries in less than 5 minutes of trace?" I selected Yes, and it took about a minute to display the trace list. Then I saw it: every entry was performing a FETCH on BKPF. I selected one row to display the SQL statement and realized that the actual SQL command was selecting every row from BKPF (at that point the BKPF table had more than 45,000,000 rows) without using any valid index, and, what's worse, the cause was in the WHERE clause. So, I pushed the button to jump directly to the ABAP code where the actual SELECT statement was defined. The ABAP code selecting data from BKPF had been changed, and the WHERE clause was now using some sort of internal table in a loop to get the results. Pretty bad. I called my friend the developer: "Hey, come to my desk. I want to show you something." He said to me, "Why?" I said, "Your user ID is on the last modification of that ABAP report." Then he reviewed the report with me and decided to change the SELECT statement to make it much more selective.
After we moved the change to production, the problem was solved. The report now takes less than 30 seconds to complete. The management people were happy again. That was the end of the story, and all thanks to the SQL Trace.
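To make the difference between a full sequential read and an index-supported read more concrete, here is a minimal sketch. It does not use ST05 or a real SAP database: it assumes Python's built-in sqlite3 module and a toy table that only borrows the name BKPF and a few of its column names. The index name and the sample rows are invented for the illustration.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return the detail text of SQLite's EXPLAIN QUERY PLAN for a statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # Each row is (id, parent, notused, detail); the detail is the readable part.
    return " | ".join(row[3] for row in rows)


conn = sqlite3.connect(":memory:")
# Toy stand-in for an accounting document header table (not the real BKPF).
conn.execute(
    "CREATE TABLE bkpf (bukrs TEXT, belnr TEXT, gjahr INTEGER, budat TEXT)"
)
conn.executemany(
    "INSERT INTO bkpf VALUES (?, ?, ?, ?)",
    [("1000", f"{n:010d}", 2008, "2008-01-15") for n in range(1000)],
)

sql = "SELECT * FROM bkpf WHERE bukrs = ? AND gjahr = ?"

# Without a matching index the database has to read the whole table.
plan_without_index = query_plan(conn, sql, ("1000", 2008))

# With an index on the WHERE clause columns it can search instead of scan.
conn.execute("CREATE INDEX bkpf_bukrs_gjahr ON bkpf (bukrs, gjahr)")
plan_with_index = query_plan(conn, sql, ("1000", 2008))

print(plan_without_index)
print(plan_with_index)
```

The first plan reports a scan of the table and the second one a search using the index. An execution plan shown for a statement in the SQL trace lets you make the same kind of comparison for the real database.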
5. Workprocess Overview – SM50
And now, let's discuss the Work Process Overview. I think this is the one top tool every administrator, developer and consultant needs to be familiar with, and you surely are already. If this is not the case, let me introduce it a little bit. SM50 is the main process monitor. From this screen you will be able to see almost everything that is currently running in your NetWeaver system. You can also see detailed information on a particular running process, view the developer trace and the dispatcher trace, and change the trace level and the components to be traced. When you are working on a performance issue, or even if you are just analyzing something, SM50 will help. I will show you practical examples on the next trip. As a preview, from this screen you can see whether a process is doing a Sequential or Direct read on a database table, which user is currently running which report, how long it has been running, and so on. In the details screen, you can see information such as how many records were written, read, inserted or deleted, and the current SQL statement or procedure. Well, there is a lot to talk about, but I want to show you how to use this with practical examples, so stay tuned for Trip II.
6. Operating System Monitor – ST06
Another piece in the list is the OS Monitor. This application provides all the operating system values related to CPU utilization, disk drives, network, OS swapping and others by means of the OS Collector (saposcol) service. With this tool, you can see whether the response time of a particular drive is excessively high or whether the disk drives are performing well. I use this tool to understand whether a performance issue needs to be tackled from a hardware bottleneck perspective. Is the system paging heavily? There is a rule of thumb: paging will not be critical if, for instance, less than 20%-30% of the main memory is being paged out. You can see the history of memory utilization and draw your own conclusions. If you are analyzing the database server, keep in mind that every request from the other application servers is handled by the database server hardware. If that server is performing poorly, it will cause poor response times in the whole system. For this reason it is good practice to have CCMS alerts configured to monitor CPU utilization, disk response times and memory paging. With the help of these three monitoring objects, you can have a real-time picture of what is going on at the hardware level in the system. The Disk monitor is also an important part of the OS Monitor. From the Disk monitor you can check the response times of every disk or logical drive. This is particularly important on the database server, since every database operation has an impact on those response times. As a rule, if you have more than 50%-60% disk drive utilization, start a more in-depth analysis, since overall system performance will be affected by these slow drives.
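The two rules of thumb above (paging of less than 20%-30% of main memory, disk utilization above 50%-60%) can be written down as a small check. This Python sketch is only an illustration under my own assumptions: the function names, disk names and sample values are hypothetical, and the lower end of each range from the text is used as the default limit.

```python
def paging_is_critical(paged_out_mb, main_memory_mb, limit_pct=20.0):
    """True when the paged-out amount reaches limit_pct of main memory."""
    return 100.0 * paged_out_mb / main_memory_mb >= limit_pct


def disks_to_investigate(utilization_by_disk, limit_pct=50.0):
    """Names of disks whose utilization (in percent) is above limit_pct."""
    return sorted(
        disk for disk, used_pct in utilization_by_disk.items()
        if used_pct > limit_pct
    )


# Invented sample values, as one might read them from the OS Monitor.
print(paging_is_critical(paged_out_mb=1024, main_memory_mb=16384))
print(disks_to_investigate({"disk0": 35.0, "disk1": 72.5, "disk2": 55.0}))
```

Both limits are parameters on purpose, since the text gives ranges rather than exact values and the right number depends on the system being analyzed.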
7. Table Call Statistics – ST10
With the Table Call Statistics transaction, you will be able to see detailed information regarding tables and the table buffer status. In NetWeaver, there are several different buffers. In this application we will work directly with the Table Buffers. As you should already know, when a table is buffered its contents are located in a memory segment in the shared memory pool, local to the application server, and the table information is read much faster from the buffer. The overall goal is to reduce database accesses and disk times as much as possible. A read operation on a buffered table is around 80 times faster than accessing the table directly in the database. Do you remember that we already talked about buffers earlier in this blog? Well, the tables that have buffering enabled are stored in the Generic Key Table buffer and in the Single Record Key Table buffer. In the standard delivery of every NetWeaver component there are several tables that already have buffering enabled, but for a particular table you can also define the buffer settings and whether or not you want to allow buffering. There is also a rule of thumb: you should enable buffering for a particular table only if that table has more read operations than write operations. Otherwise, the table buffer will keep being invalidated by the write operations, and the %Buffer Hit Ratio will be below the recommended value, since after each insert, delete or update operation the buffered contents are no longer valid and the system needs to populate the buffer again from the database. From a performance perspective this is not desirable.
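The reasoning behind this rule of thumb can be shown with a toy model. The Python sketch below is not how the SAP table buffer is implemented; it simply assumes that any write invalidates the buffered copy of the table and that the next read has to reload it from the database, and then counts how many reads are served from the buffer.

```python
def simulate_hit_ratio(operations):
    """Hit ratio (percent) for a sequence of 'r' (read) and 'w' (write).

    Toy model: a write invalidates the buffered table, and the first read
    after that is a miss that reloads the buffer from the database.
    """
    loaded = False
    hits = 0
    reads = 0
    for op in operations:
        if op == "w":
            loaded = False      # the buffered copy is no longer valid
        else:
            reads += 1
            if loaded:
                hits += 1       # served from the buffer
            else:
                loaded = True   # miss: reload from the database
    return 100.0 * hits / reads if reads else 0.0


read_mostly = simulate_hit_ratio("r" * 100)    # e.g. a customizing table
write_heavy = simulate_hit_ratio("rw" * 50)    # every read follows a write
print(read_mostly, write_heavy)
```

In this model the read-mostly table reaches a 99% hit ratio, while the table that is written between every read never gets a single hit, which is why buffering it would only add overhead.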
8. Database Activity – ST04
The Database Monitor shows specific information about the current performance of the database. Almost everything going on in the database is presented here: data buffer allocation, Hit Ratio, DB connections, CPU times, index utilization, database file status and utilization, and more. This tool is another key piece in the performance world. When analyzing database accesses, a good approach is to take a closer look at this transaction. Detailed table analysis can also be performed from here. For a particular table, you can see the fragmentation level and whether that table needs reorganization.
9. Profile Maintenance – RZ10
From the Profile Maintenance screen we will be able to change the system parameters. In our case, we will change the performance-related parameters, and I will show you how to do so with specific recommendations. The following is a list of the most common parameters we will work with during our journey:
– Program buffer: abap/buffersize
– CUA buffer: rsdb/cua/buffersize
– Screen buffer: zcsa/presentation_buffer_area, sap/bufdir_entries
– Generic key table buffer: zcsa/table_buffer_area, zcsa/db_max_buftab
– Single record table buffer: rtbb/buffer_length, rtbb/max_tables
– Export/import buffer: rsdb/obj/buffersize, rsdb/obj/max_objects, rsdb/obj/large_object_size
– OTR buffer: rsdb/otr/buffersize_kb, rsdb/otr/max_objects
– Exp/Imp SHM buffer: rsdb/esm/buffersize_kb, rsdb/esm/max_objects, rsdb/esm/large_object_size
– Table definition buffer: rsdb/ntab/entrycount
– Field description buffer: rsdb/ntab/ftabsize, rsdb/ntab/entrycount
– Initial record buffer: rsdb/ntab/irbdsize, rsdb/ntab/entrycount
– Short nametab (NTAB): rsdb/ntab/sntabsize, rsdb/ntab/entrycount
– Calendar buffer: zcsa/calendar_area, zcsa/calendar_ids
– Roll, extended and heap memory: ztta/roll_area, ztta/roll_first, rdisp/ROLL_SHM, rdisp/PG_SHM, rdisp/PG_LOCAL, em/initial_size_MB, em/blocksize_KB, em/address_space_MB, ztta/roll_extension, abap/heap_area_dia, abap/heap_area_nondia, abap/heap_area_total, abap/heaplimit
– Workprocess Distribution: rdisp/wp_no_dia, rdisp/wp_no_btc, rdisp/wp_no_vb, rdisp/wp_no_vb2
10. CCMS Monitoring – RZ20
The CCMS Monitors enable us to understand, almost in real time, what is going on in a system for a monitored object. There are, as you can see in the picture above, several predefined monitors that you can use. The monitor set All Contents on Local Application Server is a good one, since when you activate it (double-click on it) it shows you the entire monitoring context of the local application server. You can also assign Auto-Reaction methods and your own analysis methods to every monitored object. Later I will show you how to do that and which objects are the best ones to monitor. In the meantime, you can give it a try for yourself. Go to RZ20 and play with it! As a brief description, to configure the CCMS agents and the email alerts to be sent from a Central System (CEN), it is necessary to complete the following tasks:
– Configure SMTP in the Central System (CEN) by enabling the built-in SMTP plug-in (as of BASIS release 6.x). This step will not be part of this blog. Please refer to OSS Note 455140 for installation instructions and additional information.
– Install the CCMS agents in each monitored system to enable the alerts and emails. This is necessary because CEN alerts are handled by the remote CCMS agents locally in every satellite system.
– Configure the alerts in the CEN system by enabling the emails, selecting the correct monitoring object and configuring the email destinations properly.
Don't worry! We will do this step by step in the upcoming trips. 😀
– Conclusions so far
We are at the end of our current trip. During this first journey I have shown you the Top 10 tools you will become familiar with throughout our trips. I will introduce the key parts of every one of them and tell you when to choose which. In the next trip, we will take them one by one and, with practical and real examples, learn how each of them has helped optimize response times in one particular area. See you in the next trip. Stay tuned!