AWR reports (Part I) – 10 most important bits
- before/meantime/after reports – create 3 different reports which cover the timeframe during the problem was experienced and compare it with the timeframe before and after.
- duration of snapshot – choose a shorter timeframe to get a more precise report.
- retention time – set this to 42 days, though it must be checked if SYSAUX tablspace has enough freespace. Configure the retention period
- format – Choose html to be able to use the links, bookmarks inside the report.
For example in very processing-intensive SQL statements which are executed repeatedly, only read blocks from the buffer pool increases the hit rate of the buffer pool. After optimizing such statements the hit ratio decreased though performance improves.
Buffer Nowait – shows how often buffer cache were accessed with no wait time.
Buffer Hit – shows how often a requested block has been found in the buffer cache without requiring disk access.
Redo NoWait – shows if log_buffer size is set correctly. Preemptive redolog switches in Oracle 11.2
Parse CPU to Parse Elapsd – shows how much time was spent on parsing while waiting for resources.
Non-Parse CPU – in the following example the the figure is close to 100% meaning that the overall CPU usage is only 0.15 % for statement parsing.
5 Shared Pool Statistics – shows if there is an overhead on the system regarding shared pool.
The values should not be very high (preferably less than 75%).
6 Top 5 Timed Foreground Events – Shows top 5 wait events that are taking the most of time. The exact meaning of each and every event can be found here.
Based on what we found in here 2 other sections in the report must be checked.
- SQL statistics section – check if there is a lot of read and which SQL statements involved.
- IO stats section – check if there is I/O bottleneck.
In the following example a long time (70%) is spent for waiting on I/O related reads. There are 12,727 waits in 265 seconds (4 minutes) which is more significant than 3 million waits in 36,388 second (10 hours).
As well only 8% of the time has been spent on DB CPU. If this time is significant check the following points:
- Is the CPUT times(s), (in our example 5119 seconds) significant compared to the total CPU time?
Total CPU time = number of CPU(s) * snapshot time (in seconds). find NUM_CPUS in the “Operating System Statistics” section.
- Is there a SQL statement which takes most of the time, if yes check “SQL ordered by CPU Time” section.
7 SQL ordered by Elapsed Time – Shows which SQL statement runs for a longer time. Those statement needs to be focused that have less numbers of execution with high Elapsed Time per Exec (s). More over check if the % total is significant.
8 SQL ordered by CPU Time – contains information on which SQL statement takes the most CPU time
Total DB time
9 &10 IO Stats – Check if read and write of the datafiles /logs is taking longer. Section 6 reveals some information about the waits for I/O, wether this is a good number or not is dependent on the Hardware/OS. If the I/O is slow check “Tablespace IO Stats” and “File IO stats”. “Rd(ms)” columns must not exceed 20 otherwise it worths involving your OS team and hardware vendor to investigate the I/O bottleneck.
thanks for this helpfull posting.
I think there is a little typo in chapter 6: "There are 12 million waits in 265". I think there are only 12 thousands waits.
ratio based tuning like "the target must be close to 100%". I thought that we overcame this round about 20 years ago (especially in such a high tech company like SAP).
"Check the "Av Rd(ms)" columns it must not exceed 20 otherwise involve your OS team and hardware vendor to investigate the I/O bottleneck"
In your case there are 3.45 average blocks per read, which means that you perform some multi block I/Os that are probably much larger than that. Everybody would be happy to get that average response time with mixed multi block I/Os (e.g. up to 1 MB for most OS platforms). Just look at your second highlighted example of 21.04 ms in average - the average single block I/O performance is 2.16 ms, which is absolutely perfect for disk based storage (and even the 8.09 ms from the previous one are ok).
I wanted to wait one day before replying but I see that Stefan has done it better than I would have done.
I recommend to read the following "recent" to get a better understanding of "ratios"
and the following video with Graham Wood for a little background about AWR.
BTW, I do not understand the following:
An snapshot, by default, is 1 hour, so 😕
Hi all, thanks a lot for your comments, I have read them and changed the article slightly. Please let me know your opinions again.