Common Wait Events in a Real Application Clusters

Difference in single instance and RAC instance:

In a single database instance environment, there will be only one set of shared memory segments for the database. In other words, the buffer cache and shared pool are local to that particular instance; at any point in time the processes attached to that particular instance will be accessing only one set of memory structures.

However, in the RAC environment, multiple instances share the same database. The buffer cache is split among these multiple instances; each instance has its own buffer cache and other well-known SGA structures. It is possible for a buffer to be in the buffer cache of one of the instances on another node. The remote instance’s Lock Manager Service (LMS) processes will be accessing the global buffer cache, and the DBWR process will be accessing the local buffer cache.

Global Buffer Cache in Real Application Clusters

The data buffer cache is shared by more than one instance, and the buffer cache is called the global cache. Each instance has its own buffer cache local to that particular instance, and all of the buffer caches together create the global cache.

Global Cache waits:

The most common wait events in the RAC environment related to the global cache are:

1. Global cache cr request

2. Global cache busy

1. Global Cache cr Request:

When a process requires one or more blocks, Oracle first checks whether it has those (blocks) in its local cache. When a session requests a block(s) that was not found in its local cache, it will request that the resource master grant shared access to those blocks. If the blocks are in the remote cache, then the blocks are transferred using the interconnect to the local cache. The time waited to get the blocks from the remote cache is accounted in the global cache cr request wait event.

Global cache cr request waits are followed by the db file sequential/scattered read waits. Excessive waits for global cache cr request may be an indication of a slow interconnects.

You can use the oradebug ipc command to verify if the private network is used for cache transfer between instances.

SQL> oradebug setmypid

Statement processed.

SQL> oradebug ipc

Information written to trace file.

SQL> oradebug tracefile_name

/apps/oracle/asm/product/10.2.0/db_1/admin/RAC/udump/rac1_ora_4749.trc

SQL> host

[oracle@rac1 bin]$ cat /apps/oracle/asm/product/10.2.0/db_1/admin/RAC/udump/rac1

_ora_4749.trc  (trace file content the detail of the IPC configuration)

admno 0x2b360ab admport:

SSKGXPT 0xcc1df2c flags SSKGXPT_READPENDING     info for network 0

socket no 7     IP 192.168.1.134        UDP 27278

sflags SSKGXPT_UP

info for network 1

socket no 0     IP 0.0.0.0      UDP 0

sflags SSKGXPT_DOWN

active 0        actcnt 1

context timestamp 0

From the trace file, it is clear that the private network 192.168.1.134 is used for the cache fusion traffic and the protocol used is UDP.

Global Cache Statistics

The statistics related to global cache are available from the V$SYSSTAT view.

SQL> select name,value

2  from v$sysstat

3  where name like ‘gc%’;

NAME                                                                  VALUE

————————————————————— ———-

gcs messages sent                                                     7064

gc cr blocks served                                                   198

gc cr block build time                                               2

gc cr block flush time                                               38

gc cr block send time                                               17

gc current blocks served                                         3509

gc current block pin time                                        10

gc current block flush time                                      7

gc current block send time                                     174

gc cr blocks received                                               178

gc cr block receive time                                          357

gc current blocks received                                     118

gc current block receive time                                195

gc blocks lost                                                            0

gc claim blocks lost                                                  0

gc blocks corrupt                                                     0

gc CPU used by this session                                  2223

17 rows selected.

2. Global Cache Busy

In a cluster environment, each instance might be mastering its own set of data. Depending on the state of that buffer in the remote instance, the pin times may increase proportionally with the time it takes to service the block request.

If the buffer is busy at the cache level, you can use the following SQL to get the operations that are keeping the buffer busy.

SQL> select a.kcbwhdes “module”,
2   b.why0 “calls”,
3   b.why2 “waits”,
4  b.other_wait “caused waits”
5  from  x$kcbwh a,x$kcbsw b
6   where a.indx = b.indx
7  and b.other_wait > 0
8  order by b.other_wait;

module                         calls          waits      caused_waits
—————————————————————- —-
kdswh02: kdsgrp       53440      0              2

kdswh01: kdstgr        4402        0              2

Common Operations Causing buffer busy waits

Module: Definition

Kdsgrp: Performs get row piece. Typically, row pieces are affected only in case of chained and migrated rows. Row chaining has to be analyzed and fixed.

Kdstgr: Performs full table scan get row. Rows are accessed by a full table scan. Check the number of FULL table scans

Kdiixs: Perform index range scan

About the Author Brijesh

I'm 10gR2 RAC Oracle certified professional, working from last 6 year in the field of database.

Leave a Comment:

5 comments
Add Your Reply