Introduction
Execution of a SQL statement in an Oracle Parallel Server (OPS) environment is slightly different than in a single-instance environment. In an OPS environment, additional instructions are executed to obtain and release PCM (Parallel Cache Management) locks on the data blocks accessed by the SQL. To explain the processing that occurs, we give below a step-by-step description of the execution flow when a query or DML statement is issued in an OPS environment. We consider two simple cases:
- The PCM lock on the block(s) needed is readily available and no pinging takes place.
- The block(s) needed are locked by another instance in an exclusive mode:
  - Prior to Oracle8i, a disk-based ping is done.
  - From Oracle8i onwards, the holding instance sends the block to the requesting instance via the BSP process.
Case 1: SELECT/DML with No Ping
In this example we explain a simple Select statement (with no FOR UPDATE clause). The foreground process that executes the SQL statement on the user's behalf requests a SHARED lock on the accessed block(s). The execution of DML, i.e. SQL with Insert, Update, Delete, or Select For Update, is similar except that the lock mode requested by the foreground server process is EXCLUSIVE instead of SHARED.
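As a concrete illustration, the statements below show the two situations just described. The EMP table and the predicate are hypothetical placeholders; the point is only which PCM lock mode the foreground server process requests on the blocks the statement touches.

    -- Plain query: the PCM lock covering the accessed blocks is
    -- requested in SHARED mode.
    SELECT ename, sal
      FROM emp
     WHERE deptno = 10;

    -- SELECT ... FOR UPDATE and other DML: the same blocks are
    -- requested with the PCM lock in EXCLUSIVE mode.
    SELECT ename, sal
      FROM emp
     WHERE deptno = 10
       FOR UPDATE;

    UPDATE emp
       SET sal = sal * 1.1
     WHERE deptno = 10;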
1. The foreground server process checks the lock status maintained in the SGA to see whether the local instance already holds the resource covering the block it needs in SHARED mode. If so, the server process updates the list of blocks present in the buffer cache covered by that lock (the lock element buffer chain) and proceeds to read the block. If not, it continues with step 2.
2. The foreground server process requests a SHARED lock on the resource by sending a message to the LCK process.
3. LCK messages the DLM to acquire the lock in SHARED mode.
4. The DLM modifies the necessary structures in its own memory (i.e. resource and lock structures). This requires the DLM to perform a directory lookup to identify the node currently mastering the resource. If the mastering node is not local, the remote mastering node is messaged to obtain the lock in the appropriate mode.
5. The DLM returns an Acquisition Asynchronous System Trap (AAST) signal to LCK as notification that the lock request has been granted. The case where the request cannot be granted immediately because another instance already holds a lock is described in the next section.
6. LCK updates the local lock element structure in its SGA with the new lock mode for the particular resource covered by the lock element.
7. LCK notifies the server process that the request has been granted.
8. The server process reads the block from disk into the buffer cache and proceeds to execute the SQL statement.
Figure 1
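One rough way to observe the outcome of this flow is to query the OPS fixed views V$BH and V$LOCK_ELEMENT on the instance that ran the statement. The query below is only a sketch; the column names assume the Oracle8/8i versions of these views and may differ in other releases.

    -- Sketch: cached buffers together with the PCM lock element
    -- covering them and the mode in which it is held.
    SELECT bh.file#, bh.block#, bh.status, le.mode_held
      FROM v$bh bh, v$lock_element le
     WHERE bh.lock_element_addr = le.lock_element_addr
     ORDER BY bh.file#, bh.block#;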
Case 2(a): SELECT/DML with Ping (Pre-Oracle8i, i.e. without Cache Fusion)
The term "pinging" refers to exchange of blocks amongst instances. This occurs when a request is made for a block on one node while that block or some other block covered by the same DLM resource is held in the cache of some other node in a mode that is incompatible with requestor's mode.
In the example below (refer to figure 2), assume Node 1 has a resource locked in EXCLUSIVE mode. This resource covers several database blocks. Node 2 would like to acquire the same resource with a lock in SHARED mode to service a Select statement issued on that node which requires one or more of the blocks covered by the DLM lock. This causes the blocks that are dirty in the cache of node 1 to be written to disk before the lock request is granted to node 2.
1. The foreground server process checks the lock status maintained in the SGA to see whether the local instance already holds the resource covering the block it needs in SHARED mode. If so, the server process updates the list of blocks present in the buffer cache covered by that lock (the lock element buffer chain) and proceeds to read the block. If not, it continues with step 2.
2. The foreground server process messages LCK on node 2 to acquire the lock on the resource in SHARED mode.
3. LCK on node 2 calls the local DLM to acquire the lock in SHARED mode.
4. The DLM on node 2 creates DLM structures in its local memory area.
5. The DLM determines the Directory Tree Node, which manages the resource. The Directory Tree Node may be any node in the cluster.
6. The Directory Tree Node notifies the DLM on node 2 that node 1 is currently mastering the resource.
7. The DLM on node 2 requests a SHARED lock on the resource from the DLM on node 1.
8. Since node 1 currently holds the lock in EXCLUSIVE mode, the DLM sends a Blocking AST signal to LCK on node 1, indicating that another process wants a lock on the same resource and that the two modes are incompatible.
9. LCK on node 1 goes through the lock element structure in the SGA and searches all the cache buffers covered by the same resource to see whether any of them are dirty.
10. LCK on node 1 messages its local DBWR to write to disk all the dirty cache buffers in the chain searched in step 9.
11. DBWR on node 1 writes the dirty buffers to disk.
12. DBWR on node 1 notifies LCK that all dirty buffers have been written to disk.
13. LCK on node 1 now calls the local DLM to downgrade the lock from EXCLUSIVE mode to SHARED mode. While downgrading, the current SCN of node 1 is written in the value block of the resource.
14. The DLM performs the downgrade, leaving node 1 holding the lock in SHARED mode, and grants the SHARED lock requested by node 2.
15. The DLM sends an Acquisition AST signal to LCK on node 2 indicating that it has acquired the lock on the resource in SHARED mode.
16. LCK on node 2 updates the lock element structures in its local SGA.
17. LCK on node 2 notifies the local server process waiting on that DLM resource lock that it can now read the block from disk.
18. The server process reads the block from disk.
Figure 2
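The down-converts performed in steps 13 and 14 are visible on the holding instance as X-to-S (and, when an exclusive request forces a write, X-to-NULL) conversions in V$LOCK_ACTIVITY. The query below is a sketch assuming the standard Oracle8/8i columns of that view.

    -- Sketch: PCM lock conversion counts on the instance that held
    -- the blocks; X -> S down-converts correspond to pings for reads.
    SELECT from_val, to_val, action_val, counter
      FROM v$lock_activity
     ORDER BY counter DESC;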
Block Cleanouts and Consistent Reads
Data blocks, after being read from disk, may require rollback segment (undo) information to be applied to them:
- for performing block cleanouts, or
- for backing out uncommitted changes that the query is not supposed to see based on its start time (read consistency).
The first case is made more efficient by the init.ora parameter DELAYED_LOGGING_BLOCK_CLEANOUTS, which is TRUE by default. With this setting, the block cleanout (marking the changed block with the commit time) is done as part of the current block change, so the redo records for the block change and the cleanout are piggybacked together. This is also a 'fast commit' method, since no redo records are generated for cleanout purposes the next time the block comes into the SGA during a DML execution. The ping, down-convert and extra DLM processing on the transaction (rollback segment) header block of the instance that made the previous changes are also eliminated. Furthermore, only minimal work is carried out on the current dirty block as part of the cleanout process (changing the cleanout SCN), so as to eliminate the extra work mentioned above. Hence, this feature is sometimes also called 'no-logging block cleanout'. Most blocks are cleaned out in this way, with the exception of blocks changed by long-running transactions for consistent reads.
Thus, this feature is most useful for short OLTP transactions, which are committed as soon as they are done. For the long-running transactions typical of DSS environments, dirty blocks will age out of the SGA without being cleaned out, since the transaction will not yet have committed.
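The current setting can be checked from V$PARAMETER (or with SHOW PARAMETER in SQL*Plus). This is only a sketch; the parameter name follows the text above and may be spelled differently, or be absent, in other releases.

    -- Sketch: confirm that delayed logging block cleanout is enabled
    -- (the parameter defaults to TRUE, as noted above).
    SELECT name, value
      FROM v$parameter
     WHERE name = 'delayed_logging_block_cleanouts';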
Doing a consistent read may require reading the rollback segments of remote instances to obtain the information needed for read consistency. The blocks required may be just the segment header containing the transaction table, or the undo blocks themselves. Such requests for a remote undo segment header and/or undo blocks cause pings, because the instance owning the rollback segment is forced to downgrade its locks and, if the blocks are dirty, possibly write them to disk so that the requesting instance can read them. This WRITER/READER contention causes not only the needed data blocks but also the related rollback segment and rollback segment header blocks to be pinged. This has been optimized in Oracle8i via Cache Fusion.
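Before Cache Fusion, this contention shows up in V$SYSSTAT as cross-instance and remote-undo write statistics. Exact statistic names vary between releases, so the LIKE patterns below are only a sketch.

    -- Sketch: ping-related write statistics on a pre-8i OPS instance.
    SELECT name, value
      FROM v$sysstat
     WHERE name LIKE '%cross instance%'
        OR name LIKE '%remote instance undo%';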
Case 2(b): SELECT with Cache Fusion (Oracle8i onwards)
In the example below (refer to figure 3), assume Node 1 has a resource locked in EXCLUSIVE mode. This resource covers several database blocks. Node 2 would like to acquire the same resource with a lock in SHARED mode to service a Select statement issued on that node which requires one or more of the blocks covered by the DLM lock.
1. The foreground server process checks the lock status maintained in the SGA to see whether the local instance already holds the resource covering the block it needs in SHARED mode. If so, the server process updates the list of blocks present in the buffer cache covered by that lock (the lock element buffer chain) and proceeds to read the block. If not, it continues with step 2.
2. The foreground server process messages LCK on node 2 to acquire the lock on the resource in SHARED mode.
3. LCK on node 2 calls the local DLM to acquire the lock in SHARED mode.
4. The DLM on node 2 creates DLM structures in its local memory area.
5. The DLM determines the Directory Tree Node, which manages the resource. The Directory Tree Node may be any node in the cluster.
6. The Directory Tree Node notifies the DLM on node 2 that node 1 is currently mastering the resource.
7. The DLM on node 2 requests a SHARED lock on the resource from the DLM on node 1.
8. Since node 1 currently holds the lock in EXCLUSIVE mode, the LMD process on node 1 instructs the BSP process to make a CR (consistent read) copy of the block and send it to node 2.
9. BSP on node 1 makes the CR copy of the needed block and sends it to node 2 (the requesting node).
10. BSP on node 1 also informs the DLM that sending a CR copy of the needed block to node 2 has satisfied the SHARED lock request from node 2 and that a lock downgrade is not needed.
Figure 3
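The effect of Cache Fusion can be seen in the 'global cache' statistics in V$SYSSTAT, for example 'global cache cr blocks received' on the requesting instance and 'global cache cr blocks served' on the instance running BSP. The query below is a sketch that simply lists whatever global cache statistics the 8i release in use exposes.

    -- Sketch: Cache Fusion (CR block shipping) activity per instance.
    SELECT name, value
      FROM v$sysstat
     WHERE name LIKE 'global cache%'
     ORDER BY name;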