The most likely problems that you will experience with replication are those with job queues. This topic addresses the following areas associated with running replication jobs:
- Detecting a Problem with a Job
- SNP Processes
- Job Queues
- Troubleshooting Job Execution Problems
- Viewing Job Queue Information
Detecting a Problem with a Job
To detect if a replication job has not executed as expected, check the following in the DBA_JOBS view:
- Check the NEXT_DATE value to ensure that the job was properly scheduled for execution.
- Check the FAILURES and BROKEN values.
If the job failed to execute, check the alert log and trace files for error information and fix the error.
If the job was not broken, it will ultimately be re-executed.
If the job was broken, or if you want to force immediate re-execution of the job, call DBMS_JOB.RUN after fixing the job.
If the job has never executed, there may be a problem with the availability of background processes. Check the initialization parameter JOB_QUEUE_PROCESSES to determine the maximum number of background processes available and JOB_QUEUE_INTERVAL to determine how frequently each background processes wakes up. The DBA_JOBS_RUNNING view describes what jobs these processes are currently running (you may have a problem with a runaway job), and the alert log and trace file can provide you with additional information about potential problems with the background process.
SNP Processes
SNP Background Processes
SNP background processes are used to execute the job queues. The SNP processes periodically wake up and execute any queued jobs that are due to be run. You must have at least one SNP process running to execute your queued jobs in the background. If an SNP process fails for whatever reason, Oracle restarts it.
Multiple SNP Processes
An instance can have up to thirty-six SNP processes, named SNP0 to SNP9 and SNPA to SNPZ. If an instance has multiple SNP processes, the task of executing queued jobs can be shared across these processes, thus improving performance. Note, however, that each job is run at any point in time by only one process. A single job cannot be shared simultaneously by multiple SNP processes.
Starting Up SNP Processes
The job queue initialization parameters allow you to control the operation of the SNP background processes. Set these parameters in the initialization parameter file for an instance. They take effect the next time you start the instance. The init.ora parameters affected are:
- JOB_QUEUE_PROCESSES
- JOB_QUEUE_INTERVAL
Job Queues
You can schedule PL/SQL routines to be performed periodically using job queues. To schedule the job, you submit it to the job queue and specify the frequency at which the job is to be run. You can alter, disable, or delete jobs you have submitted.
Using Job Queues
To schedule and manage jobs in the job queue, use the procedures in the DBMS_JOB package. The DBMS_JOB package contains the following procedures:
Procedure | Description |
---|---|
SUBMIT | Submits a job to the job queue. |
REMOVE | Removes specified job from the job queue |
CHANGE | Alters a specified job. You can alter the job description, the time at which the job will be run, or the interval between executions of the job. |
WHAT | Alters the job description for a specified job |
NEXT_DATE | Alters the next execution time for a specified job |
INTERVAL | Alters the interval between executions for a specified job. |
BROKEN | Disables or enables job execution. If a job is marked as broken, Oracle does not attempt to execute it. |
RUN | Forces a specified job to run |
Submitting a Job to the Job Queue
Use the SUBMIT procedure in the DBMS_JOB package to submit a job to the job queue. The SUBMIT procedure returns the number of the job you submitted.
Submitting a new job to the job queue example
The job calls the procedure DBMS_DDL.ANALYZE_OBJECT to generate optimizer statistics for the table VALERIE.ACCOUNTS. The statistics are based on a sample of half the rows of the ACCOUNTS table. The job is run every 24 hours.
VARIABLE jobno number; BEGIN DBMS_JOB.SUBMIT(:jobno, 'dbms_ddl.analyze_object(''TABLE'', ''VALERIE'', ''ACCOUNTS'', ''ESTIMATE'', NULL, 50);', SYSDATE, 'SYSDATE + 1'); COMMIT; END; /
Owner of a Job
When a user submits a job to the job queue, Oracle specifies that user as the owner of the job. Only a job's owner can alter the job, force the job to run, or remove the job from the queue. Ideally, the owner of the job will be the replication administrator, however, if it is necessary that a user submit the job, the user must have the appropriate privileges at the remote site.
In a deferred transaction, note that, unless explicitly specified otherwise, the user who initiated a transaction will determine which database link is used. The owner can be specified for a deferred transaction by using the DEFER_SYS.EXECUTE parameter, execute_as_user.
Job Numbers
A queued job is identified by its job number. When you submit a job, its job number is automatically generated from the sequence SYS.JOBSEQ.
Once a job is assigned a job number, that number does not change. Even if the job is exported and imported, its job number remains the same.
Job Definitions
The job definition is the PL/SQL code specified in the WHAT parameter of the SUBMIT procedure.
job | IN | The number of the current job. |
next_date | IN/OUT | The date of the next execution of the job. |
broken | IN/OUT | Status of job, broken or not broken. The IN value is FALSE. |
Job Execution Interval
The INTERVAL date is evaluated prior to job commencing. If the job completes successfully, the date calculated from INTERVAL becomes the new NEXT_DATE. If the INTERVAL date function evaluates to NULL and the job completes successfully, the job is deleted from the queue.
If a job should be executed periodically at a set interval, use a date expression similar to 'SYSDATE + 7' in the INTERVAL parameter. For example, if you set the execution interval to 'SYSDATE + 7' on Monday, but for some reason (such as a network failure) the job is not executed until Thursday, 'SYSDATE + 7' now evaluates to every Thursday, not Monday.
How Jobs Execute
SNP background processes are used by Oracle to execute jobs. If you force a job to run using DBMS_JOB.RUN, the job is run by your user process with your assigned privileges.
Job Queue Locks
Job queue locks exists to ensure that a job is executed by one session at a time. When a job is being run, its session acquires a job queue (JQ) lock for that job.
In the query below, the identifier for the session holding the lock is 19. The ID1 lock identifier is always 0 for JQ locks. The ID2 lock identifier is the job number of the job the session is running.
SELECT sid, type, id1, id2 FROM v$lock WHERE type = 'JQ'; SID TY ID1 ID2 ---------- -- ---------- ---------- 19 JQ 0 11211
Database Links and Jobs
If you submit a job that uses a database link, the link must include a username and password. Anonymous database links will not succeed.
Removing a Job From the Job Queue
If you need to remove a job from the job queue, use the REMOVE procedure in the DBMS_JOB package. For example:
DBMS_JOB.REMOVE(11211);
Altering a Job
You can alter a job in the job queue, by using the procedures CHANGE, WHAT, NEXT_DATE, or INTERVAL in the DBMS_JOB package.
For example, to change the interval to every 6 hours on a job, run the following:
DBMS_JOB.INTERVAL( job => 11211, interval => 'SYSDATE + 0.25');
Broken Jobs
A job can be either broken or not broken. Oracle will not attempt to run broken jobs, but, you can force a broken job to run by calling the procedure DBMS_JOB.RUN.
There are two ways a job can break:
- Oracle has failed to successfully execute the job after 16 attempts.
- You have marked the job as broken, using the procedure DBMS_JOB.BROKEN.
Use the procedure BROKEN in the DBMS_JOB package to mark a job as broken or not broken. For example:
DBMS_JOB.BROKEN( job => 11211, broken => FALSE, next_date => NEXT_DAY(SYSDATE, 'MONDAY'));
When you run a job that has been marked as broken and the job completes successfully, Oracle relabels the job as not broken. Oracle also resets its count of the number of failed executions for the job.
Forcing a Job to Be Executed
Occasionally you may need to manually execute a job. To do so, use the procedure RUN in the DBMS_JOB package.
The following example runs job 11211 in your session and recomputes the next execution date:
DBMS_JOB.RUN( job => 11211);
Troubleshooting Job Execution Problems
Several factors can prevent the successful execution of queued jobs; for example:
- not having any SNP background processes to run the job
- a network or instance failure
- an exception when executing the job
Note: If a job fails, Oracle rolls back the job’s current transaction (if any such transaction exists).
Recording Failures
When a job fails, information about the failure is recorded in a trace file and the alert log. Oracle writes message number ORA-12012 and includes the job number of the failed job.
Job Failure and Execution Times
When a job returns an error while executing, Oracle tries to execute it again. The first attempt is after one minute, the second attempt after two minutes, the third after four minutes, and so on, with the interval doubling between each attempt. When the retry interval exceeds the execution interval, Oracle continues to retry the job at the normal execution interval. However, if the job fails 16 times, Oracle automatically marks the job as broken and no longer tries to execute it.
Thus, if you can correct the problem that is preventing a job from running before the job has failed 16 times, Oracle will eventually run that job again.
Running a Job That Oracle Has Marked as Broken
If a job fails 16 times, Oracle marks the job as broken. Once you have fixed this problem, you can run the job by either
- forcing the job to run by calling DBMS_JOB.RUN, or
- marking the job as not broken by calling DBMS_JOB.BROKEN and waiting for Oracle to execute the job.
If you force the job to run by calling the procedure DBMS_JOB.RUN, Oracle runs the job immediately. If the job succeeds, then Oracle labels the job as not broken and resets its count of the number of failed executions for the job.
If you reset a job's broken flag (by calling either RUN or BROKEN), job execution resumes according to the scheduled execution intervals set for the job.
Viewing Job Information
The following data dictionary views display information about jobs in the job queue:
- DBA_JOBS
- DBA_JOBS_RUNNING
The following SQL lists the job number, next execution time, failures, and broken status for each job submitted:
SELECT job, next_date, next_sec, failures, broken FROM user_jobs;
The following SQL lists the session identifier, job number, user who submitted the job, and the start times for all jobs that are currently running.
SELECT sid, r.job, log_user, r.this_date, r.this_sec FROM dba_jobs_running r, dba_jobs j WHERE r.job = j.job;
Start the discussion at forums.toadworld.com