When you are executing SQL statements to find which one is the best, you need to take into consideration the factors that can skew the accuracy of the test results. These factors include caching the data, indexes, and SQL statement; other activities on the CPU; and network traffic. Quest SQL Optimizer provides option settings that allow you to minimize these factors so that you can find the SQL statement that is best for your database environment. This tip discusses minimizing the effect of caching in order to have the most accurate run time comparisons. The next two tips will discuss how to minimize the effect of other activities on the CPU and network traffic.
When you are testing the SQL statements, it is best to compare how long it takes them to execute on the CPU. The elapsed run time in Quest SQL Optimizer is calculated from the time the SQL statement starts to run on the CPU until the time it finishes. Consequently, it is best to try to eliminate other factors that may add to the execution time.
Caching the Data, Indexes, and SQL statements
Caching the data, the indexes, and the SQL statements can affect the comparison times if one SQL statement caches and others do not. The following options are available for you to retrieve the most accurate run time comparisons.
Option 1: Run all SQL twice if original SQL runs faster than (seconds)
The Run all SQL twice if original SQL runs faster than (seconds) option gives the best overall results. It executes the original SQL and the SQL alternatives twice to make sure that the data, indexes, and SQL statements are cached only if the original SQL statement runs faster than the time you specify. Caching is a larger percentage of the overall run time for SQL statements that run quickly. This option is very important when you compare a SQL statement that uses caching and runs quickly to a SQL statement that does not use caching.
When it executes the SQL statements twice, it uses the second run time. If the original SQL statement runs longer that the time you specify, then all SQL statements are executed once, which is preferred for longer running SQL statements. This is the default setting, and it is really a combination of the Run all SQL twice using the second run time and all others once and the Run all SQL once options.
Option 2: Run original SQL twice using second run time and all others once
The first time you access data from a table, the data is cached into memory from the disk. This process takes a few moments. The next time you access that data, it is already in memory so accessing the data is faster for the following SQL statements. Therefore, the first SQL statement may have the additional time it takes to cache the data included in the run time whereas the following SQL statements do not have that additional time included in their run times. So to have a comparable test, the Run original SQL twice using second run time and all others once runs the original SQL twice. The time from the first execution of the original SQL is ignored and the time from the second run is compared to the time from the other statements (which are all run once). With this option, all SQL statements are executed with the data cached.
Option 3: Run all SQL twice using the second run time
For fast running SQL statements, the Run all SQL twice using the second run time option is a good choice as it executes all SQL statements twice. This enables you to eliminate two additional factors that can affect the accuracy of the comparison run times: caching, parsing, and optimizing the SQL statement and caching additional indexes used by the alternative SQL statements. If a SQL statement has been recently executed, then the SQL information for that statement is likely to be resident in the cache and the statement should execute a little faster because the SQL statement does not need to be parsed and optimized. Also, if some of the SQL statements use different indexes, one index may be resident in the cache and another may not. The additional time for caching an index will be included in the run time.
This option eliminates time variations caused by additional overheads since it runs all SQL statements twice. You get a more accurate comparison of the run time of each alternative SQL statement because you are comparing the actual time it takes to execute on the CPU. This setting is especially suitable for fast running SQL statements, such as SQL execution that takes less than 5 seconds.
Option 4: Run all SQL once
The Run all SQL once is the best selection for long running SQL. There is no need to run statements twice because the effect from additional overhead is relatively insignificant compared with the actual SQL execution time.
Start the discussion at forums.toadworld.com