23 Feb 2016

SQL Processing In Oracle

SQL Processing:-
To review, Oracle hard parsing reads in statistics, utilizes index information and creates an execution plan. Soft parsing already has the execution plan and doesn’t need to revisit the statistics and so on. The illustration below (taken from the Oracle Performance Tuning Guide) shows the steps the CBO (cost-based optimizer) takes to arrive at an execution plan.
Figure 2. How Oracle Arrives at an Explain Plan
This is the hard parse process.  At the end of this discussion, you will see why Oracle always looks to see if the SQL is already in the library cache first! (soft parse).
The CBO runs three to four SQL statements of its own during the hard parse process.  The SQL submitted by Oracle are called Dictionary SQL or Recursive SQL.  I’ll go with dictionary SQL for the remainder of this series.
These dictionary SQL statements are used to:
1.      Check the syntax of the table names and column names
2.      Pull in statistics
3.      Check permissions to all the objects involved
4.      Save the where clause predicates for future statistic runs
1.      The future statistics runs will produce histograms on these where clause items…another useful statistic that helps the CBO when there is skewed data…
The part in the dashed lines happens for both soft parsing and hard parsing.  The semantic checking, the permissions, and the lookup in the library cache.
The first thing in the hard parse is to do the query transformation part.  This is where the CBO can do subtle and not so subtle rewrites of your SQL.  Query transformation came to be in Oracle9 but it really has been around like forever because Oracle has always added functions to where clause items where the data types don’t match (to make them match…).  This can show up with the SQL not using available indexes because indexes are not used when there are functions on where clause items.
Oracle SQL is parsed before execution, and checked for syntax (and parts of the semantic check) before the SQL is loaded into the library cache.  As opposed to a soft parse (which does not require loading into the shared pool), a hard parse includes these steps:
  1. Loading into shared pool - The SQL source code is loaded into RAM for parsing. (the "hard" parse step)
  2. Syntax parse - Oracle parses the syntax to check for misspelled SQL keywords.
  3. Semantic parse - Oracle verifies all table & column names from the dictionary and checks to see if you are authorized to see the data.
  4. Query Transformation - Oracle will transform complex SQL into simpler, equivalent forms and replace aggregations with materialized views, as appropriate. In earlier releases of Oracle the query_rewrite=true parameter had to be set for materialized view rewriting. 
  5. Optimization - Oracle then creates an execution plan, based on your schema statistics (or maybe with statistics from dynamic sampling in 10g).  Oracle build the decision tree of costs during this period, choosing the path with the lowest perceived cost.
  6. Create executable - Oracle builds an executable file with native file calls to service the SQL query.
  7. Fetch rows - Oracle then executes the native calls to the data files to retrieve the rows and passes them back to the calling program.
Anytime a session issues SQL statement that does not already exist in the shared pool, then Oracle has to do a hard parse. Essentially performing all of the above steps.  If the statement already exists, then a soft parse occurs, skipping step 1. 
In a soft parse, Oracle must still perform a syntax parse and semantic check because it is possible that a DDL change altered one of the target tables or views since the SQL statement was originally executed.  In the case of DDL, all related SQL is marked as invalidated.
A hard parse is when your SQL must be re-loaded into the shared pool.  A hard parse is worse than a soft parse because of the overhead involved in shared pool RAM allocation and memory management.  Once loaded, the SQL must then be completely re-checked for syntax & semantics and an executable generated. 
Excessive hard parsing can occur when your shared_pool_size is too small (and reentrant SQL is paged out), or when you have non-reusable SQL statements without host variables.

The optimizer contains three main components, which are shown below

 A set of query blocks represents a parsed query, which is the input to the optimizer. The optimizer performs the following operations:
  1. Query transformer
The optimizer determines whether it is helpful to change the form of the query so that the optimizer can generate a better execution plan. The query transformer determines whether it is advantageous to rewrite the original SQL statement into a semantically equivalent SQL statement with a lower cost. When a viable alternative exists, the database calculates the cost of the alternatives separately and chooses the lowest-cost alternative.


 1.      Estimator
The optimizer estimates the cost of each plan based on statistics in the data dictionary.
The estimator uses three different measures to determine cost:
i. Selectivity
ii. Cardinality
iii. Cost
i. Selectivity: The percentage of rows in the row set that the query selects, with 0 meaning no rows and 1 meaning all rows. Selectivity is tied to a query predicate, such as WHERE last_name LIKE 'A%', or a combination of predicates. A predicate becomes more selective as the selectivity value approaches 0 and less selective (or more unselective) as the value approaches 1.Selectivity is an internal calculation that is not visible in the execution plans.
ii. Cardinality: The cardinality is the number of rows returned by each operation in an execution plan. This input, which is crucial to obtaining an optimal plan, is common to all cost functions. The estimator can derive cardinality from the table statistics collected by DBMS_STATS, or derive it after accounting for effects from predicates (filter, join, and so on), DISTINCT or GROUP BY operations, and so on. The Rows column in an execution plan shows the estimated cardinality.
iii. Cost: This measure represents units of work or resource used. The query optimizer uses disk I/O, CPU usage, and memory usage as units of work.


1.      Plan Generator
The optimizer compares the costs of plans and chooses the lowest-cost plan, known as the execution plan, to pass to the row source generator. It explores various plans for a query block by trying out different access paths, join methods, and join orders. Many plans are possible because of the various combinations that the database can use to produce the same result. The optimizer picks the plan with the lowest cost.


No comments:

Post a Comment