7 Apr 2016

How to tune Insert statement?

Overview
The CREATE TABLE ... AS SELECT statement (CTAS) is a powerful tool for manipulating large sets of data. CTAS provides a mechanism for efficiently executing a SQL query and storing the results of that query in a new database table.
The INSERT /*+APPEND*/ ... AS SELECT statement offers the same capabilities with existing database tables.
APPEND into Tables
By using the APPEND hint, you ensure that Oracle always grabs "fresh" data blocks by raising the high-water-mark for the table. If you are doing parallel insert DML, the Append mode is the default and you don't need to specify an APPEND hint.
When you use the APPEND hint for INSERT, data is simply appended to a table above the HWM which has the effect of not creating UNDO. Existing free space in blocks is not used.
High Water Mark
The high water mark is the boundary between used and unused space in a segment. As requests for new free blocks that cannot be satisfied by existing free lists are received, the block to which the high water mark points becomes a used block, and the high water mark is advanced to the next block. In other words, the segment space to the left of the high water mark is used, and the space to the right of it is unused.
Suggest Way for fast INSERTs

a - Disable/drop indexes and constraints - It's far faster to rebuild indexes after the data load, all at-once. Also indexes will rebuild cleaner, and with less I/O if they reside in a tablespace with a large block size.

b - Manage segment header contention for parallel inserts - Make sure to define multiple freelist (or freelist groups) to remove contention for the table header. Multiple freelists add additional segment header blocks, removing the bottleneck.  You can also use Automatic Segment Space Management (bitmap freelists) to support parallel DML, but ASSM has some limitations.

c - Parallelize the load - You can invoke parallel DML (i.e. using the PARALLEL and APPEND hint) to have multiple inserts into the same table. For this INSERT optimization, make sure to define multiple freelists and use the SQL "APPEND" option.  Mark Bobak notes that if you submit parallel jobs to insert against the table at the same time, using the APPEND hint may cause serialization, removing the benefit of parallel jobstreams.


d - APPEND into tables - By using the APPEND hint, you ensure that Oracle always grabs "fresh" data blocks by raising the high-water-mark for the table. If you are doing parallel insert DML, the Append mode is the default and you don't need to specify an APPEND hint.  Mark Bobak notes "Also, if you're going w/ APPEND, consider putting the table into NOLOGGING mode, which will allow Oracle to avoid almost all redo logging."

insert /*+ append */ into customer values ('hello',';there');


e - Use a large blocksize - By defining large (i.e. 32k) blocksizes for the target table, you reduce I/O because more rows fit onto a block before a "block full" condition (as set by PCTFREE) unlinks the block from the freelist.

See benchmark test of blocksize and inserts here


See general benefits of multiple blocksizes here

f - Use NOLOGGING

  RAM disk - You can use high-speed solid-state disk (RAM-SAN) to make Oracle inserts run up to 300x faster than platter disk.

  1. Mark indexes unuasble
  1. Disable primary key
  1. Alter table nologging
  1. Do an insert /*+ append */ into table (select ..)
  1. Enable primary key
  1. Rebuild indexes nologging
Example
1.  First create a Big Table for this Test
create table bigtab
as
select rownum id, a.*
  from all_objects a
 where 1=0;
alter table bigtab nologging;
declare
    l_cnt number;
    l_rows number := 1000000;
begin
    insert /*+ append */
    into bigtab
    select rownum, a.*
      from all_objects a;
    l_cnt := sql%rowcount;
    commit;
    while (l_cnt < l_rows)
    loop
        insert /*+ APPEND */ into bigtab
        select rownum+l_cnt,
               OWNER, OBJECT_NAME, SUBOBJECT_NAME,
               OBJECT_ID, DATA_OBJECT_ID,
               OBJECT_TYPE, CREATED, LAST_DDL_TIME,
               TIMESTAMP, STATUS, TEMPORARY,
               GENERATED, SECONDARY
          from bigtab
         where rownum <= l_rows-l_cnt;
        l_cnt := l_cnt + sql%rowcount;
        commit;
    end loop;
end;
/
Now we have a Table with 1'000'000 Rows. Now delete some Rows, to force Oracle to refill this freespace using the FREELISTS in a normal INSERT. However in APPEND Mode the FREELISTS are not used and the freespace are not reoccupied.
DELETE FROM bigtab WHERE id between 1000 and 2500;
DELETE FROM bigtab WHERE id between 3500 and 6500;
DELETE FROM bigtab WHERE id between 15000 and 20000;
DELETE FROM bigtab WHERE id between 350000 and 370000;
COMMIT;

CREATE TABLE insert_test AS SELECT * FROM bigtab;
2.  Test with normal Insert
SQL> CREATE TABLE insert_test AS SELECT * FROM bigtab;
Table created.

SQL> INSERT INTO insert_test SELECT rownum, a.* FROM all_objects a;
49483 rows created.

Elapsed: 00:00:03.92
   <==================== !

SQL> DROP TABLE insert_test;
Table dropped.
3.  Test with APPEND Hint and NOLOGGING
SQL> CREATE TABLE insert_test AS SELECT * FROM bigtab;
Table created.

SQL> ALTER TABLE insert_test NOLOGGING;
Table altered.

SQL> INSERT /*+ append */ INTO insert_test SELECT rownum, a.* FROM all_objects a;
49483 rows created.

Elapsed: 00:00:02.54
    <==================== !
As you can see, only to insert about 50'000 Rows, the APPEND Insert is much faster, due the free space in the Oracle blocks are not refilled, the Rows are appended and the Highwater Mark is moved.

No comments:

Post a Comment