Sunday, September 30, 2018

Parallel DDL and DML (create table as select)

Oracle Parallel DDL and DML Operations

Oracle Database provides powerful capabilities for parallelizing both Data Definition Language (DDL) and Data Manipulation Language (DML) operations. This can significantly reduce the execution time for large operations such as creating tables from existing data, or inserting large volumes of data.

Parallel DDL: CREATE TABLE AS SELECT

Parallel DDL allows the creation of database objects (like tables, indexes) to be performed by multiple parallel execution servers. This is particularly beneficial when creating a new table based on a SELECT statement from a large existing table.

Example: Creating a new table APP_SCHEMA2.APP_TABLE2 from APP_SCHEMA1.APP_TABLE1 in parallel.

CREATE TABLE APP_SCHEMA2.APP_TABLE2 PARALLEL 15 AS
SELECT /*+ PARALLEL(A,15) */ *
FROM APP_SCHEMA1.APP_TABLE1 A;

Explanation:

  • PARALLEL 15 (in CREATE TABLE clause): This specifies the default degree of parallelism (DOP) for the newly created table APP_SCHEMA2.APP_TABLE2. Any subsequent DML or DDL on this table might use this DOP if parallel execution is enabled.

  • /*+ PARALLEL(A,15) */ (hint in SELECT clause): This hint explicitly tells the optimizer to use 15 parallel execution servers for the SELECT operation on APP_SCHEMA1.APP_TABLE1. The CREATE TABLE AS SELECT operation will then leverage this parallelism to read data and populate the new table concurrently.

Benefits: Faster table creation, especially for very large tables, by distributing the work among multiple processes.

Parallel DML: INSERT Operations

Parallel DML enables INSERT, UPDATE, DELETE, and MERGE statements to be executed by multiple parallel execution servers. For INSERT statements, this is most effective with direct-path inserts.

Prerequisite: Enable parallel DML for the session.

ALTER SESSION ENABLE PARALLEL DML;

Example: Inserting data from APP_USER1.APP_TABLE1 into APP_USER2.APP_TABLE2 using parallel DML.

INSERT /*+ APPEND PARALLEL(A,60) */ INTO APP_USER2.APP_TABLE2 A
SELECT /*+ PARALLEL(B,60) */ *
FROM APP_USER1.APP_TABLE1 B;

Explanation:

  • ALTER SESSION ENABLE PARALLEL DML;: This statement is mandatory to enable parallel DML for the current session. Without it, the parallel hints in the DML statement will be ignored.

  • /*+ APPEND */: This hint instructs Oracle to perform a direct-path insert. Direct-path inserts write data directly to the datafiles, bypassing the buffer cache, which is significantly faster for bulk data loading. It also acquires an exclusive table lock during the operation.

  • /*+ PARALLEL(A,60) */ (hint in INSERT clause): This hint specifies that the INSERT operation on APP_USER2.APP_TABLE2 should use 60 parallel execution servers.

  • /*+ PARALLEL(B,60) */ (hint in SELECT clause): This hint specifies that the SELECT operation on APP_USER1.APP_TABLE1 should also use 60 parallel execution servers.

Benefits: Dramatically faster data insertion for large datasets by leveraging multiple CPUs and I/O channels.

Important Considerations

  • Resource Consumption: Parallel operations consume more CPU, memory, and I/O resources. Ensure your system has sufficient capacity.

  • Table Locking: Direct-path inserts (APPEND hint) acquire an exclusive lock on the target table, preventing other DML operations on that table until the insert completes.

  • Undo Generation: Direct-path inserts generate minimal undo, which is another reason for their speed.

  • Degree of Parallelism (DOP): The optimal DOP depends on your hardware (number of CPU cores, I/O bandwidth) and the nature of the operation. Experimentation is often required to find the best value.

  • Indexing: Indexes on the target table will be maintained during parallel DML, but this can impact performance. For very large inserts, it might be faster to drop indexes, perform the insert, and then rebuild the indexes.

  • Transaction Management: Parallel DML operations are part of a transaction and require a COMMIT or ROLLBACK.

By strategically using parallel DDL and DML, you can significantly improve the performance of large-scale data operations in your Oracle database.