source: http://www.akadia.com/services/ora_parallel_processing.html
Parallel Features
The full list of Oracle parallel execution features currently includes the following
Oracle can parallelize operations that involve processing an entire table or an entire
partition. These operations include:
-
SQL queries requiring at least one full table scan or queries involving an index
range scan spanning multiple partitions.
-
Operations such as creating or rebuilding an index or rebuilding one or more
partitions of an index.
-
Partition operations such as moving or splitting partitions
-
CREATE TABLE AS SELECT operations, if the SELECT involves
a full table or partition scan.
-
INSERT INTO . . . SELECT operations, if the SELECT involves a full table or
partition scan.
-
Update and delete operations on partitioned tables
Parallel Query
Parallel query is the most commonly used of Oracle's parallel execution features. It
was the first parallel execution feature to be developed by Oracle and was introduced
in Oracle Release 7.1 as the Oracle Parallel Query Option (PQO). Parallel execution
can significantly reduce the elapsed time for large queries, but it doesn't apply to
every query.
To parallelize a SELECT statement, the following conditions must be met:
-
If the execution involves a full table scan, the statement must contain a
PARALLEL hint specifying the corresponding table, or the corresponding table must
have a parallel declaration in its definition.
-
If the execution involves an index range scan spanning multiple partitions, the
statement must contain a PARALLEL_INDEX hint specifying the corresponding index,
or the corresponding index must have a parallel declaration in its definition.
The following two sections explain how the degree of parallelism is chosen for a
SELECT statement and discuss restrictions on the use of the parallel query feature.
Setting the Degree of Parallelism
Once Oracle decides to execute a SELECT statement in parallel, the degree of
parallelism is determined by following precedence rules:
You can use the PARALLEL and PARALLEL_INDEX hints to specify the degree
of parallelism for a SELECT statement. You can use the NOPARALLEL and
NOPARALLEL_INDEX hints to ensure that parallel execution is not performed.
Example
alter table emp parallel (degree 4);
select degree from user_tables where table_name = 'EMP';
select count(*) from emp;
alter table emp noparallel;
SELECT /*+ PARALLEL(emp,4) */ COUNT(*)
FROM emp;
Parallel DML
Data Manipulation Language (DML) operations such as INSERT, UPDATE, and DELETE can be
parallelized by Oracle. Parallel execution can speed up large DML operations and is
particularly advantageous in data warehousing environments where it's necessary to
maintain large summary or historical tables. In OLTP systems, parallel DML sometimes
can be used to improve the performance of long-running batch jobs.
When you issue a DML statement such as an INSERT, UPDATE, or DELETE, Oracle applies a
set of rules to determine whether that statement can be parallelized. For UPDATE and
DELETE statements, the rules are identical. INSERT statements, however, have their
own set of rules.
Rules for UPDATE and DELETE statements
Rules for INSERT statements
Examples
alter session enable parallel dml;
insert /*+ parallel (emp_big,4,1) */
into emp_big select * from emp;
commit;
alter session disable parallel dml;
Parallel DDL
Parallel DDL works for both tables and indexes, whether partitioned or
nonpartitioned.
For nonpartitioned tables and indexes, only the following types of DDL
statements can be parallelized:
CREATE TABLE...AS SELECT
CREATE INDEX
ALTER INDEX...REBUILD
If you're working with partitioned tables and indexes, the scope of Oracle's
parallel DDL support broadens. The following statements can be parallelized for
partitioned tables and indexes:
CREATE TABLE...AS SELECT
ALTER TABLE...MOVE PARTITION
ALTER TABLE...SPLIT PARTITION
CREATE INDEX
ALTER INDEX...REBUILD PARTITION
ALTER INDEX...SPLIT PARTITION
Not all tables allow these operations to be executed in parallel. Tables with object
columns or LOB columns don't allow parallel DDL.
Example
create table big_emp parallel (degree 4)
as select * from emp;
CREATE INDEX emp_ix ON emp (emp_id)
TABLESPACE ind
STORAGE (INITIAL 1M NEXT 1M PCTINCREASE 0 MAXEXTENTS 20)
PARALLEL (DEGREE 4);
Parallel Data Loading
Oracle's SQL*Loader utility loads data into Oracle tables from external files. With
some restrictions, SQL*Loader supports the loading of data in parallel. If you have a
large amount of data to load, SQL*Loader's parallel support can dramatically reduce
the elapsed time needed to perform that load.
SQL*Loader supports parallel loading by allowing you to initiate multiple concurrent
direct path load sessions that all load data into the same table or into the same
partition of a partitioned table. Unlike the case when you execute a SQL statement in
parallel, the task of dividing up the work falls on your shoulders. Follow these
steps to use parallel data loading:
-
Create multiple input datafiles.
-
Create a SQL*Loader control file for each input datafile.
-
Initiate multiple SQL*Loader sessions, one for each control file
and datafile pair.
Example
SQLLOAD scott/tiger CONTROL=con1.ctl DIRECT=TRUE PARALLEL=TRUE
SQLLOAD scott/tiger CONTROL=con2.ctl DIRECT=TRUE PARALLEL=TRUE
SQLLOAD scott/tiger CONTROL=con3.ctl DIRECT=TRUE PARALLEL=TRUE
SQLLOAD scott/tiger CONTROL=con4.ctl DIRECT=TRUE PARALLEL=TRUE
Note that the commands here should be executed from four different operating system
sessions. The intent is to get four SQL*Loader sessions going at once, not to run
four sessions one at a time. For example, if you are using the Unix operating system,
you might open four command-prompt windows and execute one SQL*Loader command in each
window.
Another important thing to note here is that you need to use the direct path in order
to perform a load in parallel, as explained in the next section. This is achieved by
the command-line argument DIRECT=TRUE. Parallel loads are not possible using the
conventional path option.
Parallel Recovery
Parallel recovery can speed up both instance recovery and media recovery. In parallel
recovery, multiple parallel slave processes are used to perform recovery operations.
The SMON background process reads the redo log files, and the parallel slave
processes apply the changes to the datafiles.
In a serial recovery scenario, the SMON background process both reads the redo log
files and applies the changes to the datafiles. This may take a considerably long
time when multiple datafiles need to be recovered. However, when parallel recovery is
being used, the SMON process is responsible only for reading the redo log files. The
changes are applied to the datafiles by multiple parallel slave processes,
thereby reducing the recovery time.
Recovery requires that the changes be applied to the datafiles in exactly the same
order in which they occurred. This is achieved by single-threading the read phase of
the recovery process by the SMON process. SMON reads the redo log files and
serializes the changes before dispatching them to the parallel slave processes. The
parallel slave processes then apply those changes to the datafiles in the proper
order. Therefore, the reading of the redo log files is performed serially even during
a parallel recovery operation.
Specifying the RECOVERY_PARALLELISM Parameter
The RECOVERY_PARALLELISM initialization parameter controls the degree of parallelism
to use for a recovery. You can override that setting for a specific situation
by using the RECOVER command's PARALLEL clause.
A value of or 1 indicates serial recovery, no parallelism will be used. The
RECOVERY_PARALLELISM parameter setting cannot exceed the PARALLEL_MAX_SERVERS
setting.
Example
RECOVER TABLESPACE tab PARALLEL (DEGREE 4);
RECOVER DATABASE PARALLEL (DEGREE DEFAULT);
Parallel Replication
Oracle provides replication mechanisms allowing you to maintain copies of database
objects in multiple databases. Changes are propagated among these databases over
database links. The SNP (snapshot) background processes perform the replication
process. For large volumes of replicated data, parallel propagation can be used to
enhance throughput.
With parallel propagation, Oracle enlists multiple parallel slave processes to
propagate replicated transactions using multiple parallel streams. Oracle orders the
dependent transactions properly based on the System Change Number (SCN). During
parallel propagation, you can see multiple connections to the destination database.
You enable parallel replication propagation at the database link level. A
database link is created for a particular destination database. When you enable
parallel propagation for a database link, Oracle uses multiple parallel slave
processes to replicate to the corresponding destination.
Enable Parallel Replication Propagation
To enable parallel replication propagation from the SQL*Plus command line, you need
to use the Oracle built-in package DBMS_DEFER_SYS. Execute the
DBMS_DEFER_SYS.SCHEDULE_PUSH procedure for the destination database link, and pass
the desired degree of parallelism as the value for the parallelism argument.
Example for SQL*Plus
EXECUTE DBMS_DEFER_SYS.SCHEDULE_PUSH (-
DESTINATION => 'por1.world', -
INTERVAL => 'SYSDATE+1/24', -
NEXT_DATE => 'SYSDATE+1/24', -
PARALLELISM => 6);
This example sets the degree of parallelism to 6 for propagating to the "por1.world"
destination database.
How Parallel Execution Works
Oracle divides the task of executing a SQL statement into multiple smaller units, each
of which is executed by a separate process. When parallel execution is used, the
user's shadow process takes on the role of the parallel coordinator. The parallel
coordinator is also referred to as parallel execution coordinator or query
coordinator.
The parallel coordinator does the following:
-
Dynamically divides the work into smaller units that can be parallelized.
-
Acquires a sufficient number of parallel processes to execute the individual
smaller units. These parallel processes are called parallel slave processes.
They also are sometimes referred to as parallel execution server processes,
parallel server processes, parallel query slaves, or simply slave processes. The
most common of the terms, parallel slave processes and slave processes, are used
throughout this book.
-
Assigns each unit of work to a slave process.
-
Collects and combines the results from the slave processes, and returns
those
results to the user process.
-
Releases the slave processes after the work is done.
Oracle maintains a pool of parallel slave processes for each instance. The parallel
coordinator for a SQL statement assigns parallel tasks to slave processes from this
pool. These parallel slave processes remain assigned to a task until its execution is
complete. After that, these processes return to the pool and can be assigned tasks from
some other parallel operation. A parallel slave process serves only one SQL statement
at a time.
The following parameters control the number of parallel slave processes in the pool:
Specifies the minimum number of parallel slave processes for an instance. When an
instance starts up, it creates the specified number of parallel slave processes. The
default value for this parameter is 0, meaning that no slave processes would be
created at startup.
Specifies the maximum number of parallel slave processes that an instance is allowed
to have at one time. The default value for PARALLEL_MAX_SERVERS is platform-specific.
It takes time and resources to create parallel slave processes. Since parallel slave
processes can serve only one statement at a time, you should set PARALLEL_MIN_SERVERS
to a relatively high value if you need to run lots of parallel statements concurrently.
That way, performance won't suffer from the need to constantly create slave processes.
You also need to consider how to set PARALLEL_MAX_SERVERS. Each parallel slave process
consumes memory. Setting PARALLEL_MAX_SERVERS too high may lead to memory shortages
during peak usage times. On the other hand, if PARALLEL_MAX_SERVERS is set too low,
some operations may not get a sufficient number of parallel slave processes.
The number of parallel slave processes associated with an operation is called its
degree of parallelism . Don't confuse this term with the DEGREE keyword. They aren't
exactly the same thing. In Oracle, the degree of parallelism consists of two
components, the number of instances to use and the number of slave processes to use on
each instance. In Oracle's SQL syntax, the keywords INSTANCES and DEGREE are always
used to specify values for these two components as follows:
INSTANCES applies only to the Oracle Parallel Server configuration. Unless you
are using OPS, the value of INSTANCES should be set to 1; any other value is
meaningless.
The degree of parallelism used for a SQL statement can be specified at three different
levels:
Using hints or the PARALLEL clause
Found in the definition of the table, index, or other object
Using default values for the instance
Oracle determines the degree of parallelism to use for a SQL statement by checking each
item in this list in the order shown. Oracle first checks for a degree of parallelism
specification at the statement level. If it can't find one, it then checks the table or
index definition. If the table or index definition does not explicitly specify values
for DEGREE and INSTANCES, Oracle uses the default values established for the instance.
You can specify the degree of parallelism at the statement level by using hints or by
using a PARALLEL clause. PARALLEL and PARALLEL_INDEX hints are used to specify the
degree of parallelism used for queries and DML statements. However, DDL statements that
support parallel execution provide an explicit PARALLEL clause in their syntax.
SELECT /*+ PARALLEL(orders,4,1) */ COUNT(*)
FROM orders;
You can specify the degree of parallelism to use for a table or an index when you
create it. You do that by using the PARALLEL clause of the CREATE TABLE and CREATE
INDEX statements.
You also can specify a PARALLEL clause when you alter a table or an index.
ALTER TABLE order_items PARALLEL (DEGREE 4);
When you specify DEGREE and INSTANCES values at the table or index level, those values
are used for all SQL statements involving the table or index unless overridden by a
hint.
Each instance has associated with it a set of default values for DEGREE and INSTANCES.
The default DEGREE value is either the number of CPUs available or the number of disks
upon which a table or index is stored, whichever is less.
Oracle will use the instance-level defaults whenever the keyword DEFAULT is used
in a hint or in a table or index definition. Oracle also will use the instance-level
defaults when there are no hints and when no degree of parallelism has been specified
at the table or index level.
|