Sunday, September 21, 2008

Data Pump Parallel Operation

You can accelerate jobs significantly using more than one thread for the export, through the PARALLEL parameter. Each thread creates a separate dumpfile, so the parameter dumpfile should have as many entries as the degree of parallelism. Instead of entering each one explicitly, you can specify wildcard characters as filenames such as:

expdp bob/abc123 tables=CASES directory=DPDATA1
dumpfile=expCASES_%U.dmp parallel=4 job_name=Cases_Export

Note how the dumpfile parameter has a wild card %U, which indicates the files will be created as needed and the format will be expCASES_nn.dmp, where nn starts at 01 and goes up as needed.

In parallel mode, the status screen will show four worker processes. (In default mode, only one process will be visible.) All worker processes extract data simultaneously and show their progress on the status screen.

It's important to separate the I/O channels for access to the database files and the dumpfile directory filesystems. Otherwise, the overhead associated with maintaining the Data Pump jobs may outweigh the benefits of parallel threads and hence degrade performance. Parallelism will be in effect only if the number of tables is higher than the parallel value and the tables are big.

No comments: