oracle 9i the complete reference phần 5 potx

Within the into table clause, you can use the recnum keyword to assign a record number

to each logical record as it is read from the datafile, and that value will be inserted into the

assigned column of the table. The constant keyword allows you to assign a constant value to a

column during the load. For character columns, enclose the constant value within single quotes.

If you use the sysdate keyword, the selected column will be populated with the current system

date and time.

CheckOutDate SYSDATE

If you use the sequence option, SQL*Loader will maintain a sequence of values during the

load. As records are processed, the sequence value will be increased by the increment you

specify. If the rows fail during insert (and are sent to the bad file), those sequence values will not

be reused. If you use the max keyword within the sequence option, the sequence values will use

the current maximum value of the column as the starting point for the sequence. The following

listing shows the use of the sequence option:

Seqnum_col SEQUENCE(MAX,1)

You can also specify the starting value and increment for a sequence to use when inserting.

The following example inserts values starting with a value of 100, incrementing by 2. If a row is

rejected during the insert, its sequence value is skipped.

Seqnum_col SEQUENCE(100,2)

If you store numbers in VARCHAR2 columns, avoid using the sequence option for those

columns. For example, if your table already contains the values 1 through 10 in a VARCHAR2

column, then the maximum value within that column is 9—the greatest character string. Using

that as the basis for a sequence option will cause SQL*Loader to attempt to insert a record using

10 as the newly created value—and that may conflict with the existing record.

SQL*Loader control files can support complex logic and business rules. For example, your

input data for a column holding monetary values may have an implied decimal; 9990 would be

inserted as 99.90. In SQL*Loader, you could insert this by performing the calculation during the

data load:

money_amount position (20:28) external decimal(9) ":tax_amount/100"

See the “SQL*Loader Case Studies” of the Oracle9i Utilities Guide for additional SQL*Loader

examples and sample control files.

Managing Data Loads

Loading large data volumes is a batch operation. Batch operations should not be performed

concurrently with the small transactions prevalent in many database applications. If you have

many concurrent users executing small transactions against a table, you should schedule your

batch operations against that table to occur at a time when no users are accessing the table.

Oracle maintains read consistency for users’ queries. If you execute the SQL*Loader job

against the table at the same time that other users are querying the table, Oracle will internally

402 Part II: SQL and SQL*Plus

ORACLE Series TIGHT / Oracle9i: The Complete Reference / Loney, Koch / 222521-1 / Chapter 21

Blind Folio 21:402

P:\010Comp\Oracle8\521-1\CD\Ventura\book.vp

Friday, July 19, 2002 4:13:02 PM

Color profile: Generic CMYK printer profile

Composite Default screen

maintain undo entries to enable those users to see their data as it existed when they first queried

the data. To minimize the amount of work Oracle must perform to maintain read consistency

(and to minimize the associated performance degradation caused by this overhead), schedule

your long-running data load jobs to be performed when few other actions are occurring in the

database. In particular, avoid contention with other accesses of the same table.

Design your data load processing to be easy to maintain and reuse. Establish guidelines for

the structure and format of the input datafiles. The more standardized the input data formats are,

the simpler it will be to reuse old control files for the data loads. For repeated scheduled loads

into the same table, your goal should be to reuse the same control file each time. Following

each load, you will need to review and move the log, bad, data, and discard files so they do not

accidentally get overwritten.

Within the control file, use comments to indicate any special processing functions being

performed. To create a comment within the control file, begin the line with two dashes, as

shown in the following example:

-- Limit the load to LA employees:

when Location='LA'

If you have properly commented your control file, you will increase the chance that it can be

reused during future loads. You will also simplify the maintenance of the data load process itself,

as described in the next section.

Repeating Data Loads

Data loads do not always work exactly as planned. Many variables are involved in a data load,

and not all of them will always be under your control. For example, the owner of the source data

may change its data formatting, invalidating part of your control file. Business rules may change,

forcing additional changes. Database structures and space availability may change, further affecting

your ability to load the data.

In an ideal case, a data load will either fully succeed or fully fail. However, in many cases,

a data load will partially succeed, making the recovery process more difficult. If some of the

records have been inserted into the table, then attempting to reinsert those records should result

in a primary key violation. If you are generating the primary key value during the insert (via the

sequence option), then those rows may not fail the second time—and will be inserted twice.

To determine where a load failed, use the log file. The log file will record the commit points

as well as the errors encountered. All of the rejected records should be in either the bad file or

the discard file. You can minimize the recovery effort by forcing the load to fail if many errors are

encountered. To force the load to abort before a large number of errors is encountered, use the

errors keyword of the SQLLDR command. You can also use the discardmax keyword to limit the

number of discarded records permitted before the load aborts.

If you set errors to 0, the first error will cause the load to fail. What if that load fails after

100 records have been inserted? You will have two options: identify and delete the inserted

records and reapply the whole load, or skip the successfully inserted records. You can use the

skip keyword of SQLLDR to skip the first 100 records during its load processing. The load will

then continue with record 101 (which, we hope, has been fixed prior to the reload attempt). If

you cannot identify the rows that have just been loaded into the table, you will need to use the

skip option during the restart process.

Chapter 21: Using SQL*Loader to Load Data 403

ORACLE Series TIGHT / Oracle9i: The Complete Reference / Loney, Koch / 222521-1 / Chapter 21

Blind Folio 21:403

P:\010Comp\Oracle8\521-1\CD\Ventura\book.vp

Friday, July 19, 2002 4:13:02 PM

Color profile: Generic CMYK printer profile

Composite Default screen

The proper settings for errors and discardmax depend on the load. If you have full control

over the data load process, and the data is properly “cleaned” before being extracted to a load

file, you may have very little tolerance for errors and discards. On the other hand, if you do not

have control over the source for the input datafile, you need to set errors and discardmax high

enough to allow the load to complete. After the load has completed, you need to review the log

file, correct the data in the bad file, and reload the data using the original bad file as the new

input file. If rows have been incorrectly discarded, you need to do an additional load using the

original discard file as the new input file.

After modifying the errant CategoryName value, you can rerun the BOOKSHELF table load

example using the original bookshelf.dat file. During the reload, you have two options when

using the original input datafile:

■ Skip the first row by specifying skip=1 in the SQLLDR command line.

■ Attempt to load both rows, whereby the first row fails because it has

already been loaded (and thus causes a primary key violation).

Alternatively, you can use the bad file as the new input datafile and not worry about errors

and skipped rows.

Tuning Data Loads

In addition to running the data load processes at off-peak hours, you can take other steps to

improve the load performance. The following steps all impact your overall database environment,

and must be coordinated with the database administrator. The tuning of a data load should not

be allowed to have a negative impact on the database or on the business processes it supports.

First, batch data loads may be timed to occur while the database is in NOARCHIVELOG

mode. While in NOARCHIVELOG mode, the database does not keep an archive of its online

redo log files prior to overwriting them. Eliminating the archiving process improves the

performance of transactions. Since the data is being loaded from a file, you can re-create the

loaded data at a later time by reloading the datafile rather than recovering it from an archived

redo log file.

However, there are significant potential issues with disabling NOARCHIVELOG mode. You

will not be able to perform a point-in-time recovery of the database unless archiving is enabled.

If there are non-batch transactions performed in the database, you will probably need to run

the database in ARCHIVELOG mode all the time, including during your loads. Furthermore,

switching between ARCHIVELOG and NOARCHIVELOG modes requires you to shut down the

instance. If you switch the instance to NOARCHIVELOG mode, perform your data load, and

then switch the instance back to ARCHIVELOG mode, you should perform a backup of the

database (see Chapter 40) immediately following the restart.

Instead of running the entire database in NOARCHIVELOG mode, you can disable archiving

for your data load process by using the unrecoverable keyword within SQL*Loader. The

unrecoverable option disables the writing of redo log entries for the transactions within the data

load. You should only use this option if you will be able to re-create the transactions from the

input files during a recovery. If you follow this strategy, you must have adequate space to store

old input files in case they are needed for future recoveries. The unrecoverable option is only

available for Direct Path loads, as described in the next section.

404 Part II: SQL and SQL*Plus

ORACLE Series TIGHT / Oracle9i: The Complete Reference / Loney, Koch / 222521-1 / Chapter 21

Blind Folio 21:404

P:\010Comp\Oracle8\521-1\CD\Ventura\book.vp

Friday, July 19, 2002 4:13:03 PM

Color profile: Generic CMYK printer profile

Composite Default screen

Rather than control the redo log activity at the load process level, you can control it at the

table or partition level. If you define an object as nologging, then block-level inserts performed

by SQL*Loader Direct Path loading and the insert /*+ APPEND */ command will not generate

redo log entries.

If your operating environment has multiple processors, you can take advantage of the CPUs

by parallelizing the data load. The parallel option of SQLLDR, as described in the next section,

uses multiple concurrent data load processes to reduce the overall time required to load the data.

In addition to these approaches, you should work with your database administrator to make

sure the database environment and structures are properly tuned for data loads. Tuning efforts

should include the following:

■ Preallocate space for the table, to minimize dynamic extensions during the loads.

■ Allocate sufficient memory resources to the shared memory areas, including the log

buffer area.

■ Streamline the data writing process by creating multiple database writer (DBWR)

processes for the database.

■ Remove any unnecessary triggers during the data loads. If possible, disable or remove

the triggers prior to the load, and perform the trigger operations on the loaded data

manually after it has been loaded.

■ Remove or disable any unnecessary constraints on the table. You can use SQL*Loader

to dynamically disable and re-enable constraints.

■ Remove any indexes on the tables. If the data has been properly cleaned prior to the

data load, then uniqueness checks and foreign key validations will not be necessary

during the loads. Dropping indexes prior to data loads significantly improves

performance.

If you leave indexes on during a data load, Oracle must manage and rebalance the index

with each inserted record. The larger your data load is, the more work Oracle will have to do to

manage the associated indexes. If you can, you should consider dropping the indexes prior to the

load and then re-creating them after the load completes. The only time indexes do not cause a

penalty for data load performance is during a Direct Path load, as described in the next section.

Direct Path Loading

SQL*Loader, when inserting records, generates a large number of insert statements. To avoid the

overhead associated with using a large number of inserts, you may use the Direct Path option in

SQL*Loader. The Direct Path option creates preformatted data blocks and inserts those blocks

into the table. As a result, the performance of your load can dramatically improve. To use the

Direct Path option, you must not be performing any functions on the values being read from the

input file.

Any indexes on the table being loaded will be placed into a temporary DIRECT LOAD state

(you can query the index status from USER_INDEXES). Oracle will move the old index values to

a temporary index it creates and manages. Once the load has completed, the old index values

will be merged with the new values to create the new index, and Oracle will drop the temporary

index it created. When the index is once again valid, its status will change to VALID. To minimize

Chapter 21: Using SQL*Loader to Load Data 405

ORACLE Series TIGHT / Oracle9i: The Complete Reference / Loney, Koch / 222521-1 / Chapter 21

Blind Folio 21:405

P:\010Comp\Oracle8\521-1\CD\Ventura\book.vp

Friday, July 19, 2002 4:13:03 PM

Color profile: Generic CMYK printer profile

Composite Default screen

the amount of space necessary for the temporary index, presort the data by the indexed columns.

The name of the index for which the data is presorted should be specified via a sorted indexes

clause in the control file.

To use the direct path option, specify

DIRECT=TRUE

as a keyword on the SQLLDR command line or include this option in the control file.

If you use the Direct Path option, you can use the unrecoverable keyword to improve your

data load performance. This instructs Oracle not to generate redo log entries for the load. If you

need to recover the database at a later point, you will need to re-execute the data load in order

to recover the table’s data. All conventional path loads are recoverable, and all Direct Path loads

are recoverable by default.

Direct Path loads are faster than conventional loads, and unrecoverable Direct Path loads are

faster still. Since performing unrecoverable loads impacts your recovery operations, you need to

weigh the costs of that impact against the performance benefit you will realize. If your hardware

environment has additional resources available during the load, you can use the parallel Direct

Path load option to divide the data load work among multiple processes. The parallel Direct Path

operations may complete the load job faster than a single Direct Path load.

Instead of using the parallel option, you could partition the table being loaded (see Chapter 18).

Since SQL*Loader allows you to load a single partition, you could execute multiple concurrent

SQL*Loader jobs to populate the separate partitions of a partitioned table. This method requires

more database administration work (to configure and manage the partitions), but it gives you

more flexibility in the parallelization and scheduling of the load jobs.

As of Oracle9i, you can take advantage of multithreaded loading functionality for Direct

Path loads to convert column arrays to stream buffers and perform stream buffer loading in

parallel. Use the streamsize parameter and multithreading flag to enable this feature.

Direct Path loading may impact the space required for the table’s data. Since Direct Path

loading inserts blocks of data, it does not follow the usual methods for allocating space within

a table. The blocks are inserted at the end of the table, after its high-water mark, which is the

highest block into which the table’s data has ever been written. If you insert 100 blocks worth

of data into a table and then delete all of the rows, the high-water mark for the table will still be

set at 100. If you then perform a conventional SQL*Loader data load, the rows will be inserted

into the already allocated blocks. If you instead perform a Direct Path load, Oracle will insert

new blocks of data following block 100, potentially increasing the space allocation for the table.

The only way to lower the high-water mark for a table is to truncate it (which deletes all rows

and cannot be rolled back) or to drop and re-create it. You should work with your database

administrator to identify space issues prior to starting your load.

Additional Oracle9i Enhancements

In addition to features noted earlier in this chapter, SQL*Loader features support for Unicode and

expanded datatypes. As of Oracle9i, SQL*Loader can load integer and zoned/packed decimal

datatypes across platforms with different byte ordering and accept EBCDIC-based zoned or

packed decimal data encoded in IBM format. SQL*Loader also offers support for loading XML

columns, loading object types with subtypes (see Chapter 30), and Unicode (UTF16 character set).

406 Part II: SQL and SQL*Plus

ORACLE Series TIGHT / Oracle9i: The Complete Reference / Loney, Koch / 222521-1 / Chapter 21

Blind Folio 21:406

P:\010Comp\Oracle8\521-1\CD\Ventura\book.vp

Friday, July 19, 2002 4:13:03 PM

Color profile: Generic CMYK printer profile

Composite Default screen

SQL*Loader also provides native support for the new Oracle9i date, time, and interval-related

datatypes (see Chapter 9).

If a SQL*Loader job fails, you may be able to resume it where it failed using the resumable,

resumable_name, and resumable_timeout options. For example, if the segment to which the

loader job was writing could not extend, you can disable the load job, fix the space allocation

problem, and resume the job. Your ability to perform these actions depends on the configuration

of the database; work with your DBA to make sure the resumable features are enabled and

adequate undo history is maintained for your purposes.

As of Oracle9i, you can access external files as if they are tables inside the database. This

“external table” feature, described in Chapter 25, allows you to potentially avoid loading large

volumes of data into the database. The syntax for external table definitions very closely resembles

that of the SQL*Loader control file. Although they are limited in some significant ways (you cannot

perform DML on external tables, for example), you should consider external tables as alternatives

to data loads. See Chapter 25 for implementation details.

Chapter 21: Using SQL*Loader to Load Data 407

ORACLE Series TIGHT / Oracle9i: The Complete Reference / Loney, Koch / 222521-1 / Chapter 21

Blind Folio 21:407

P:\010Comp\Oracle8\521-1\CD\Ventura\book.vp

Friday, July 19, 2002 4:13:04 PM

Color profile: Generic CMYK printer profile

Composite Default screen

ORACLE Series TIGHT / Oracle9i: The Complete Reference / Loney, Koch / 222521-1 / Chapter 21

Blind Folio 21:408

P:\010Comp\Oracle8\521-1\CD\Ventura\book.vp

Friday, July 19, 2002 4:13:04 PM

Color profile: Generic CMYK printer profile

Composite Default screen

ORACLE Series TIGHT / Oracle9i: The Complete Reference / Loney, Koch / 222521-1 / Chapter 22

Blind Folio 22:409

CHAPTER

Accessing

Remote Data

P:\010Comp\Oracle8\521-1\CD\Ventura\book.vp

Friday, July 19, 2002 4:13:04 PM

Color profile: Generic CMYK printer profile

Composite Default screen

A s your databases grow in size and number, you will very likely need to share data

among them. Sharing data requires a method of locating and accessing the data. In

Oracle, remote data accesses such as queries and updates are enabled through the

use of database links. As described in this chapter, database links allow users to treat

a group of distributed databases as if they were a single, integrated database. In this

chapter, you will also find information about direct connections to remote databases, such as those

used in client-server applications.

Database Links

Database links tell Oracle how to get from one database to another. You may also specify the

access path in an ad hoc fashion (see “Dynamic Links: Using the SQLPLUS copy Command,”

later in this chapter). If you will frequently use the same connection to a remote database, then

a database link is appropriate.

How a Database Link Works

A database link requires that Oracle Net (previously known as SQL*Net and Net8) be running on

each of the machines (hosts) involved in the remote database access. Oracle Net is usually started

by the database administrator (DBA) or the system manager. A sample architecture for a remote

access using a database link is shown in Figure 22-1. This figure shows two hosts, each running

Oracle Net. There is a database on each of the hosts. A database link establishes a connection from

the first database (named LOCAL, on the Branch host) to the second database (named REMOTE, on

the Headquarters host). The database link shown in Figure 22-1 is located in the Local database.

Database links specify the following connection information:

■ The communications protocol (such as TCP/IP) to use during the connection

■ The host on which the remote database resides

■ The name of the database on the remote host

■ The name of a valid account in the remote database

■ The password for that account

410 Part II: SQL and SQL*Plus

ORACLE Series TIGHT / Oracle9i: The Complete Reference / Loney, Koch / 222521-1 / Chapter 22

Blind Folio 22:410

FIGURE 22-1. Sample architecture for a database link

P:\010Comp\Oracle8\521-1\CD\Ventura\book.vp

Friday, July 19, 2002 4:13:05 PM

Color profile: Generic CMYK printer profile

Composite Default screen

When used, a database link actually logs in as a user in the remote database, and then logs

out when the remote data access is complete. A database link can be private, owned by a single

user, or public, in which case all users in the Local database can use the link.

The syntax for creating a database link is shown in “Syntax for Database Links,” later in this

chapter.

Using a Database Link for Remote Queries

If you are a user in the Local database shown in Figure 22-1, you can access objects in the Remote

database via a database link. To do this, simply append the database link name to the name of any

table or view that is accessible to the remote account. When appending the database link name to

a table or view name, you must precede the database link name with an @ sign.

For local tables, you reference the table name in the from clause:

select *

from BOOKSHELF;

For remote tables, use a database link named REMOTE_CONNECT. In the from clause,

reference the table name followed by @REMOTE_CONNECT:

select *

from BOOKSHELF@REMOTE_CONNECT;

When the database link in the preceding query is used, Oracle will log in to the database

specified by the database link, using the username and password provided by the link. It will then

query the BOOKSHELF table in that account and return the data to the user who initiated the

query. This is shown graphically in Figure 22-2. The REMOTE_CONNECT database link shown

in Figure 22-2 is located in the Local database.

As shown in Figure 22-2, logging in to the Local database and using the REMOTE_CONNECT

database link in your from clause returns the same results as logging in directly to the remote database

and executing the query without the database link. It makes the remote database seem local.

NOTE

The maximum number of database links that can be used in a

single query is set via the OPEN_LINKS parameter in the database’s

initialization parameter file. This parameter defaults to four.

Chapter 22: Accessing Remote Data 411

ORACLE Series TIGHT / Oracle9i: The Complete Reference / Loney, Koch / 222521-1 / Chapter 22

Blind Folio 22:411

FIGURE 22-2. Using a database link for a remote query

P:\010Comp\Oracle8\521-1\CD\Ventura\book.vp

Friday, July 19, 2002 4:13:05 PM

Color profile: Generic CMYK printer profile

Composite Default screen

Thư viện tri thức trực tuyến

oracle 9i the complete reference phần 5 potx

Nội dung xem thử

Mô tả chi tiết

Tài liệu tương tự (6)

oracle 9i the complete reference phần 1 pps

oracle 9i the complete reference phần 7 doc

oracle 9i the complete reference phần 3 pps

oracle 9i the complete reference phần 6 doc

oracle 9i the complete reference phần 8 doc

oracle 9i the complete reference phần 2 ppsx