Thư viện tri thức trực tuyến
Kho tài liệu với 50,000+ tài liệu học thuật
© 2023 Siêu thị PDF - Kho tài liệu học thuật hàng đầu Việt Nam

Tài liệu Managing time in relational databases- P20 doc
Nội dung xem thử
Mô tả chi tiết
SELECT data
FROM mytable
WHERE SSN ¼ :my-ssn
AND eff_beg_dt <¼ :my-as-of-dt
AND eff_end_dt > :my-as-of-dt
AND asr_beg_dt <¼ :my-as-of-dt
AND assertion end date > :my-as-of-dt
AND circa_asr_flag IN (‘Y’, ‘N’)
In processing this query, a DB2 optimizer will first match on
SSN. After that, still using the index tree rather than a scan, it will
look aside for the effective end date under the ‘Y’ value for the
circa flag, and then repeat the process for the ‘N’ value. This uses
a matchcols of three; whereas without the IN clause, an index
scan would begin right after the SSN match. However, we only
recommend this for SQL where :my_as_of_dt is not guaranteed
to be Now(). When that as-of date is Now(), using the EQUALS
predicate ({circa_asr_flag ¼ ‘Y’}) will perform much better since
the ‘N’s do not need to be analyzed.
Query-enhancing indexes like these are not always needed.
For the most part, as we said earlier, these indexes are specifically designed to improve the performance of queries that are
looking for the currently asserted current versions of the objects
they are interested in, and in systems that require extremely high
read performance.
Indexes to Optimize Temporal Referential Integrity
Temporal referential integrity (TRI) is enforced in two directions. On the insert or temporal expansion of a child managed
object, or on a change in the parent object designated by its temporal foreign key, we must insure that the parent object is present in every clock tick in which the child object is about to be
present. On the deletion or temporal contraction of a parent
managed object, we must RESTRICT, CASCADE or SET NULL
that transformation so that it does not leave any “temporal
orphans” after the transaction is complete.
In this section, we will discuss the performance considerations involved in creating indexes that support TRI checks
on both parent and child managed objects.
Asserted Versioning’s Non-Unique Primary Keys
First, and most obviously, each parent table needs an index
whose initial column will be that table’s object identifier (oid).
The object identifier is also the initial column of the primary
366 Chapter 15 OPTIMIZING ASSERTED VERSIONING DATABASES
key (PK) of all asserted version tables. It is followed by two other
primary key components, the effective begin date and the assertion begin date.
We need to remember that these physical PKs do not explicitly define the logical primary keys used by the AVF because
the AVF uses date ranges and not specific dates or pairs of dates.
Because of this, a unique index on the primary key of an asserted
version table does not guarantee temporal entity integrity. These
primary keys guarantee physical uniqueness; they guarantee that
no two rows will have identical primary key values. But they do
not guarantee semantic uniqueness, because they do not prevent
multiple rows with the same object identifier from specifying
[overlapping] or otherwise [intersecting] time periods.
The PK of an asserted version table can be any column or
combination of columns that physically distinguish each row
from all the other rows in the table. For example, the PK could
be the object identifier plus a sequence number. It could be a
single surrogate identity key column. It could be a business key
plus the row create date. We have this freedom of choice because
asserted version tables more clearly distinguish between semantically unique identifiers and physically unique identifiers than
do conventional tables.
But this very freedom of choice poses a serious risk to any
business deciding to implement its own Asserted Versioning
framework. It is the risk of implementing Asserted Versioning’s
concepts one project at a time, one database at a time, one set
of queries and maintenance transactions at a time. It is the risk
of proliferating point solutions, each of which may work
correctly, but which together pose serious difficulties for queries
which range across two or more of those databases. It is the risk
of failing to create an enterprise implementation of bi-temporal
data management.
The semantically unique identifier for any asserted version
table is the combination of the table’s object identifier and its
two time periods. And to emphasize this very important point
once again: two pairs of dates are indeed used to represent two
time periods, but they are not equivalent to two time periods.
What turns those pairs of dates into time periods is the Asserted
Versioning code which guarantees that they are treated as the
begin and end delimiters for time periods.
Given that there should be one enterprise-wide approach for
Asserted Versioning primary keys, what should it be? First of all,
an enterprise approach requires that the PK of an asserted version table must not contain any business data. The reason is that
if business data were used, we could not guarantee that the same
Chapter 15 OPTIMIZING ASSERTED VERSIONING DATABASES 367