The system catalog, or the DB2 Catalog, offers a wealth of information about DB2. If the DB2 optimizer is the heart and soul of DB2, the DB2 Catalog is its memory. The knowledge base of every object known to DB2 is stored in the DB2 Catalog, along with the DB2 Directory and the BSDS (Bootstrap Data Set).
The tables in the DB2 Catalog collectively describe the objects and resources available to DB2. You can think of the DB2 Catalog as a metadata repository for your DB2 databases. As of Version 10, the DB2 Catalog is composed of 90 table spaces and 137 tables all in a single database named DSNDB06. These numbers have grown considerably since the early days of DB2. The DB2 Catalog consisted of 25 tables in 11 table spaces for the first version of DB2 and as recently as DB2 V8, there were only 21 table spaces and 87 tables. The following table runs down the history:
Over the course of the past couple releases, the DB2 Catalog has undergone many significant changes. For most of its life, the DB2 Catalog contained many multi-table table spaces. As of DB2 10 for z/OS, IBM made an effort to clean that up, and now only a few table spaces are in the DB2 Catalog with more than one table defined. As of V10, most of the table spaces in the DB2 Catalog are now universal table spaces. In addition, the DB2 Catalog now must be SMS-managed.
Even as many new tables have been added to the DB2 Catalog to support new features such as trusted context, XML, and access path management, some tables have been removed. The SYSPROCEDURES table, which was used to register stored procedures in earlier version of DB2, was removed as of DB2 V9. And the SYSLINKS table was removed for DB2 V10.
The SYSLINKS table was used to record the
links (or pointers) that existed in several of the older DB2 Catalog table
spaces (SYSDBASE, SYSPLAN, SYSDBAUT, SYSVIEW, SYSGROUP), as well as in the DB2
Directory (DBD01). Links were used to tie tables together hierarchically—not
unlike an IMS database—using a special type of relationship. However, links are
obsolete in DB2 as of V10.
Each DB2 Catalog table
maintains data about an aspect of the DB2 environment. In that respect, the DB2
Catalog functions as a data dictionary for DB2, supporting and maintaining data
about the DB2 environment. The DB2 Catalog records all the information required by DB2 for the
following functional areas:
- Database Objects: Storage groups, databases, table spaces, partitions, tables, auxiliary tables, columns, user-defined distinct types, views, synonyms, aliases, sequences, indexes, index keys, foreign keys, relationships, schemas, user-defined functions, stored procedures, triggers, and so on.
- Programs: Plans, packages, DBRMs, and Java/JAR information
- XML: XML Schema Repository tables
- Security: Database privileges, plan privileges, schema privileges, system privileges, table privileges, view privileges, use privileges, trusted contexts, roles, and audit policies
- Utility: Image copy data sets, REORG executions, LOAD executions, and object organization efficiency information
- Communication: How DB2 subsystems are connected for communication, data distribution, and DRDA usage
- Performance: Statistics, profiles, queries, and auto alerts
- Environmental: Control and administrative information (such as details on image copies and the dummy tables)
How does the DB2 Catalog
support data about these areas? For the most part, the tables of the DB2
Catalog cannot be modified using standard SQL data manipulation language
statements. You do not use INSERT statements, DELETE statements, or UPDATE statements (with a few exceptions) to modify these tables.
Instead, the DB2 Catalog operates as a semi-active, integrated, and non-subvertible
data dictionary. The definitions of these three adjectives follow.
First, the DB2 Catalog is
semi-active. An active dictionary is built, maintained, and used as the result
of the creation of the objects defined to the dictionary. In other words, as the
user is utilizing the intrinsic functions of the DBMS, metadata is being
accumulated and populated in the active data dictionary.
The DB2 Catalog, therefore, is
active in the sense that when standard DB2 SQL is issued, the DB2 Catalog is
either updated or accessed. All the information in the DB2 Catalog, however, is
not completely up-to-date, and some of the tables must be proactively populated
(such as SYSIBM.IPNAMES and SYSIBM.IPLIST). But, for the most part, the DB2 Catalog operates as an active data
dictionary, particularly with regard to SQL. Remember that the three types of SQL are DDL, DCL, and DML. When
DDL is issued to create DB2 objects such as databases, table spaces, and
tables, the pertinent descriptive information is automatically stored in the DB2 Catalog.
When a CREATE, DROP, or ALTER statement is issued,
information is recorded or updated in the DB2 Catalog. For example, upon
successfully issuing a CREATE TABLE statement, DB2 populates the metadata for the table into SYSTABLES and SYSCOLUMNS, as well as possibly into SYSSEQUENCES, SYSFIELDS, SYSCHECKS, and SYSCHECKDEP depending upon the exact DDL
that was issued.
The same is true for security
SQL data control language statements. The GRANT and REVOKE statements cause information to be added or removed from DB2
Catalog tables. For example, if you issue GRANT
TABLE, DB2 potentially adds metadata to SYSTABAUTH and SYSCOLAUTH.
Data manipulation language SQL
(SELECT, INSERT, UPDATE, MERGE, DELETE) statements use the DB2
Catalog to ensure that the statements accurately reference the DB2 objects being
manipulated (such as column names and data types).
Why then is the DB2 Catalog
classified as only semi-active rather than completely active? The DB2 Catalog
houses important information about the physical organization of DB2 objects.
For example, the following information is maintained in the DB2 Catalog:
- The number of rows in a given DB2 table or a given DB2 table space
- The number of distinct values in a given DB2 index
- The physical order of the rows in the table for a set of keys
This information is populated
by means of the DB2 RUNSTATS utility. A truly active data dictionary would update this
information as data is populated in the application table spaces, tables, and
indexes. Some of these statistics are now actively populated in the Real Time
Statistics table in the DB2 Catalog, making them active. But because some of
the information in the DB2 Catalog is not always completely up-to-date, it is
only a semi-active system catalog.
I also decsribed the DB2 Catalog as being integrated. The DB2 Catalog and the DB2 DBMS are inherently
bound together, neither having purpose or function without the other. The DB2
Catalog without DB2 defines nothing; DB2 without the DB2 Catalog has nothing
defined that it can operate on.
The final adjective used to
classify the DB2 Catalog is non-subvertible. This simply means that the DB2
Catalog is continually updated as DB2 is being used; the most important metadata
in the DB2 Catalog cannot be updated behind DB2’s back. Suppose that you
created a table with 20 columns. You cannot subsequently update the DB2 Catalog
to indicate that the table has 15 columns instead of 20 without using standard
DB2 data definition language SQL statements to drop and re-create the table.
An
Exception to the Rule
As with most things in life,
there are exceptions to the basic rule that the SQL data manipulation language
cannot be used to modify DB2 Catalog tables. You can modify columns (used by
the DB2 optimizer) that pertain to the physical organization of table data.
Querying the DB2 Catalog
Because the DB2 Catalog consists of DB2 tables, you can write SQL queries to easily retrieve the metadata information about your DB2 environment. You can write queries to discover all sorts of interesting and useful information about DB2 across the following broad categories:
- Navigational queries, which
help you to maneuver through the sea of DB2 objects in your DB2 subsystems
- Physical analysis queries, which depict the physical state of your
application table spaces and indexes
- Queries that aid programmers (and other analysts) in identifying the
components of DB2 packages and plans
- Application efficiency queries, which combine DB2 Catalog statistics
with the PLAN_TABLE output from EXPLAIN to identify problem queries quickly
- Authorization queries, which identify the authority implemented for
each type of DB2 security
- Historical queries, which use the DB2 Catalog HIST tables to identify and
monitor changing data patterns
- Partition statistics queries, which aid the analysis of partitioned table spaces
Summary
The DB2 Catalog is a rich source of information about your DB2 subsystem and
applications. Be sure to use it to simplify your DB2 development
and administrative efforts.
Note: This blog post was adapted from material in the sixth and latest edition of Craig's book, DB2 Developer's Guide.
No comments:
Post a Comment