Today's blog post is just a quick one to remind everybody about the 2014 IDUG DB2 Tech Conference coming up next week in Phoenix, Arizona.
As usual, there will be a plethora of useful information shared at the conference - as well as the opportunity to meet with IBM developers, gold consultants, and DBAs from all over the country... and even the world!
There are complimentary workshop sessions scheduled for Friday on a variety of topics and you will probably also want to sign up for one of the educational seminars on Monday. And don't forget to take advantage of the free IBM certification exams offered up at every IDUG conference.
This year's keynote presentation will be delivered by noted Gartner analyst Donald Feinberg, who is responsible for Gartner's research on database management systems, data warehousing infrastructure and Big Data.
And just check out the impressive agenda of DB2 presentations that will be available at this year's IDUG!
Add to all of the above an array of Special Interest Groups, the DB2 z/OS and LUW panel sessions (where you can ask questions of the developers), and let's not forget the vendor exhibition where you can check out the latest products and services for DB2... and it all adds up to IDUG being the place to be next week if you have anything to do with DB2. I know I'll be there...
...and I hope to see you there, too.
Friday, May 09, 2014
Monday, May 05, 2014
Using Check Constraints to Simulate Domains
Check constraints are a very useful, though somewhat underutilized feature of DB2. Check constraints enable enhanced data integrity without requiring procedural logic (such as in stored procedures and triggers). Let's examine the basics of table check constraints.
A constraint is basically a restriction placed upon the data values that can be stored in a column or columns of a table. Of course, most RDBMS products provide several different types of constraints, such as referential constraints (to define primary and foreign keys) and unique constraints (to prohibit duplicates).
Check constraints place specific data value restrictions on the contents of a column through the specification of a Boolean expression. The expression is explicitly defined in the table DDL and is formulated in much the same way that SQL WHERE clauses are formulated. Any attempt to modify the column data (i.e., during INSERT and UPDATE processing) will cause the expression to be evaluated. If the modification conforms to the Boolean expression, the modification is permitted to continue. If not, the statement will fail with a constraint violation.
This functionality is great for simulating the relational concept of a domain. A domain is basically the set of valid values that a column or data type can take on. Check constraints only simulate domains, though, because there are other features provided by domains that are not provided by check constraints. One such feature is that columns drawn from separate domains must not be compared or operated on by expressions that require the same type of data for both operands. For domains to be truly supported, the DBMS should provide both check constraints and user-defined data types with strong type checking. This prohibits nonsensical operations, such as comparing IQ to shoe size or adding Australian dollars to Euros.
Forming Check Constraints
Check constraints are written using recognizable SQL syntax. This makes them easy to implement for anyone who has even a passing familiarity with SQL. The check constraint consists of two components: a constraint name and a check condition.

The constraint name is an SQL identifier and is used to reference or identify the constraint. The same constraint name cannot be specified more than once for the same table. If a constraint name is not explicitly coded, DB2 will create a unique name automatically for the constraint.
The check condition defines the actual constraint logic. The check condition can be defined using any of the basic predicates (>, <, =, <>, <=, >=), as well as BETWEEN, IN, LIKE, and NULL. Furthermore, AND and OR can be used to string conditions together.
There are, however, restrictions on how check constraints are formulated. Some of these restrictions include:

- Limitations on the entire length of the check condition.
- Other tables may not be accessed in the check condition.
- Only a limited subset of SQL operations are permitted (for example, subselects and column functions are prohibited in a check constraint).
- One of the operands (usually the first) of the check constraint must be the name of a column contained in the table for which the constraint is defined.
- The other operand (usually the second) must be either another column name in the same table or a constant value.
- If the second operand is a constant, it must be compatible with the data type of the first operand. If the second operand is a column, it must be the same data type as the first column specified.
Check Constraint Examples
Check constraints enable the DBA or database designer to specify more robust data integrity rules directly into the database. Consider the following example:
CREATE TABLE EMP
  (EMPNO       INTEGER
      CONSTRAINT CHECK_EMPNO
      CHECK (EMPNO BETWEEN 100 AND 25000),
   EMP_ADDRESS VARCHAR(70),
   EMP_TYPE    CHAR(8)
      CHECK (EMP_TYPE IN ('TEMP', 'FULLTIME', 'CONTRACT')),
   EMP_DEPT    CHAR(3) NOT NULL WITH DEFAULT,
   SALARY      DECIMAL(7,2) NOT NULL
      CONSTRAINT CHECK_SALARY
      CHECK (SALARY < 50000.00),
   COMMISSION  DECIMAL(7,2),
   BONUS       DECIMAL(7,2)
  );
The CREATE statement for the EMP table contains three different check constraints:

1. The name of the first check constraint for the EMP table is CHECK_EMPNO. It is defined on the EMPNO column. The constraint ensures that the EMPNO column can contain values that range from 100 to 25000 (instead of the domain of all valid integers).

2. The second check constraint for this table is on the EMP_TYPE column. This is an example of an unnamed constraint. Though this is possible, it is not recommended. It is best to always provide an explicit constraint name in order to ease identification and administration. This specific constraint restricts the values that can be placed into EMP_TYPE to 'TEMP', 'FULLTIME', and 'CONTRACT'; no other values would be accepted.

3. The last check constraint on this table is named CHECK_SALARY. It effectively ensures that no employee can be entered with a salary of more than $50,000. (Now who would want to work there?)
Column vs. Table Level Constraints
The first check constraint example we reviewed showed a column-level check constraint. However, check constraints also may be coded at the table level. A column-level check constraint is defined in the DDL immediately after the column. Appropriately enough, a table-level check constraint is defined after all of the columns of the table have already been defined.
It is quite common for business rules to require access to multiple columns within a single table. When this situation occurs, it is wise to code the business rule into a check constraint at the table level, instead of at the column level. Of course, any column-level check constraint can also be defined at the table level, as well. In terms of functionality, there is no difference between an integrity constraint defined at the table level and the same constraint defined at the column level.
Let's augment our sample table DDL to add two table-level check constraints:
CREATE TABLE EMP
  (EMPNO       INTEGER
      CONSTRAINT CHECK_EMPNO
      CHECK (EMPNO BETWEEN 100 AND 25000),
   EMP_ADDRESS VARCHAR(70),
   EMP_TYPE    CHAR(8)
      CHECK (EMP_TYPE IN ('TEMP', 'FULLTIME', 'CONTRACT')),
   EMP_DEPT    CHAR(3) NOT NULL WITH DEFAULT,
   SALARY      DECIMAL(7,2) NOT NULL
      CONSTRAINT CHECK_SALARY
      CHECK (SALARY < 50000.00),
   COMMISSION  DECIMAL(7,2),
   BONUS       DECIMAL(7,2),
   CONSTRAINT COMM_VS_SALARY
      CHECK (SALARY > COMMISSION),
   CONSTRAINT COMM_BONUS
      CHECK (COMMISSION > 0 OR BONUS > 0)
  );
The CREATE statement for the EMP table has been modified to contain two table-level check constraints having the following ramifications:

1. The name of the first table-level check constraint for the EMP table is COMM_VS_SALARY. This constraint will ensure that no employee can earn more commission than salary.

2. The second table-level check constraint is named COMM_BONUS. This constraint will ensure that every employee either earns a commission or a bonus (or possibly, both).
Check Constraint Benefits
So what are the benefits of check constraints? The primary benefit is the ability to enforce business rules directly in each database without requiring additional application logic. Once defined, the business rule is physically implemented and cannot be bypassed. Check constraints also provide the following benefits:
- Because there is no additional programming required, DBAs can implement check constraints without involving the application programming staff. This effectively minimizes the amount of code that must be written by the programming staff. With the significant application backlog within most organizations, this can be the most crucial reason to utilize check constraints.
- Check constraints provide better data integrity. Because check constraints are always executed whenever the data in the column upon which they are defined is to be modified, the business rule is not bypassed during ad hoc processing and dynamic SQL. When business rules are enforced using application programming logic instead, the rules cannot be checked during ad hoc processes.
- Check constraints promote consistency. Because they are implemented once, in the table DDL, each constraint is always enforced. Constraints written in application logic, on the other hand, must be executed within each program that modifies any data to which the constraint applies. This can cause code duplication and inconsistent maintenance, resulting in inaccurate business rule support.
- Typically, check constraints coded in DDL will outperform the corresponding application code.
The overall impact of check constraints will be to increase application development productivity while at the same time promoting higher data integrity.
Check Constraints, NULLs, and Defaults
An additional consideration for check constraints is the relational NULL. Any nullable column also defined with a check constraint can be set to null. When the column is set to null, the check constraint evaluates to unknown. Because null indicates the lack of a value, the presence of a null will not violate the check constraint.
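To illustrate, consider a quick hypothetical example (the table and column names here are made up purely for demonstration):

CREATE TABLE PROJ
  (PROJNO     CHAR(6) NOT NULL,
   PROJ_SIZE  INTEGER
      CONSTRAINT CHECK_PROJ_SIZE
      CHECK (PROJ_SIZE BETWEEN 1 AND 100)
  );

INSERT INTO PROJ (PROJNO, PROJ_SIZE)
VALUES ('P00001', NULL);

The INSERT succeeds even though NULL is not between 1 and 100, because the check condition evaluates to unknown rather than false.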
Additionally, DB2 provides the ability to specify defaults for table columns – both system-defined defaults (pre-defined and automatically set by the DBMS) and user-defined defaults. When a row is inserted or loaded into the table and no value is specified for the column, the column will be set to the value that has been identified in the column default specification. For example, we could define a default for the EMP_TYPE column of our sample EMP table as follows:

EMP_TYPE CHAR(8) DEFAULT 'FULLTIME'
   CHECK (EMP_TYPE IN ('TEMP', 'FULLTIME', 'CONTRACT')),

If a row is inserted without specifying an EMP_TYPE, the column will default to the value 'FULLTIME'.
A problem can arise when using defaults with check constraints. Most DBMS products do not perform semantic checking on constraints and defaults. The DBMS, therefore, will allow the DBA to define defaults that contradict check constraints. Furthermore, it is possible to define check constraints that contradict one another. Care must be taken to avoid creating this type of problem.

Examples of contradictory constraints are depicted below:
CHECK (EMPNO > 10 AND EMPNO < 9)
In this case, no value is both greater than 10 and less than 9, so nothing could ever be inserted.
EMP_TYPE CHAR(8) DEFAULT 'NEW'
   CHECK (EMP_TYPE IN ('TEMP', 'FULLTIME', 'CONTRACT')),
In this case, the default value is not one of the permitted EMP_TYPE values according to the defined constraint. No defaults would ever be inserted.
CHECK (EMPNO > 10)
CHECK (EMPNO >= 11)
In this case, the constraints are redundant. No logical harm is done, but both constraints will be checked, thereby impacting the performance of applications that modify the table in which the constraints exist.
Other potential semantic problems could occur if a constraint contradicts a referential integrity DELETE or UPDATE rule, if two constraints are defined on the same column with contradictory conditions, or if the constraint requires that the column be NULL, but the column is defined as NOT NULL.
Other Potential Hazards
Take care when using the LOAD utility on a table with check constraints defined on it. By specifying the ENFORCE NO parameter you can permit DB2 to load data that does not conform to the check constraints (as well as the referential constraints). Although this eases the load process by enabling DB2 to bypass constraint checking, it will place the table space into a check pending state. You can run CHECK DATA to clear this state (or force the check pending state off by using START with the FORCE option or the REPAIR utility). If you do not run CHECK DATA, constraint violations may occur, causing dirty data.
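As a rough sketch of the process (the data set and object names here are hypothetical), the utility statements would look something like this:

LOAD DATA INDDN SYSREC
  ENFORCE NO
  INTO TABLE EMP

CHECK DATA TABLESPACE DSNDB04.TSEMP
  SCOPE PENDING

Running CHECK DATA after the LOAD identifies any rows that violate the constraints so that the check pending state can be reset.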
Summary
Check constraints provide a very powerful vehicle for supporting business rules in the database. They can be used to simulate relational domains. Because check constraints are non-bypassable, they provide better data integrity than corresponding logic programmed into the application. It is a wise course of action to use check constraints in your database designs to support data integrity, domains, and business rules in all of your relational database applications.
Monday, April 21, 2014
A Little Bit About LOBs
In today's blog post we will take a brief look at LOBs, or Large OBjects, in DB2. I have been prepping for my webinar later this week, titled Bringing Big Data to DB2 for z/OS with LOBs: Understanding, Using, and Managing DB2 LOBs. Be sure to click on the link for that and join me on April 24, 2014 for that webinar!
But back to the topic du jour... LOBs. Let's start with a bit of history. LOBs were added to
relational products like DB2, back in the 1990s, ostensibly to compete more
capably against the Object-Oriented databases of the time. Remember them? Back then it seemed that everybody thought OO DBMS products would supplant relational technology. Seems somewhat quaint now, doesn't it?
At any rate, the idea was to enable relational tables to house unstructured
data like images and audio and video data. DB2 for z/OS users were slow to
adopt LOBs in their mainframe databases. I think that is due to several
different reasons, not the least of which is that it took several new versions
of DB2 for LOBs to mature into capable, usable things for enterprise
applications. Early implementations of LOBs in DB2 were somewhat kludgy
and difficult to administer and use. But IBM has corrected many of the
deficiencies over time and there are now tools that can help us to effectively
manage DB2 LOBs, too.
So LOBs have been around for awhile now, but it has taken some time for them to gain significant levels of usage "out there!"
Why LOBs, you may ask? Why not just store the data in regular old DB2 data types like VARCHAR or VARGRAPHIC? The basic answer is that DB2's data types were not large enough to hold this amount of data,
because of their limit of 32 KB. Some multimedia data can get VERY large. For example, a high resolution video requires about 3 GB for each hour. And high-def TV video requires 720 GB/hour!
But an additional part of the answer is that you do not always want to access the LOB data with the traditional data. For example, if you are querying employee information you probably don't always want their photo or their entire resume... so the data can be stored in a LOB, external from the rest of the data (so as not to impede performance) but connected to the data (when you need to access it).
There are three types of LOBs supported by DB2:

1. BLOB – Binary Large Object – for binary data
2. CLOB – Character Large Object – for text data
3. DBCLOB – Double Byte Character Large Object – for graphic character data
Furthermore, there are two options for specifying LOBs in DB2 tables:
- The traditional way to define LOBs where the LOB data is stored separately from the rest of the data. This requires defining additional database objects.
- A newer method, for smaller LOBs, introduced with DB2 10, called Inline LOBs.
Using the traditional method, LOBs are
defined such that they are stored external to the base table. The column is specified as a BLOB, CLOB or
DBCLOB in the base table, but DB2 stores the actual LOB data in an auxiliary
table in a LOB table space. You must define a LOB table space per LOB per
partition. So, let’s say we are creating a table with 2 LOBs in a table space
with ten (10) partitions. That means we will need to define 20 LOB table spaces
– 10 for the first LOB (one per partition) and 10 for the second LOB (one for
each partition). An auxiliary table is defined for each LOB table space to hold
the LOB data. And an auxiliary index is required on each auxiliary table.
Now back to the base table. Remember that we have defined the LOB columns in the base table. But we also need to include a ROWID column in the base table. Only one ROWID column is needed per base table no matter how many LOBs you have defined in the table.
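To make the moving parts a bit more concrete, here is a rough sketch of the DDL involved for a single CLOB column in a non-partitioned table (all of the object names below are hypothetical, and storage-related clauses are omitted):

CREATE TABLE EMP_RESUME
  (EMPNO      CHAR(6) NOT NULL,
   EMP_ROWID  ROWID NOT NULL GENERATED ALWAYS,
   RESUME     CLOB(1M)
  ) IN DBLOBX.TSEMPRES;

CREATE LOB TABLESPACE TSRESLOB
  IN DBLOBX;

CREATE AUX TABLE EMP_RESUME_AUX
  IN DBLOBX.TSRESLOB
  STORES EMP_RESUME
  COLUMN RESUME;

CREATE UNIQUE INDEX XEMPRESAUX
  ON EMP_RESUME_AUX;

For a partitioned base table you would repeat the LOB table space, auxiliary table, and auxiliary index definitions for each partition, as described above.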
Don’t confuse the ROWID with other concepts that may seem to be similar. A
ROWID is not an identity column nor is it a SEQUENCE. The ROWID is used by DB2
behind the scenes to connect the base table row to the auxiliary table row that
contains the LOB data. You generally will not access the ROWID, although it is
possible to do so just like any other column using SQL. With the ROWID you can
perform direct row access. This
is so because the row ID value implicitly contains the location of the row.
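For example, if the ROWID value was saved into a host variable on a prior SELECT, a later lookup (using the hypothetical names from the sketch above) might look like this, and DB2 can use direct row access to locate the row:

SELECT RESUME
FROM   EMP_RESUME
WHERE  EMP_ROWID = :HV_ROWID;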
Finally, each LOB column also has a 2-byte version number associated with it. This is used by DB2 when LOB data is modified. You cannot access the version number.
Now as of DB2 10 for z/OS, you can create inline LOBs in your DB2 tables. But what is an inline LOB? Well, at a high level, the name is somewhat self-explanatory. The LOB data for inline LOBs is stored with the rest of the base table.
But it is really a bit more nuanced than that. You can store part of the LOB inline with the rest of the data and the rest of the LOB externally, if you so choose. So an inline LOB can have all of the LOB data stored inline with the rest of the data (if it is small enough), or it can store only a portion with the rest of the data. This can be a great technique to use if you have some applications that require perhaps only the first 500 bytes of a larger LOB. Those 500 bytes can be stored inline – with the rest of the data – while the rest is stored externally and accessed only when needed. You can create an inline LOB by specifying the INLINE LENGTH clause on your CREATE TABLE statement.
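For instance, a column definition along these lines (again using hypothetical names) would keep the first 500 bytes of each LOB value in the base table row, with anything beyond that stored in the auxiliary table:

CREATE TABLE EMP_RESUME
  (EMPNO      CHAR(6) NOT NULL,
   EMP_ROWID  ROWID NOT NULL GENERATED ALWAYS,
   RESUME     CLOB(1M) INLINE LENGTH 500
  ) IN DBLOBX.TSEMPRES;

LOB values that fit entirely within the inline length never touch the auxiliary table at all.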
Inline LOBs can improve performance depending upon the type of access required. Consider the example where RESUME data is stored in a CLOB. The first 200 bytes of the resume are accessed most of the time, with the entire resume being accessed only during rare instances (e.g., interviews, performance reviews, etc.). By storing those first 200 bytes inline with the rest of the data we can eliminate the I/O to the LOB in the auxiliary table, thereby improving performance. External LOB data is not buffered, but inline LOB data is – this too can impact performance.
This blog entry is getting a little longer than I was expecting, so I am going to cut it off here. We've reviewed what LOBs are, a little bit of their history, and discussed a bit about both traditional and inline LOBs. To learn more, be sure to join me for the webinar on April 24th (or watch it later - it will be recorded).
Monday, April 14, 2014
Aggregating Aggregates Using Nested Table Expressions
Sometimes when you are writing your SQL to access data you come across the need to work with aggregates. Fortunately, SQL offers many simple ways of aggregating data. But what happens when you uncover the need to perform aggregations of aggregates?
What does that mean? Well, consider an example. Let's assume that you want to compute
the average of a sum. This is a reasonably common requirement that comes up frequently in applications that are built
around sales amounts. Let's say that we have a table containing sales
information, where each sales amount has additional information indicating the
salesman, region, district, product, date, etc.
A common requirement is to produce a report of the average
sales by region for a particular period, say the first quarter of 2014. But the
data in the table is at a detail level, meaning we have a row for each specific
sale.
A novice SQL coder might try to write a query with a
function inside a function, like AVG(SUM(SALE_AMT)). Of course, this is invalid
SQL syntax. DB2 will not permit the nesting of aggregate functions. But we can
use nested table expressions and our knowledge of SQL functions to build the
correct query.
Let’s start by creating a query to return the sum of all
sales by region for the time period in question. That query should look
something like this:
SELECT   REGION,
         SUM(SALE_AMT) AS TOTAL_SALES
FROM     SALES
WHERE    SALE_DATE BETWEEN DATE('2014-01-01')
                       AND DATE('2014-03-31')
GROUP BY REGION;
Now that we have the total sales by region for the period in
question, we can embed this query into a nested table expression in another
query like so:
SELECT   NTE.REGION,
         AVG(NTE.TOTAL_SALES)
FROM     (SELECT   REGION,
                   SUM(SALE_AMT) AS TOTAL_SALES
          FROM     SALES
          WHERE    SALE_DATE BETWEEN DATE('2014-01-01')
                                 AND DATE('2014-03-31')
          GROUP BY REGION) AS NTE
GROUP BY NTE.REGION;
And voila! We have aggregated an aggregate (averaged a sum)...
Sunday, April 06, 2014
DB2 Buffer Pool Monitoring
After setting up your buffer pools, you will want to
regularly monitor your configuration for performance. The most rudimentary way
of doing this is using the -DISPLAY BUFFERPOOL command. There are many options
of the DISPLAY command that can be used to show different characteristics of
your buffer pool environment; the simplest is the summary report, requested as
follows:
-DISPLAY BUFFERPOOL(BP0) LIST(*) DBNAME(DSN8*)
And a truncated version of the results will look something
like this:
DSNB401I - BUFFERPOOL NAME BP0,
BUFFERPOOL ID 0, USE COUNT 20
DSNB402I - VIRTUAL BUFFERPOOL
SIZE = 500 BUFFERS 736
ALLOCATED = 500 TO BE DELETED = 0
IN-USE/UPDATED = 0
DSNB404I - THRESHOLDS - 739
VP SEQUENTIAL = 80
HP SEQUENTIAL = 75
DEFERRED WRITE = 85
VERTICAL DEFERRED WRT = 80,0
PARALLEL SEQUENTIAL = 50
ASSISTING PARALLEL SEQT = 0
Of course, you can request much more information to be
displayed using the DISPLAY BUFFERPOOL command by using the DETAIL parameter.
Additionally, you can request that DB2 return either incremental statistics
(since the last DISPLAY) or cumulative statistics (since DB2 was started). The
statistics in a detail report are grouped in the following categories: GETPAGE
information, Sequential Prefetch information, List Prefetch information,
Dynamic Prefetch information, Page Update statistics, Page Write statistics,
and Parallel Processing Activity details.
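For example, a detail report can be requested with something like the following, where DETAIL(INTERVAL) returns the incremental statistics and DETAIL(*) returns the cumulative statistics:

-DISPLAY BUFFERPOOL(BP0) DETAIL(INTERVAL)
-DISPLAY BUFFERPOOL(BP0) DETAIL(*)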
A lot of interesting and useful details can be gathered
simply using the DISPLAY BUFFERPOOL command. For example, you can review
GETPAGE requests for random and sequential activity, number of prefetch
requests (whether static or dynamic, or for sequential or list prefetch),
number of times each of the thresholds were tripped, and much more. Refer
to the DB2 Command Reference manual (SC19-4054-02 for DB2 11) for a definition of each of the actual
statistics returned by DISPLAY BUFFERPOOL.
Many organizations also have a performance monitor (such as IBM’s Omegamon) that simplifies the gathering and display
of buffer pool statistics. Such tools are highly recommended for in-depth
buffer pool monitoring and tuning. More sophisticated tools also exist that
offer guidance on how to tune your buffer pools — or that automatically adjust
your buffer pool parameters according to your workload. Most monitors also
provide more in-depth statistics, such as buffer pool hit ratio calculations.
The buffer pool hit ratio is an important tool for
monitoring and tuning your DB2 buffer pools. It is calculated as follows:
Hit ratio = (GETPAGES - pages_read_from_disk) / GETPAGES
“Pages read from disk” is a calculated field that is the sum
of all random and sequential reads.
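For example, using hypothetical numbers: if a buffer pool services 1,000,000 GETPAGE requests during an interval while 50,000 pages are read from disk (random plus sequential), the hit ratio is (1,000,000 - 50,000) / 1,000,000, or 0.95.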
The highest possible buffer pool hit ratio is 1.0. This
value is achieved when each requested page is always in the buffer pool. When
requested pages are not in the buffer pool, the hit ratio will be lower. You
can have a negative hit ratio — this just means that prefetch was requested to
bring pages into the buffer pool that were never actually referenced.
In general, the higher the hit ratio the better because it
indicates that pages are being referenced from memory in the buffer pool more
often. Of course, a low hit ratio is not always bad. The larger the amount of
data that must be accessed by the application, the lower the hit ratio will
tend to be. Hit ratios should be monitored in the context of the applications
assigned to the buffer pool and should be compared against hit ratios from
prior processing periods. Fluctuation can indicate problems.
Sunday, March 30, 2014
DB2 Buffer Pool Sizing
DB2 very efficiently manages data in large buffer pools. Searching for data in a large buffer pool does not consume any more resources than searching in a smaller buffer pool. With this in mind, do not skimp on allocating memory to your DB2 buffer pools.
So, just how big should you make each buffer pool? One rule of thumb is to make the buffer pool large enough to hold five minutes worth of randomly read pages during peak processing time. To clarify: the goal is that once a page has been read from disk into the buffer pool, it should remain in memory for at least five minutes. The idea is that a page, once read, is likely to be needed again by another process during peak periods. Of course, your particular environment may differ.
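As a purely hypothetical illustration of the rule of thumb: if your peak workload randomly reads about 2,000 pages per second into a 4K buffer pool, five minutes of activity works out to roughly 2,000 x 300 = 600,000 buffers, or about 2.3 GB of buffer pool storage.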
If you have metrics on how frequently a piece of data, once read, will need to be read again (and how soon), you can use that information to tinker with your buffer pool size and parameters to optimize buffer pool data residency.
But it can be difficult to know how much data is read at peak processing time, or even when peak processing time will be, before an application is coded. You can gather estimates from the subject matter experts, end users, and application designers, but they will just be estimates. Over time you will have to examine buffer pool usage and perhaps re-size your buffer pools, or move objects from one buffer pool to another, to improve performance.
Of course, you will want to make sure that you have not
allocated so much memory to DB2 buffer pools that the system starts paging. When
there is not enough real storage to back the buffer pool storage, paging
activity will cause performance to degrade.
What is paging? Paging occurs when the virtual storage requirements for a buffer pool exceed the real storage capacity of the z/OS image. When this happens, DB2 migrates the least recently used pages in the buffer pool to auxiliary storage. If the data that was migrated is then accessed, those pages must be brought back into real storage from auxiliary storage. When you detect that DB2 is paging you should either increase the amount of real storage or decrease the size of your buffer pools.
Only DBAs or systems programmers should be allowed to add or
modify the size of DB2 buffer pools. And
these qualified professionals should be able to analyze the system to know the
amount of memory available (real and virtual), as well as the amount being used by other system
software and applications.
For a good overview of mainframe virtual storage consult the following link from IBM.
Friday, March 21, 2014
DB2 Tool Requirements
The last blog post here at the DB2 Portal offered up a brief overview of the types of tools that you might want to consider to help you use, manage, and administer your DB2 applications and databases. But it did not really look into the capabilities and requirements for modern DB2 tools and solutions.
Today's DB2 management and administration tools should provide intelligent automation to reduce the problems inherent in the tedious day-to-day tasks of database administration. Simple automation is no longer sufficient. Modern data management software must be able to intelligently monitor, analyze, and optimize applications using past, present, and future analysis of collected data. Simply stated, the software should work the way a consultant works, fulfilling the role of a trusted advisor and enabling your precious human resources to spend time on research, strategy, planning, and implementing new and advanced features and technologies, instead of rote day-to-day tasks.
Furthermore, modern database tools should provide cross-platform, heterogeneous management. For most medium-to-large IT organizations it is not enough to manage just DB2 for z/OS systems, for example. Tools need the ability to offer administrative and development assistance across multiple DBMS platforms (for example, DB2 for LUW, Oracle, SQL Server, MySQL, and so on). Most companies have multiple DBMSs that need to be managed -- not just one... and DBAs and developers are precious resources that increasingly are being asked to work on more than just a single DBMS. When tools can manage across platforms, the learning curve is reduced and productivity can be enhanced.
And while it is true that today’s DBMS products are becoming more self-managing, they do not yet provide out-of-the-box, lights-out operation, nor do they offer all of the speed, usability, and ease of use features of ISV admin, management, and development tools. An organization looking to provide 24/7 data availability coupled with efficient performance will have to augment the capabilities of their DBMS software with data management and DBA tools to get the job done.
As data management tasks get more complex and DBAs become harder to find and retain, more and more database maintenance duties should be automated using intelligent management software. Using intelligent, automated DB2 tools will help to reduce the amount of time, effort, and human error associated with implementing and managing efficient database applications.
Monday, March 17, 2014
Types of DB2 Tools
As a user of DB2, which I'm guessing you are since you are reading this blog, you should always be on the lookout for useful tools
that will help you achieve business value from your investment
in DB2. There are several categories of tools that can help you to achieve this
value.
Database Administration and Change Management tools simplify
and automate tasks such as creating database objects, examining existing
structures, loading and unloading data, and making changes to databases.
Without an administration tool these tasks require intricate, complex scripts
to be developed and run. One of the most important administration tools is the
database change manager. Without a robust, time-tested product that is designed
to effect database changes, database changes can be quite time-consuming and
error prone. A database change manager automates the creation and execution of
scripts designed to implement required changes – and will ensure that data
integrity is not lost.
One of the more important categories of DB2 tools offers Performance
Management capabilities. Performance tools help to gauge the
responsiveness and efficiency of SQL queries, database structures, and system
parameters. Performance management tools should be able to examine and improve
each of the three components of a database application: the DB2 subsystem, the
database structures, and the application programs. Advanced performance tools
can take proactive measures to correct problems as they happen.
Backup and Recovery tools simplify the process of creating
backups and recovering from those backup copies. By automating complex
processes, simulating recovery, and implementing disaster recovery procedures
these tools can be used to assure business resiliency, with no data being lost
when the inevitable problems arise.
Another important category of DB2 tool is Utilities
and Utility Management. A utility is a single purpose tool for moving
and/or verifying database pages; examples include LOAD, UNLOAD, REORG, CHECK,
COPY, and RECOVER. Tools that implement and optimize utility processing, as
well as those that automate and standardize the execution of DB2 utilities, can
greatly improve the availability of your DB2 applications. You might also want to consider augmenting your utilities with a database archiving solution that moves data back and forth between your database and offline storage.
Governance and Compliance tools deliver the ability to protect
your data and to assure compliance with industry and governmental regulations,
such as HIPAA, Sarbanes-Oxley, and PCI DSS. In many cases business executives have to vouch for the accuracy of their company's data and attest that the proper controls are in place to comply with required regulations. Governance
and compliance tools can answer questions like “who did what to which data
when?” that are nearly impossible to otherwise answer.
And finally, Application Management tools help
developers improve application performance and speed time-to-market. Such tools
can improve database and program design, facilitate application testing including
the creation and management of test data, and streamline application data management
efforts.
Tools from each of these categories can go a long way toward helping your organization excel at managing and accessing data in your DB2 databases and applications...
Thursday, March 06, 2014
What Makes DB2 Tick?
Conceptually, DB2 is a relational database management system. Actually, some might object to this term, instead calling DB2 a SQL DBMS, because it does not conform exactly to Codd's relational model. Physically, DB2 is an amalgamation of address spaces and
intersystem communication links that, when adequately tied together, provide
the services of a database management system.
"What does all this information have to do with me?" you
might wonder. Well, understanding the components of a piece of software helps you use
that software more effectively. By understanding the physical layout of DB2,
you can arrive at system solutions more quickly and develop SQL that performs
better.
This blog entry will not get very technical and won't delve into the bits and bytes of DB2. Instead, it presents the basic
architecture of a DB2 subsystem and information about each subcomponent of that
architecture.
Each DB2 subcomponent is composed of smaller units called CSECTs.
A CSECT performs a single logical function. Working together, a bunch of CSECTs
provide general, high level functionality for a subcomponent of DB2. DB2 CSECT
names begin with the characters DSN.
There are three major subcomponents of DB2:
- System services (SSAS)
- Database services (DBAS)
- Distributed Data Facility services (DDF).
The SSAS, or System Services Address Space, coordinates the
attachment of DB2 to other subsystems (CICS, IMS/TM, or TSO). SSAS is also
responsible for all logging activities (physical logging, log archival, and
BSDS). DSNMSTR is the default name for this address space. (The address spaces
may have been renamed at your shop.) DSNMSTR is the started task that contains
the DB2 log. The log should be monitored regularly for messages indicating errors or problems with DB2. Products are available that monitor the log for
problems and trigger an event to contact the DBA or systems programmer when a
problem is found.
The DBAS, or Database Services Address Space, provides the facility
for the manipulation of DB2 data structures. The default name for this address
space is DSNDBM1. This component of DB2 is responsible for the execution of SQL
and the management of buffers, and it contains the core logic of the DBMS.
Database services use system services and z/OS to handle the actual databases
(tables, indexes, etc.) under the control of DB2. Although DBAS and SSAS
operate in different address spaces, they are interdependent and work together
as a formal subsystem of z/OS.
The DBAS can be further broken down into three components, each of
which performs specific data-related tasks:
- Relational Data System (RDS)
- Data Manager (DM)
- Buffer Manager (BM).
The Buffer Manager handles the movement of data from disk to
memory; the Data Manager handles the application of Stage 1 predicates and
row-level operations on DB2 data; and the Relational Data System, or Relational
Data Services, handles the application of Stage 2 predicates and set-level
operations on DB2 data.
Figure 1. The components of the Database Services Address Space.
The next DB2 address space, DDF, or Distributed Data Facility services, is optional. DDF is required only when you
want distributed database functionality. If your shop must enable remote DB2
subsystems to query data between one another, the DDF address space must be
activated. DDF services use VTAM or TCP/IP to establish connections and
communicate with other DB2 subsystems using either DRDA or private protocols.
DB2 also requires an additional address space to handle locking.
The IRLM, or Intersystem Resource Lock Manager, is responsible for the
management of all DB2 locks (including deadlock detection). The default name of
this address space is IRLMPROC.
Finally, DB2 uses additional address spaces to manage the execution
of stored procedures and user-defined functions. In older releases of DB2 (V4 and V5 era) these address spaces are known
as the Stored Procedure Address Spaces, or SPAS. For current DB2 releases (V8 and later), however, the z/OS Workload Manager (WLM) is used and can define multiple address spaces for
stored procedures.
So, at a high level, DB2 uses five address spaces to handle all DB2
functionality. DB2 also communicates with allied agents, like CICS, IMS/TM, and
TSO. And database services uses the VSAM Media Manager to actually read data. A
summary of the DB2 address spaces and the functionality they perform is
provided in Figure 2.
Figure 2. The DB2 address spaces.
Monday, March 03, 2014
Time to Start Planning for This Year's IDUG DB2 Tech Conference
Well, here we are in March... the cold and snow will hopefully soon be behind us... and thoughts of Spring and warm weather start to fill our minds. And that can only mean one thing - the annual North American IDUG DB2 Tech Conference will soon be upon us! So if you haven't started to plan on how to get funding to attend it is time to start!
The 2014 North American IDUG DB2 Tech Conference will be held in
Phoenix, Arizona the week of May 12 thru May 16… and if you
are a DB2 professional (and since you’re reading this I will assume that you
are) then you should be making plans to attend. As it does every year, IDUG
features all of the latest in DB2 technologies, networking opportunities and
the technical content you need to be successful. There
are over 100 technical sessions to choose from at this year’s conference!
The conference also hosts Special Interest Groups (SIGs) on nine different DB2 and data-related topics, so I’m sure you can find
something interesting and helpful to attend. And you can use the conference as a vehicle to get
your DB2 certification! All DB2 Tech Conference attendees have the opportunity
to take a Free IBM Certification Exam while
in Phoenix! Each attendee may take one exam for free and if they pass they are
eligible for a second free exam.
You can also arrive early and attend a full day session on a DB2
topic of your choice (at an additional cost). Every year IDUG provides several
in-depth Educational Seminars delivered by some of the top DB2 consultants out
there. This year you can sign up to see Bonnie Baker deliver one of her last
DB2 classes before she retires in June 2014 (we’ll miss you Bonnie)!
And don't forget the Vendor Exhibit Hall, which boasts all of the DB2 tools and
services providers that you’d ever want to see – all under one roof in one convenient
place. And the vendors always offer a lot of goodies and giveaways, ranging
from t-shirts and pens to tablets and other nice gadgets.
I'll be presenting again at IDUG this year, on the topic of Big Data for the DB2 Professional. So be sure to stop in to say "Hi" and chat about DB2, big data, or your favorite topic du jour!
The IDUG DB2 Tech Conference is the place to be to learn all about DB2 from IBMers, gold consultants, IBM champions, end users, and more. With all of this great stuff going on in Phoenix this May, why wouldn't you want to be there!?!?