Fifty Two Years Ago This Week
On April 7, 1964, IBM announced "a new generation of electronic computing equipment" called the IBM System/360... or as it, and subsequent generations of the machine have become known, the mainframe!
If you'd like to take a walk down memory lane you can read the original press release on the IBM web site.
The System/360, unquestionably, was one of the most significant products in the history of computing. The general, sound concepts of the System/360 are still the at the foundation of modern mainframe. Sure, a lot has changed, but those changes were made to a solid, fundamentally sound base.
Code that ran on the System/360 can still be run on the latest and greatest modern IBM mainframe, the z13. What other platform can say that? I mean, when my desktop was recently force upgraded from Windows 7 to WIndows 10 some of my software stopped running!
So let's take a moment and salute the mainframe on its 52nd birthday... and wish it many more years of productive use.
Happy birthday, mainframe!
Friday, April 08, 2016
Monday, April 04, 2016
The Most Misunderstood Features of DB2 – Part 7: It Depends!
"It depends" is probably the most famous phrase out there when it comes to DB2 performance. Some call it the cardinal rule. You can answer just about any question that anybody asks at any time with it. Try it out. It depends! Always works... that is, it always works if you are trying to avoid answering the question!
Most DBAs and SQL experts resist
giving a straight or simple answer to a general question because there is no
simple and standard implementation that exists. Every situation is different,
and every organization is unique in some
way. So answering "it depends" to most questions can make a lot of sense. But "it depends" should never be the end of the answser!
Don’t be discouraged when you ask
the local expert which statement will perform better, and the answer is “It
depends.” The expert is just doing his or her job. The secret to optimizing DB2
performance is being able to answer the follow-up question to “It depends”—and
that is “What
does it depend on?”
The key
to effective SQL performance tuning is to document each SQL change along with
the reason for the change. Follow up by monitoring the effectiveness of every
change to your SQL statements before moving them into a production environment.
Over time, trends will emerge that will help to clarify which types of SQL
formulations perform best.
So the misunderstanding in this case is thinking that "it depends" is a complete answer to any type of question. It isn't... it is just the beginning of most answers. But it takes time, experience, and study to be able to answer what it depends upon. And that is why you might not get that answer unless you press for it... and attempt to find the answer yourself without just leaning on others for the answer all the time.
There is a corollary to the “It depends”
rule that also is important. Hard and fast rules that rigidly enforce or forbid
usage are not a good idea. In other words, this corollary is simply stated:
“Almost never say always or never.” Notice that even the wording of this
corollary embraces flexibility.
So be flexible, but embrace answers that help... because "it depends" helps nobody unless you tell them what it depends upon!
Thursday, March 17, 2016
Digital Transformation and DB2 for z/OS: It’s Not Your Daddy’s DB2!
If you are a DBA who
has been using DB2 for z/OS for a while you should have noticed that we are not
doing things the same way we used to. DB2 is changing and we should be changing
with it. If you are still using DB2 the same way you did 10 or 20 years ago,
then you are definitely not adhering to industry best practices!
The same trends that
are driving the digital explosion are also changing DB2 and the traditional
role of the DBA. We are storing more data and different types of data for
longer periods of time and in different ways than we have in the past.
And DB2 for z/OS keeps
changing to adopt and embrace modern data management requirements and
techniques. Whether it is modernizing storage with universal table spaces,
embracing unstructured data in LOBs, or expanding the SQL language with new and
more functionality, today’s DB2 looks a lot different than it did yesterday.
Indeed, it is different – it is not your daddy’s DB2.
I’ve been writing a
series of blog posts for BMC about this topic under the title It’s Not Your
Daddy’s DB2! You can find the first
three blog posts in this series here: 1 2 3
But you can also
attend a live webinar that BMC is sponsoring where I will talk about these
issues. You can learn about:
·
Trends that influence
the size and complexity of your DB2 environment and how this impacts data
management
·
How to adapt to new
DB2 data types and structures
·
Best practices and
technologies for managing DB2 in the digital age
·
And BMC will share its
next generation technology for managing the new world of DB2 for z/OS.
Learn how digital
transformation will change the way your DBAs manage critical business needs.
Attend this webinar on March 30, 2016, at 12:00 pm CT.
Tuesday, March 15, 2016
The Most Misunderstood Features of DB2 – Part 6: Not Indexing
Welcome to our
on-going series of misunderstood issues in the world of DB2 for z/OS. Today’s
topic is indexing, or to be more precise: not indexing. What do I mean by that?
Well, I’ve heard folks
say that you should always create at least one index on every DB2 table. And
while I can empathize with that general notion, I can’t agree completely
because of that word “always,” which almost always makes a sentence wrong.
Sure, there are
several good arguments for creating indexes on DB2 tables. Perhaps the most
important one is clustering. DB2 uses an index to control how data is clustered
on disk. Only one index can be specified as the clustering index (because, of
course, the data on disk can only be stored in one order, right!). If you do
not specify a clustering index then DB2 will use the earliest created (oldest)
index to cluster the data. So it is usually a good idea to create an index to
guide clustering.
Another popular reason
to create an index on a DB2 table is to enforce uniqueness for a UNIQUE
constraint or PRIMARY KEY. The only way to enforce uniqueness on a DB2 column
(or set of columns) is by using a unique index.
Of course, there are a
lot of other good reasons to create indexes, most of them to improve query
performance. But I contend that there are situations when it makes sense not to
create any indexes at all.
So when does it make more sense not to build an index for a DB2 table?
Let's
start by saying that most of the time you will want to build at least one - and
probably multiple - indexes on each table that you create. Indexes are crucial
for optimizing performance of SQL access. Without an index, queries must scan
every row of the table to come up with a result. And that can be very slow.
Having
said that, here are a few situations some times where it can make sense to have
no indexes defined on a table:
When all (or most) accesses retrieve every row
of the table. Because every row will be retrieved every time you want to use
the table, an index (if used) would just add extra I/O and would diminish, not
enhance performance. Though not extremely common, you may indeed come across
such tables in your organization.
For a very small table with only a few pages of data and no
primary key or uniqueness requirements. A very small table (perhaps 20 to 30 or
so pages) might not need an index because simply reading all of the pages is
very efficient already.
When performance doesn't matter and the table is only accessed
very infrequently. But, hey, when do you ever
have that type of requirement in the real world?
Other
than for these circumstances, you will most likely want to build one or more
indexes on each table, not only to optimize performance, but also to ensure
uniqueness, to support referential integrity, and to drive data clustering.
Of
course, indexes do not come without cost. Indexes take up disk space and adding
a lot of indexes will consume disk space. An additional consideration is their
impact in data modification. Although indexes speed up queries they degrade
inserts and deletes, as well as any modification to indexed columns.
What do you think? Are there other situations where a table should have no indexes? Are there any pertinent high-level issues I missed? Feel free to add your thoughts and comments below!
What do you think? Are there other situations where a table should have no indexes? Are there any pertinent high-level issues I missed? Feel free to add your thoughts and comments below!
Friday, March 04, 2016
The Most Misunderstood Features of DB2 – Part 5: Choosing the Clustering Key
Hello everybody, and welcome
to part 5 in my on-going series where we take a look at The Most Misunderstood
Features of DB2. You can find the earlier installments in this series here
(Part 1 Part 2 Part 3 Part 4).
Today’s topic, on choosing an appropriate clustering index, might be a bit confusing to some of you. So let’s start at
the beginning and describe what clustering is.
What is Clustering?
In DB2 for z/OS, you use an
index to indicate how DB2 should try to store table space data physically on
disk. This is called clustering. A DB2 index is a clustering index if the CLUSTER keyword is specified when the
index is created. Clustering causes inserted rows to be stored contiguously in
sequence whenever possible. Additionally, when the table space is reorganized
the data will be sequenced according to the clustering index. Since there can
only be one physical sequence for data on disk, there can only be one
clustering index per table.
For tables that do not specify
APPEND YES and do not have hashing
defined, if you do not specify a clustering index, DB2 chooses to cluster the
data using the oldest existing index. Therefore, it is wise to explicitly
specify a clustering index instead of letting DB2 decide because you can almost
always choose better than the (basically random) choice DB2 makes.
Clustering of data is
important because when the data is sequentially accessed, if the data exists on
the same page (or range of pages) then I/O is minimized because fewer pages
are read than if the data were spread throughout the entire table space on
“random” pages. And fewer I/Os means improved performance.
So then, how should a DBA go
about deciding how to cluster the data? Well, the best answer is to analyze the
various ways in which the data is to be accessed. This means reviewing the
actual SQL along with frequency and importance of the execution patterns. When
the most important/frequently executed sequential accesses are identified, then
you can choose to cluster the data according to that information. By following this approach then the most frequent and/or important sequential accesses will be optimized by clustering.
Of course, frequently the
decision on how to cluster is made during the database design phase when few,
if any, SQL statements or access patterns are readily available. So the DBA
sometimes makes a knee-jerk decision to just cluster based on the primary key (PK)
and be done with it. This is usually not a good idea.
Why is Clustering by PK not a very good idea?
Think about what we just said
about sequentially accessing data. How often is data sequentially accessed by PK?
Sometimes this might be the case, maybe if you are producing a report in order
by PK, but even that example is not a very good one. Think about it. If you are
producing a customer report you probably do not create it in CUSTNO order but
probably in order by customer name (or something more interesting to humans
than a key number).
So the best option is to
figure out the most common and important sequential access patterns and cluster
accordingly. But what if this information is not available? My assertion is
that clustering by Foreign Key (FK), instead of by PK, makes a lot more sense.
Usually a FK is on the many side of a one-to-many relationship. Think about
what that means. When you join PK to FK there will be one PK row being joined
to multiple FK rows. If those multiple FK rows are all on the same page (or on
contiguous pages) due to clustering, then we can impact I/O.
The Bottom Line
The bottom line here is that
you should think about (and even re-think) your general assumptions and why you
make any database design decision. Sometimes your immediate gut reaction won’t
be the right one. And it is always better to think with our heads instead of
our guts, right?
Subscribe to:
Posts (Atom)