Monday, December 21, 2009

Happy Holidays


Just a brief posting today to wish everyone a very happy holiday season.

I am taking some down time thru the end of the year to visit family "up North."

So, until next year, may your databases run without a glitch and here's hoping you all have an enjoyable holiday!

Wednesday, December 16, 2009

Quick Thoughts on DB2 Performance

Database performance problems are not caused by magic. Indeed, all performance problems are always caused by change. That statement flies in the face of what I normally say, which is “Almost never say always or never”… but in this case, it is true.

Think about it for a moment. If everything remains stable and unchanging in your environment, then why would performance vary? That’s right, it wouldn’t.

Something tangible must change before a performance problem can be experienced. The challenge of performance tuning is to find the source of the change, gauge its impact, and formulate a solution.

Change can take many forms, including the following:
  • Physical changes to the environment, such as a new CPU, new disk devices, or different tape drives.
  • Changes to system software, such as a new release of a product (for example, WebSphere, CICS, or even z/OS), the alteration of a product (for example, the addition of more or fewer CICS regions or an IMS SYSGEN), or a new product (for example, implementation of DFHSM). Also included is the installation of a new release or version of DB2, which can result in changes in access paths as well as utilization of new features.
  • Changes to the DB2 engine from maintenance releases and PTFs, which can change the optimizer (and sometimes introduce other new functionality).
  • Changes in system capacity. More or fewer jobs could be executing concurrently when the performance problem occurs. Or additional users may be banging away at your transactions.
  • Environmental changes, such as the implementation of client/server programs, the adoption of SOA, or other new technologies.
  • Database changes. This involves changes to any DB2 object, and ranges from adding a new column or an index to dropping and re-creating an object.
  • Changes to the application development methodology, such as usage of check constraints instead of application logic or the use of stored procedures.
  • Changes to application code, both SQL and host language code (COBOL, C, Java, etc.).

Although the majority of your performance problems are likely to be application-oriented, you must be prepared to explore any and all of these other areas when application tuning has little effect.

My advice is to be sure that you institute strict change control tracking across all areas of your IT infrastructure. That way, whenever you experience a performance problem, you will be able to track what has changed recently, along with who changed it and why. This is important because every DBA knows what the answer to the question “What changed?” will be… right?

It is always “nothing!”

And that cannot be true. Oh, it does not mean that the person answering is lying. He or she may not have changed anything. And it is not necessarily reasonable to expect an application developer to know what all could have changed…especially when what can impact DB2 performance spans so many areas of the IT infrastructure.


So do yourself… and your company a favor: be sure that you meticulously track each and every change to any aspect of your systems. Then – and this is where many shops break down – make sure that you have methods of tying all of the change information together in such a way that it can be queried and examined in the face of a performance problem.


Only then can you reasonably expect your DBAs rapidly to be able to track down and remedy DB2 performance problems… because only then will they have the pertinent information at their disposal.

Friday, December 11, 2009

A Short History of DB2 for z/OS – Part 2

Today’s blog entry is a continuation of yesterday’s post in which we began a brief review of the history of DB2 for z/OS. That post covered Versions 1 through 3; so today we pick up our historical review with Version 4.

Version 4 was a very significant milestone in the history of DB2. It was highlighted by the introduction of Type 2 indexes, which removed the need to lock index pages (or subpages, now obsolete). Prior to V4, index locking was a particularly thorny performance problem that vexed many shops. And, of course, I’d be remiss if I did not discuss data sharing, which made its debut in V4. With data sharing, DB2 achieved new heights of scalability and availability unmatched within the realm of DBMS; it afforded users the ability to upgrade without an outage and to add new subsystems to a group “on the fly.” The new capabilities did not stop with there; V4 also introduced stored procedures, CP parallelism, performance improvements, and more. DB2 V4 was, indeed, a major milestone in the history of mainframe DB2.

In June 1997 DB2 Version 5 became generally available. It was the first DB2 version to be referred to as DB2 for OS/390 (previously it was DB2 for MVS). Not as significant as V4, we see the trend of even numbered releases being bigger and more significant than odd numbered releases (of course, this is just my opinion). V5 was touted by IBM as the e-business and BI version. It included Sysplex parallelism, prepared statement caching, reoptimization, online REORG, and conformance to the SQL-92 standard.

Version 6 brings us to 1999 and the introduction of the Universal Database term to the DB2 moniker. The “official” name of the product is now DB2 Universal Database for OS/390. And the Release Guide swelled to over 600 pages! Six categories of improvements were introduced with V6 spanning:
  • Object-relational extensions and active data
  • Network computing
  • Performance and availability
  • Capacity improvements
  • Data sharing enhancements
  • User productivity
The biggest of the new features were SQLJ, inline statistics, triggers, large objects (LOBs), user-defined functions, and distinct types.

Version 6 is also somewhat unique in that there was this “thing” typically referred to as the V6 refresh. It added functionality to DB2 without there being a new release or version. The new functionality in the refresh included SAVEPOINTs, identity columns, declared temporary tables, and performance enhancements (including star join).

March 2001 brings us to DB2 Version 7, another “smaller” version of DB2. Developed and released around the time of the Year 2000 hubbub, it offered much improved utilities and some nice new SQL functionality including scrollable cursors, limited FETCH, and row expressions. Unicode support was also introduced in Db2 V7. For a more detailed overview of V7 (and the V6 refresh) consult An Introduction to DB2 for OS/390 Version 7.

DB2 Version 8 followed, but not immediately. IBM took advantage of Y2K and the general desire of shop’s to avoid change during this period to take its time and deliver the most significant and feature-laden version of DB2 ever. V8 had more new lines of code than DB2 V1R1 had total lines of code!

I don’t want to get bogged down in recent history here outlining the features and functionality of DB2 releases that should be fresh in our memory (V8 and V9). If you really want some details on those refer to these links for them:

An Overview of DB2 for z/OS Version 8


DB2 9 for z/OS Features



Which brings us to today. Most shops should be either running Version 9 in production or planning their migration from V8 to V9. And we are all waiting with baited breath for DB2 X… which hopefully should be announced sometime next year.

Thursday, December 10, 2009

A Short History of DB2 for z/OS – Part 1

Let's go back in time... almost three decades ago... back to the wild and woolly 1980s! And watch our favorite DBMS, DB2, grow up over time.

Version 1 Release 1 was announced on June 7, 1983. And it became generally available on Tuesday, April 2, 1985. I wonder if it was ready on April 1st but not released because of April Fool’s Day? Any old-time IBMer out there care to comment?

Initial DB2 development focused on the basics of making a relational DBMS work. Early releases of DB2 were viewed by many as an “information center” DBMS, not for production work like IMS.

Version 1 Release 2 was announced on February 4, 1986 and was released for general availability a month later on March 7, 1986. Wow! Can you imagine waiting only a month for a new release of DB2 these days? But that is how it happened back then. Same thing for Version 1 Release 3, which was announced on May 19, 1987 and became GA on June 26, 1987. DB2 V1R3 saw the introduction of date data types.

You might notice that IBM delivered “releases” of DB2 back in the 1980s, whereas today (and ever since V3) there have only been versions. Versions are major, whereas releases are not quite as significant as a version.

Version 2 of DB2 became a reality in 1988. Version 2 Release 1 was announced in April 1988 and delivered in September 1988. Here we start to see the gap widening again between announcement and delivery. V2R1 was a very significant release in the history of DB2. Some mark it as the bellwether for when DB2 began to be viewed as a DBMS capable of supporting mission critical, transaction processing workloads. Not only did V2R1 provide many performance enhancements but it also signaled the introduction of declarative Referential Integrity (RI) constraints. RI was important for the acceptance of DB2 because it helps to assure data integrity within the DBMS.

No sooner than V2R1 became GA than IBM announced Version 2 Release 2 on October 4, 1988. But it was not until a year later that it became generally available on September 23, 1988. DB2 V2R2 again bolstered performance in many ways. It also saw the introduction of distributed database support (private protocol) across MVS systems.

Version 2 Release 3 was announced on September 5, 1990 and became generally available on October 25, 1991. Two very significant features were added in V2R3: segmented table spaces and packages. Segmented table spaces have become the de facto standard for most DB2 data and packages made DB2 application programs easier to support. DB2 V2R3 is also the version that beefed up distributed support with Distributed Relational Database Architecture (DRDA). Remote unit of work distribution was not available in the initial GA version, but IBM came out with RUOW support for DB2 V2R3 in March 1992.

And along comes DB2 Version 3 announced in November 1993 and GA in December 1993. Now it may look like things sped up again here, but not really. This is when the QPP program for early support of DB2 started. QPP was announced in March 1993 and delivered to customers in June 1993. Still though, fairly rapid turnaround by today’s standards, right?

V3 greatly expanded the number of bufferpool options available (from 5 pools to 80). There were many advances made in DB2 V3 to take better advantage of the System 390 environment: V3 introduced support for hardware assisted compression and hiperpools. It was also V3 that introduced I/O parallelism for the first time.

We’ll stop here for today and continue our short history of DB2 in my next DB2Portal blog posting. See you soon...

Wednesday, December 02, 2009

Wordle of my DB2 Portal Blog

Wordle: DB2 Portal Wordle The "jumble" of words shown here is a Wordle, which is a "word cloud" of text. I fed my blog location into the Wordle generator and it created this pretty picture based on the words I most commonly use here in this blog.

The cloud gives greater prominence to words that appear more frequently in the source text. No surprise that "DB2" and "data" dominate the other words!

Monday, November 23, 2009

Reading Things That Aren't There... and Missing Things That Are!

You can shoot yourself in the foot using DB2 if you are not careful. There are options that you can specify that may cause you to read data that is not really in the database. And, alternately, you can set things up so that you miss reading data that is actually in the database.

How, you might be asking? Well, dirty reads will take care of the first one. Specifying ISOLATION(UR) implements read-through locks, which is sometimes referred to as a dirty read. It applies to read operations only. With this isolation level data may be read that never actually exists in the database, because the transaction can read data that has been changed by another process but is not yet committed.

Read uncommitted isolation provides the highest level availability and concurrency of the isolation levels, but the worst degree of data integrity. It should be used only when data integrity problems can be tolerated. Certain types of applications, such as those using analytical queries, estimates, and averages are likely candidates for read uncommitted locking. A dirty read can cause duplicate rows to be returned where none exist or no rows may be returned when one (or more) actually exists. When choosing read uncommitted isolation the programmer and DBA must ensure that these types of problems are acceptable for the application.

OK, so what about not reading data that is in the database? DB2 V9 provides us an option to do just that. In DB2 9 it is possible for a transaction to skip over rows that are locked. This can be accomplished by means of the SKIP LOCKED DATA option within your SQL statement(s). SKIP LOCKED DATA can be specified in SELECT, SELECT INTO, and PREPARE, as well as searched UPDATE and DELETE statements. You can also use the SKIP LOCKED DATA option with the UNLOAD utility.

When you tell DB2 to skip locked data then that data is not accessed and your program will not have it available. DB2 just skip over any locked data instead of waiting for it to be unlocked. The benefit, of course, is improved performance because you will not incur any lock wait time. But it comes at the cost of not accessing the locked data at all. This means that you should only utilize this clause when your program can tolerate skipping over some data.

The SKIP LOCKED DATA option is compatible with cursor stability (CS) isolation and read stability (RS) isolation. But it cannot be used with uncommitted read (UR) or repeatable read (RR) isolation levels. DB2 will simply ignore the SKIP LOCKED DATA clause under UR and RR isolation levels.

Additionally, SKIP LOCKED DATA works only with row locks and page locks. That means that SKIP LOCKED DATA does not apply to table, partition, LOB, XML, or table space locks. And the bigger the lock size, the more data that will be skipped when a lock is encountered. With row locking you will be skipping over locked rows, but with page locking you will be skipping over all the rows on the locked page.

Use both of these features with extreme care and make sure that you know exactly what you are telling DB2 to do. Otherwise, you might be reading more... or less than you want!

Tuesday, November 17, 2009

Replacing UNION with CASE

When a UNION is required to put together data from multiple queries, you might be able to use a CASE statement instead. This is very useful, particularly when the data for each of the queries in the UNION come from the same table. The CASE statement can potentially enhance performance by minimizing the number of times the data is read.

Let’s look at an example to clarify why:

SELECT CREATOR, NAME, 'TABLE'
FROM SYSIBM.SYSTABLES
WHERE TYPE = 'T'
UNION
SELECT CREATOR, NAME, 'VIEW '
FROM SYSIBM.SYSTABLES
WHERE TYPE = 'V'
UNION
SELECT CREATOR, NAME, 'ALIAS'
FROM SYSIBM.SYSTABLES
WHERE TYPE = 'A'
ORDER BY NAME;

This simple SQL statement uses UNION to put together the results of three queries against the SYSTABLES table. The report shows all of the DB2 table-like objects that exist in the DB2 subsystem: tables, views, and synonyms.

To do this, DB2 must scan through the table three times – once for each query (as there is no index on the TYPE column). But, you can use CASE and code an equivalent, but more efficient query, as follows:

SELECT CREATOR, NAME,
CASE TYPE
WHEN 'T' THEN 'TABLE'
WHEN 'V' THEN 'VIEW '
WHEN 'A' THEN 'ALIAS'
END
FROM SYSIBM.SYSTABLES
ORDER BY NAME;

This new query will need to scan SYSTABLES only once. The CASE statement will translate the code in the TYPE column into the text that we desire.

CASE statements are very powerful and you should use them when you can to create elegant and optimal SQL in your DB2 applications.

Tuesday, November 03, 2009

Deprecated Features Planned for DB2 X for z/OS

Everyone is always interested in the latest and greatest features of their favorite DBMS, in this case DB2. But sometimes features get removed from the DBMS when a new version is released. According to the IBM teleconference on DB2 X for z/OS today, there are several features planned to be deprecated (i.e. removed). Let's briefly take a look at them.

The first feature that will be removed is private protocol DRDA. This should come as no surprise to anybody since IBM has been indicating that private protocal distribution was on its way out for a number of releases now. And it really is not that difficult to convert to DRDA (and there will be some new help in DSNTP2DP).

Perhaps more troublesome to some shops will be the removal of support for plans containing DBRMs. You will have to convert the DBRMs to packages when you get to DB2 X. You would think that since packages have been available as far back as V2.3 that most shops would have converted to them already. But I do know that there are some shops out there" who still have DBRMs bound directly into their plans. So start planning to convert ASAP! IBM will offer some support in DB2 X to track down affected plans.

While on the topic of plans and packages, you will have to REBIND any that have not been rebound since V5 or before. But it is a good rule to REBIND everything when you go to a new version of DB2 anyway, and most (but not all) shops do that.

From a documentation perspective, BookManager will no longer be supported. Instead we get the PDF versions and the Info Center online.

And the DB2 Management Clients (DB2 Administration Server, Control Center, and Development Center) are deprecated. IBM's new management client direction for DB2 is Data Studio.

Other deprecated items include:
  • ACQUIRE(ALLOCATE) [change to ACQUIRE(USE)]
  • Workload capture through profile monitor
  • XML Extender [change to use pureXML data type]
  • DB2 MQ XML user-defined functions and stored procedures [change to XML functions]
  • msys for Setup DB2 Customization Center [change to install panels]
At this point, DB2 X is on track to become generally available at the end of next year, 2010. At this point IBM has announced that migration will be supported from DB2 9 NFM (and has hinted that they are considering supporting migration to DB2 X directly from DB2 V8 NFM).

Monday, November 02, 2009

New DB2 Twitter List

Just a very quick post this morning to let all you DB2 Twitter folks out there know that I created a list of the DB2 tweeters I know about at http://twitter.com/craigmullins/db2-folks.

If you are a DB2 professional and I left you off the list please leave a comment here or drop me an e-mail and I'll be happy to add you.

Friday, October 30, 2009

IOD2009 Day Three – Malcolm Gladwell

Today’s blog entry is a little late seeing as how this is Friday and IOD is over, but I’m writing about Wednesday morning’s keynote session highlighted by Malcolm Gladwell.

For those who do not know him, Malcolm Gladwell is a Canadian journalist and author best known as the author of the books The Tipping Point (2000), Blink (2005), and Outliers (2008). I’ve read all three of them and I highly recommend that you do, too. He also has a new book, What The Dog Saw, that I bought at the airport on the way home from IOD. I hope it is as good as the other three!

Gladwell’s books deal with the unexpected implications of research in the social sciences. He spoke about some of these during his keynote session in a very entertaining and informative way. It was especially rewarding to hear him tie his messaging into the conference them of information-led transformation… and to hear him call all of the attendees mavens (read his books to understand that term).

Gladwell began his talk by noting the irony of hosting a conference dealing with information analytics in Las Vegas, of all places. You would think that the casinos might have an interest in that topic!

The major idea conveyed by Gladwell during his talk focused on change, and how it never occurs the way you think it will. He explained how radical changes happen much more quickly than we imagine. And he used a story about how the broadcast of a major prize fight transformed radio from a niche product to a transformative one.

He also talked about reframing as being necessary to elicit major changes. Prior to broadcasting the boxing match, radio was used to deliver news and classical music. But reframing it as a product for the delivery of real-time sports coverage – reframing its use – caused major transformation. Gladwell also highlighted Apple and the iPod. The iPod was not the first portable MP3 product, but it is the most successful. Because Apple simplified the interface and promoted it simply… that is, they reframed the issue!

I also enjoyed Gladwell’s story about how he purchased a laptop at CompUSA. Upon entering the store he was confronted with tables and tables of similar looking laptops and had no way to differentiate them. He tried the sales people but as anyone who ever went to CompUSA knows, they either couldn’t be found when you needed them or knew nothing about technology. So he called up his brother, who works in IT. He is not a well-known expert, but he is Malcolm’s maven – the guy he turns to for help. And he told him which laptop to purchase. Gladwell then went back to the CompUSA and he said he stood there pointing to the laptop his brother told him about. And about a half hour later the crack CompUSA salesman came to help him.

This story highlighted the concept of the maven. He also mentioned that if the executives were watching how he confidently pointed to the machine he wanted that it might have led them to erroneously believe that CompUSA needed less experienced… and, indeed, fewer sales people. Which might have led to their demise.

All in all, Gladwell was entertaining and informative – a very powerful combination.

The other highlight of the keynote session, for me at least, was that I was mentioned by name as the most prolific Twitter-er at the IOD conference. Now does that mean I was the most helpful or the most annoying… I’ll leave that to you all to decide.

Wednesday, October 28, 2009

IOD2009 Day Two

Day two of the IBM Information on Demand conference was just as informative and exciting as day one. The day kicked off with a general session titled "A New Kind of Intelligence for a Smarter Planet." The idea presented is that the world is changing. It is becoming more instrumented, interconnected, and intelligent. Basically, as Steve Mills of IBM clarified, the ability to embed intelligence into millions of things will lead the transformation to an information led smarter planet. And that this information-led transformation will create opportunities for organizations to strategically gain control of information and create a new kind of intelligence.

Expanded intelligence begins with sensors and metering, which is doable today because price points have become reasonable now. There are 1 billion transistors per human, today. And an estimated 2 billion people will soon be on the Internet. At the same time, we are moving toward one trillion connected objects (video cameras, GPS devices, healthcare instruments, and even livestock). And there are hundreds of satellites orbiting the Earth, generating terabytes of data each and every day.

As we begin to intelligently interconnect these devices and analyze their streaming data, a transformative opportunity can results...

IBM brought up three customers to talk about how their organization were helping to transform to a smarter planet. The customers came from Cardinal Health, Statoil Hydro ASA (Norway), and the Food and Drug Administration. Highlights of the panel discussion:
  1. Security in the supply chain for food is absolutely critical and food safety is one of the most complex systems to deal with.
  2. The US imports 50pct of its food - from over 150 different countries.
  3. Every food supplier to US (domestic or foreign) must be registered with the FDA.
  4. And each supplier must complete forms as to the safety of its food. This can be difficult since suppliers range from large agri-businesses to small farms many of whom do not have a computer at all. So some records are probably kept on paper in shoe boxes.
  5. Supply chain security is very important in the health care (drugs) and oil industries too! In fact, it was discussed how each of these seemingly disparate industries face many similar challenges.
  6. Preventing problems requires understanding risk. And complexity requires collaboration in order to succeed becase nobody has enough resources to do it all alone.
  7. In some areas (such as remote parts of Norway) instrumentation is essential to even get the data because nobody wants to go there.
  8. Legacy systems often are rich sources of information (hey, that probably means mainframes!)
  9. Analytics are required for prevention of problems, but also aid in reaction to problems that actually occur too. No one prevents 100 percent of their problems.
After the panel IBM came back and wrapped it up. They mentioned how IBM was awarded the National Medal of Honor for their Blue Gene supercomputer for DNA sequencing. It was a very informative and entertaining general session.

I then attended the DB2 9.7 versus Oracle 11g Smackdown presentation. It was chock full of statistics on why IBM's DB2 is superior to Oracle in terms of cost. The presenter explained how DB2
outperforms Oracle on TPC-C benchmarks for the same test, on the same machine, at the same point in time. He cautioned folks to to read the small print details on all benchmark results... for example, if you are examining the cost of ownership double check to see whether the benchmark uses a term or full purchase license. Also, the cost of the database license depends a great deal on your ability to negotiate a discount (if the vendor will discount). And you also need to be aware of how the products are licensed. Some features are separately licensed for both DB2 and Oracle. The bottom line is that licensing can cause a more than 30 percentt swing in price performance results

But do people even believe these benchmarks any more? I don't think very many people put much stock in benchmark tests today.

The author frequently cited an independent study by ITG that compares the value proposition of DB2 9.7 versus Oracle Database 11g. You can read the study yourself at this link.

(Note to my regular z/OS readers: the previous discussion was all about DB2 LUW and not DB2 for z/OS).

I also got to attend a special briefing for bloggers on IBM's new stream computing solution, which I blogged about on my Data Management Today blog if you are interested.

And finally, I'll plug my Twitter feed again. You can follow my tweets at IOD this week (well, thru Wednesday) by following me on Twitter at http://www.twitter.com/craigmullins.