Thursday, May 13, 2010

IDUG NA 2010, Days Two and Three

I’ve been running around kinda busy the past couple of days here at IDUG in Tampa, so I got a bit behind in blogging about the conference. So, today I’m combining two days of thoughts into one blog post.

(For a summary of IDUG Day One, click here.)

I started off day two by attending Brent Gross’ presentation on extracting the most value from .NET and ODBC applications. Brent discussed some of the things to be aware of when developing with .NET, an important “thing” being awareness that .NET is designed to work in a disconnected data architecture. So applications will not go through data a row at a time but instead send the data to the application and let it process it there. As an old mainframe DBA that caused alarm bells to ring.

I also got the opportunity to hear Dave Beulke discuss Java DB2 developer performance best practices. Dave delivered a lot of quality information, including the importance of developing quality code because Java developers reuse code – and you don’t want bad code being reused everywhere, right?

Dave started out mentioning how Java programmer are usually very young and do not have a lot of database experience. So DBAs need to get some Java knowledge and work closely with Java developers to ensure proper development. He also emphasized the importance of understanding the object to relational mapping method.

From a performance perspective Dave noted the importance of understanding the distributed calls (how many, where located, and bandwidth issues), controlling commit scope, and making sure your servers have sufficient memory. He also indicated that it is important to be able to track how many times Java programs connect to the database. He suggested using a server connection pool and to be sure that threads are always timed out after a certain period of time.

And I’d be remiss if I didn’t note that Dave promoted the use of pureQuery, which can be used to turn dynamic JDBC into static requests. Using pureQuery can improve performance (perhaps as much as 25 percent), as well as simplifying debugging & maintenance.

Dave also discussed how Hibernate can cause performance problems. Which brings me to the first session I attended on day three, John Mallonee’s session titled Wake Up to Hibernate. Hibernate is a persistent layer that maps Java objects to relational tables. It provides an abstraction layer between DB2 and your program. And it can also be thought of as a code generator. Hibernate plugs into popular IDEs, such as Eclipse and Rational tools. It is open source, and part of JBoss Enterprise Middleware (JBoss is a division of Red Hat).

John walked attendees through Hibernate, discussing the Java API for persistence, its query capabilities (including HQL, or Hibernate Query Language), and configuration issues. Examples of things that are configurable include JDBC driver, connection URL, user name, DataSource, connection pool settings, SQL controls (logging, log formatting), and the mapping file location.

HQL abstracts SQL. It is supposed to simplify query coding, but from what I saw of it in the session, I am dubious. John warned, too, that when HQL is turned into SQL the SQL won’t necessarily look the way you are used to seeing it. He recommended to setup the configuration file such that it formats the generated SQL or it won’t be very readable. John noted that one good thing about HQL is that you cannot easily write code with literals in them; it forces you to use parameter markers.

OK, so why can Hibernate be problematic? John talked about four primary concerns:

  1. SQL is obscured
  2. performance can be bad with generated code
  3. Hibernate does not immediately support new DB2 features
  4. Learning curve can be high

But he also noted that as you learn more about these problems -- and how Hibernate works -- that things tend to improve. Finally (at least with regard to Hibernate) John recommends that you should consider using HQL for simple queries, native SQL for advanced queries, for special situations use JDBC, and to achieve the highest performance use native DB2 SQL (e.g. stored procedure).

I also attended two presentations on the DB2 for z/OS optimizer. Terry Purcell gave his usual standout performance on optimization techniques. I particularly enjoyed his advice on what to say when someone asks why the optimizer chose a particular path: “Because it thinks that is the lowest cost access path.” After all, the DB2 optimizer is a cost-based optimizer. So if it didn’t choose the “best” path then chances are you need to provide the optimizer with better statistics.

And Suresh Sane did a nice job in his presentation in discussing the optimization process and walking thru several case studies.

All-in-all, it has been a very productive IDUG conference… but then again, I didn’t expect it to be anything else! Tomorrow morning I deliver my presentation titled “The Return of the DB2 Top Ten Lists.” Many of you have seen my original DB2 top ten lists presentation, but this one is a brand new selection of top ten lists… and I’m looking forward to delivering it for the first time at IDUG…

Wednesday, May 12, 2010

IDUG Tampa 2010, Day One

As usual, the North American IDUG conference is proving to be a hectic, yet enjoyable and informative time. The days are packed from morning til evening with technical sessions, networking, and running from here to there and back again.

Tuesday was the first day for normal IDUG sessions (the day-long seminars were moved to Monday this year), and the day was dominated (for me at least) by DB2 10 sessions. The spotlight session by Jeff Josten was an information-packed 90 minutes overview of DB2 10 that can only be described as drinking from a firehose. Myself and about 200 other curious attendees sat in attention as Jeff discussed the features that back up the themes of Versionn 10, which are efficiency, resilience, and growing new workloads on DB2 for z/OS.

Jeff didn’t share a GA date for the new version, nor would anyone else from IBM this week, but it has been strongly hinted that it could be before the end of the year (2010).

The biggest “thing” being touted by IBM about DB2 10 is the performance gains it delivers right out-of-the-box. Jeff discussed IBM’s performance objectives as historically being to deliver less than a 5% performance regression from release to release. But things have perked up recently. For DB2 9, most customers reported no regression or gain out of box. And the new goal is no longer containing regression, but delivering gain. For DB2 10, the expectation is that many customers will reduce CPU time 10% to 20% right out-of-the-box.

In IBM’s labs, Jeff indicated that the out-of-the-box CPU reduction numbers for traditional workloads are ranging from 5-10% and for newer workloads (e.g. TCP/IP, stored procedures) the improvement is as much as 20% in lab measurements. And when you start using new functionality, you can reasonably expect to see up to 10% CPU reduction. Of course, Jeff was careful to note that these are pre-GA numbers so things could change, even though there is no expectation that they will change.

Additionally, there is a lot of focus on scalability in DB2 10. Shops can expect to support 5x to 10x more concurrent users, up to 20,000 per subsystem. This is possible due to virtual storage relief: threads have been moved above the bar.

Jeff went on to cover a lot of additional new functionality to be delivered with DB2 10 including parellel index update during INSERT (which should speed up inserts against tables with multiple indexes), DB2’s usage of 1MB page size (z/OS) in buffer pools, multiple SQL access path and performance improvements, efficient caching of dynamic SQL with literals, LOB streaming between DDF and rest of DB2, Workfile spanned records (PBG), INSERT improvements for UTS, solid state disk monitoring and exploitation, temporal data support, timestamp data type improvements, and more.

Hash support is particularly interesting. With hashing you can get direct access to data with a single getpage instead of the multi-getpage approach of b-tree indexing. The targeted use case for hashes is for lookup of a row based upon primary key. The hashing algorithm is stored in the DB2 engine. Never fear, though, because you can still define additional indexes on hashed tables and the optimizer will understand and prefer hashed access when it is possible. (I hear the IMS DBAs out there laughing. DB2 DBAs are now going to need to understand space calculations for hash space and what collisions and overflow means.)

Next up was Roger Miller who covered DB2 10 from a database administration perspective. He began his session by referencing the extra detail that is available in the DB2 10 webcast presentation that Roger did about last month, which is available on the web.

Roger states that a lot of what is at the heart of DB2 10 is about making things easier for DBAs. And then to prove his point he talked for an hour about all of those things. Highlights included the reduced need for REORG, monitoring enhancements, hashing, and pureXML enhancements for usability, scalability, and performance.

A particularly interesting point made by Roger is that query parallelism these days is less about decreasing elapsed time and more about the ability to shuttle workload to a zIIP.

Roger also discussed the ability to skip V9 and go directly from V8 to V10. He also expressed concern that folks who choose to do this not ignore learning all about V9 when they do this. For example, RUNSTATS in V9 had key changes, so shops need to be careful to run RUNSTATS when moving to V10.

Roger also spoke about the significant changes to the DB2 Catalog and DB2 Directory in DB2 10. There are about 60 new table spaces, the links have been removed, inline LOBs are used in many places, and row level locking is used. These changes mean that online REORG works for everything in the catalog & the directory.

He also spoke about the various improvements to security administration in DB2 10. There is a new SECADM authority with no access to data and there is also a new option for DBADM without data access. Another nice new option is DBADM authority for every database in the subsystem. And then there is the ability to REVOKE without cascading, something that DB2 security administrators have been looking for for years!

Changing pace, I attended Billy Sundarrajan’s presentation on “De-mystifying JDBC Universal Drivers – for the z/OS DBA.” The reality is that more and more dynamic SQL applications are being implemented, so knowing about JDBC drivers is a necessity, not a luxury for the mainframe DBA.

Billy discussed the types of JDBC drivers and the installation issues involved. You can connect using a type 2 or type 4 driver. The Type 2 driver connects directly without DB2 Connect gateway; Type 4 driver connects thru DB2 Connect gateway.

He also discussed the benefits of setting end user variables for monitoring and the different properties that can be used for configuration.

Of course, I attended a few other sessions and spent some time at the exhibit hall and caught up with some old friends and… well, this is long enough of a post for the first day… check back tomorrow for a shorter (I promise) synopsis of day two.

Sunday, May 09, 2010

IDUG in Tampa

It is Sunday, May 9, 2010 and I'm posting a brief blog entry today to remind everyone about IDUG in Tampa this week. I will be attending (arrive Monday morning) and I will update my blog with the highlights of what is happening in Tampa this week... so be sure to check in regularly.

Friday, April 30, 2010

On Becoming a DBA

Perhaps the most frequent question I am asked is: How can I become a DBA?

The answer, of course, depends a lot on what you are currently doing. Programmers who have developed applications using a database system are usually best-suited to becoming a DBA. They already know some of the trials and tribulations that can occur when accessing a database.

If you are a programmer and you want to become a DBA, you should ask yourself some hard questions before you pursue that path. First of all, are you willing to work additional, sometimes crazy, hours? Yes, I know that many programmers work more than 40 hours already, but the requirements of the DBA job can push people to their limits. It is not uncommon for DBAs to work late into the evening and on weekends; and you better be ready to handle technical calls at 2:00 a.m. when database applications fail.

Additionally, you need to ask yourself if you are insatiably curious. A good DBA must become a jack-of-all-trades. DBAs are expected to know everything about everything -- at least in terms of how it works with databases. From technical and business jargon to the latest management and technology fads, the DBA is expected to be "in the know." And do not expect any private time: A DBA must be prepared for interruptions at any time to answer any type of question -- and not just about databases, either.

And how are your people skills? The DBA, often respected as a database guru, is just as frequently criticized as a curmudgeon with vast technical knowledge but limited people skills. Just about every database programmer has his or her favorite DBA story. You know, those anecdotes that begin with "I had a problem..." and end with "and then he told me to stop bothering him and read the manual." DBAs simply do not have a "warm and fuzzy" image. However, this perception probably has more to do with the nature and scope of the job than with anything else. The DBMS spans the enterprise, effectively placing the DBA on call for the applications of the entire organization. As such, you will interact with many different people and take on many different roles. To be successful, you will need an easy-going and somewhat amiable manner.

Finally, you should ask yourself how adaptable you are. A day in the life of a DBA is usually quite hectic. The DBA maintains production and test environments, monitors active application development projects, attends strategy and design meetings, selects and evaluates new products and connects legacy systems to the Web. And, of course: Joe in Accounting just resubmitted that query from hell that's bringing the system to a halt. Can you do something about that? All of this can occur within a single workday. You must be able to embrace the chaos to succeed as a DBA.

Of course, you need to be organized and capable of succinct planning, too. Being able to plan for changes and implement new functionality is a key component of database administration. And although this may seem to clash with the need to be flexible and adaptable, it doesn't really. Not once you get used to it.

So, if you want to become a DBA you should already have some experience with the DBMS, be willing to work long and crazy hours, have excellent communication and people skills, be adaptable and excel at organization. If that sounds like fun, you'll probably make a good DBA.

Thursday, April 22, 2010

The Ever-Changing Role of the DBA

Defining the job of DBA is getting to be increasingly difficult. Oh, most people know the rudimentary aspects of the job, namely keeping your organization's databases and applications running up to par. The DBA has to be the resident DBMS expert (whether that is DB2, Oracle or SQL Server, or most likely a combination of those). He or she has to be able to solve thorny performance problems, ensure backups are taken, recover and restore data when problems occur, make operational changes to database structures and, really, be able to tackle any issue that arises that is data-related.

The technical duties of the DBA are numerous. These duties span the realm of IT disciplines from database design to physical implementation and consistent, on-going monitoring of the database environment.

DBAs must possess the abilities to create, interpret, and communicate a logical data model and to create an efficient physical database design from a logical data model and application specifications. There are many subtle nuances involved that make these tasks more difficult than they sound. And this is only the very beginning. DBAs also need to be able to collect, store, manage, and query data about the data (metadata) in the database and disseminate it to developers that need the information to create effective application systems. This may involve repository management and administration duties, too.

After a physical database has been created from the data model, the DBA must be able to manage that database once it has been implemented. One major aspect of this management involves performance management. A proactive database monitoring approach is essential to ensure efficient database access. The DBA must be able to utilize the monitoring environment, interpret its statistics, and make changes to data structures, SQL, application logic, and the DBMS subsystem to optimize performance. And systems are not static, they can change quite dramatically over time. So the DBA must be able to predict growth based on application and data usage patterns and implement the necessary database changes to accommodate the growth.

And performance management is not just managing the DBMS and the system. The DBA must understand SQL used to access relational databases. Furthermore, the DBA must be able to review SQL and host language programs and to recommend changes for optimization. As databases are implemented with triggers, stored procedures, and user-defined functions, the DBA must be able to design, debug, implement, and maintain these code-based database objects as well.

Additionally, data in the database must be protected from hardware, software, system, and human failures. The ability to implement an appropriate database backup and recovery strategy based on data volatility and application availability requirements is required of DBAs. Backup and recovery is only a portion of the data protection story, though. DBAs must be able to design a database so that only accurate and appropriate data is entered and maintained - this involves creating and managing database constraints in the form of check constraints, rules, triggers, unique constraints, and referential integrity.

DBAs also are required to implement rigorous security schemes for production and test databases to ensure that only authorized users have access to data. As industry and governmental regulations multiply, the need to audit who did what to which data when is also a requirement for sensitive data in production systems – and the DBA must be involved in ensuring data auditability without impacting availability or performance.

And there is more! The DBA must possess knowledge of the rules of relational database management and the implementation of many different DBMS products. Also important is the ability to accurately communicate them to others. This is not a trivial task since each DBMS is different than the other and many organizations have multiple DBMS products (e.g., DB2, Oracle, SQL Server).

Remember, too, that the database does not exist in a vacuum. It must interact with other components of the IT infrastructure. As such, the DBA must be able to integrate database administration requirements and tasks with general systems management requirements and tasks such as network management, production control and scheduling, and problem resolution, to name just a few systems management disciplines. The capabilities of the DBA must extend to the applications that use databases, too. This is particularly important for complex ERP systems that interface differently with the DBMS. The DBA must be able to understand the requirements of the application users and to administer their databases to avoid interruption of business. This includes understanding how any ERP packages impact the business and how the databases used by those packages differ from traditional relational databases.

But Things Are Changing

So at a high level, DBAs are tasked with managing and assuring the integrity and efficiency of database systems. But keep in mind, too, that there are actually many different DBAs. Some focus on logical design; others focus on physical design; some DBAs specialize in building systems and others specialize in maintaining and tuning systems; and there are specialty DBAs and general-purpose DBAs. Truly, the job of DBA encompasses many roles.

Some organizations choose to split DBA responsibilities into separate jobs. Of course, this occurs most frequently in larger organizations, because smaller organizations often cannot afford the luxury of having multiple, specialty DBAs.

Still other companies simply hire DBAs to perform all of the tasks required to design, create, document, tune, and maintain the organization’s data, databases, and database management systems.

But no matter what "type" of DBA you happen to be, chances are that your role is changing and adapting to new types of computing and data requirements. Indeed, one of the biggest challenges for DBAs these days is the ongoing redefinition of the job roles and responsibilities.

The primary role of database "custodian," of course, continues to be the main emphasis of the job. But that is no longer sufficient for most organizations. The DBA is expected to take on numerous additional -- mostly technical -- roles. These can include writing application code, managing the application server, enterprise application integration, managing Web services, network administration and more.

If you compare the job description of DBAs across several organizations, it is likely that no two of them would match exactly. This is both good and bad. It is good because it continually challenges the technically-minded employees who tend to become DBAs. But it can be bad, too; because the job differs so much from company to company, it becomes more difficult to replace a DBA who leaves or retires. And no one can deny that database administration is a full-time, stressful job all on its own. But the stress level just keeps increasing as additional duties get tacked onto the DBA's "to do" list.

Summary

There are many jobs that DBAs perform and it can be confusing when you try to match job title up against the responsibilities of the job. Don't let your job title keep you from expanding into other, related disciplines. The more you know and the more you can do, the more employable you become... and that is important in this day and age!