Showing posts with label partitioned table space. Show all posts
Showing posts with label partitioned table space. Show all posts

Tuesday, September 08, 2020

Know Your Db2 Universal Table Spaces

 One of the biggest changes in the last decade or so has been the introduction of new types of table spaces – known as Universal table spaces, or UTS. Not only are UTS new to Db2, they are quickly becoming the de facto standard type of table space for Db2 applications, new and old.

At some point, Universal table spaces will displace your existing segmented and classic partitioned table spaces. We’ll examine why this is so later in the post, but first let’s briefly describe what Universal table spaces are.

Two Types of Universal Table Spaces

Introduced in Db2 9 for z/OS, Universal table spaces combine the best attributes of partitioned and segmented table spaces. If you do not know what partitioned and segmented table spaces are, I refer you to this older article I wrote on DB2 Table Space Options to bring you up to speed (note that this article is almost 20 years old at this point).

Universal table spaces offer improved space management for variable length rows because they use space map pages (like segmented table spaces). Also, like segmented table spaces, UTS deliver improved mass delete performance, and you can immediately reuse the table segments after the mass delete. And like partitioned table spaces, Universal table spaces can grow large (up to 128TB of data) and consist of multiple partitions.

At a high-level, there are two types of Universal table spaces:

1.     Partition-by-growth (PBG): The PBG UTS creates new partitions as the amount of data grows without the need to specify key ranges. This type of UTS is beneficial for tables that grow over time and need the additional limits afforded by partitioning but can benefit from the performance of segmented.

2.     Partition-by-range (PBR): The range-partitioned, or PBR UTS requires a key range for partitioning like classic partitioned table spaces. A PBR UTS basically adds segmentation to the existing partitioned table space.

 

Both types of UTS can contain only a single table, but IBM presentations have indicated that this is likely to change at some point in the future (although nothing has been announced or confirmed for certain).

A partition-by-range UTS is basically a segmented, partitioned table space. The limit key ranges must be specified in the table DDL. Index partitioning, which was supported for the earliest classic partitioned table spaces, is not supported for a PBR UTS. So before converting your classic partitioned table spaces to PBR UTS, you must first convert from index-controlled partitioning to table-controlled partitioning. Check out this blog post for a trick to quickly convert to table-controlled partitioning.

The second type of UTS is the partition-by-growth Universal table space. As its name implies, a PBG UTS can automatically add a new partition as the data in the table space grows. Over time, as the UTS is used by applications, data gets added to the table. When the PBG UTS reaches its maximum size, a new partition is automatically added to the table space. The new partition uses the same characteristics as the existing partitions, including compression details, free space, and so on.

You control the type of UTS using the DDL keywords: NUMPARTS, MAXPARTITIONS, and SEGSIZE. To create a PBR UTS you specify both NUMPARTS and SEGSIZE. To get a PBG UTS you must code the MAXPARTITIONS and SEGSIZE parameters. MAXPARTITIONS indicates the limit on the total number of partitions that a PBG UTS can grow to. Be careful, because if you only code the NUMPARTS parameter without SEGSIZE, then you will create a traditional partitioned table space. If you only code the SEGSIZE parameter (without either NUMPARTS or MAXPARTITIONS) you will create a traditional segmented table space.

Db2 12 for z/OS

A significant new feature for supporting big data was introduced in Db2 12, relative page numbering (or RPN) for range-partitioned table spaces. An RPN range-partitioned table space can be created, or an existing range-partitioned table space can be changed to RPN via an ALTER TABLESPACE with PAGENUM RELATIVE, followed by an online REORG of the entire table space.

An RPN table space delivers many benefits for availability and storing large amounts of data. This requires an expanded RID, which increases from 5 bytes to 7 bytes.

From an availability perspective, you can specify DSSIZE at the partition level for RPN table spaces. Furthermore, the allowable DSSIZE value is no longer dependent on the page size and number of table space partitions. The DSSIZE change can be an immediate change (no online REORG required to take effect) as long as the change does not decrease the DSSIZE value. You still can decrease DSSIZE, but only at the table space level.

From a big data perspective, the DSSIZE can grow up to 1 TB for a partition. And the maximum table size increases to 4 PB with approximately 280 trillion rows per table. That is a lot of data that can be stored. Think about it this way: if you were to insert 1000 rows per second it would take more than 8000 years to fill the table to capacity!

Why Are Universal Table Spaces the Future of Db2?

As of today (September 2020, Db2 12 for z/OS), there are basically five types of table spaces from which to choose:

1.     Segmented table spaces

2.     Universal Partition-By-Growth (PBG) table spaces

3.     Universal Partition-By-Range (PBR) table spaces

4.     Universal Partition-By-Range Relative Page Number table spaces

5.     Classic partitioned table space

Of course, for new databases, it is best to remove the classic partitioned table space from consideration because the PBR UTS is more efficient (and classic partitioning will likely be deprecated at some point). Technically speaking, there are actually two other types of table spaces (LOB and XML table spaces), but they are not general-purpose table spaces and can be used only in specific situations (with LOB and XML data).

So why do I advise that you favor Universal table spaces over segmented whenever you can? Well, for a number of reasons. First of all, because Universal table spaces are newer and all you really need for most every Db2 implementation. Secondly, because many new features of Db2 can only be used with Universal table spaces. Newer features that only work with UTS include:

·     Clone tables

·     Hash-organized tables

·     Currently committed locking

·     Pending DDL

·     Inline LOBs

·     XML multi-versioning

·     ALTER TABLE with DROP COLUMN

And this trend is likely to continue. As IBM introduces new function levels and versions of Db2 with new features that only work with UTS, it will become increasingly difficult for DBAs to keep track of which table spaces are not UTS so that they can make sure they are not using any new features that will not work with their old types of table spaces.

What this means is that other than Universal table spaces, the only other type you should be using are segmented table spaces and then only when you absolutely must have a multi-table table space. Therefore, the best practice I recommend is to make all new table spaces Universal (except for multi-table table spaces which can be segmented).

So, what is the future of the segmented table space? For the immediate future, segmented table spaces will continue to be supported. My guess is that, at some point, IBM will deliver a multi-table UTS capability, and then at some point deprecate segmented table spaces. But this is only my guess. As of the date I am writing this, IBM has not committed to a multi-table UTS and the segmented table space is still the recommended (and only) method for assigning multiple tables into a single table space.

My general recommendation though is that you avoid multi-table table spaces unless you have many very small tables and are close to reaching the open data set limit (200,000). Of course, your limit may be lower depending on the setting of the DSMAX subsystem parameter, which specifies the maximum number data sets that can be open at one time. Acceptable values range from 1 to 200,000; a lower setting may be specified due to operating system contraints or storage/memory limitations.

My general recommendation for table spaces is to slowly work on a conversion project to migrate your classic partitioned table spaces to PBR UTS and your segmented table spaces to PBG UTS. Doing so will bring you to the latest and greatest Db2 table space technology and position you to be able to use all new functionality in current and future versions of Db2 whenever – and wherever – you see fit.

Summary

To make sure that your systems are up-to-date and ready for new functionality it makes sense to adopt Universal table spaces for all of your Db2 tables. The only exception is for multi-table segmented table spaces, and you shouldn’t have too many of them.

Good luck Universalizing your Db2 databases!

Monday, March 20, 2017

The DB2 12 for z/OS Blog Series – Part 7: Relative Page Number Table Spaces

One of the most significant new features for supporting big data in a DB2 12 environment is relative page numbering (or RPN) for range-partitioned table spaces. You can either create a new RPN range-partitioned table space, or an existing range-partitioned table space can be changed to RPN via an ALTER TABLESPACE with PAGENUM RELATIVE, followed by an online REORG of the entire table space.

But why would you want RPN table spaces instead of the already-existing table spaces types in DB2? The simple answer is the ability to grow the amount of data you store. RPN table spaces enable you to store large amounts of data. The DSSIZE can grow up to 1 TB for a partition. And the maximum table size increases to 4 PB with up to 256 trillion rows per table. That is a lot of data that can be stored! Think about it this way: if you were to insert 1000 rows per second it would take more than 8000 years to fill the table to capacity!

With RPN table spaces you get the ability to create larger partition sizes. The maximum partition size is now 1 TB (it used to be 256 GB). So if you are reaching the capacity size of existing table space options, RPN table spaces will allow you to store a LOT more data. Of course, this requires an expanded RID, which increases from 5 bytes to 7 bytes. So that will impact the DDL for the mapping table for your online REORG utilities.

But size is not the only issue. RPN table spaces improve availability, too. You can specify DSSIZE at the partition level for RPN table spaces. So each partition can have its own, different DSSIZE specification. Furthermore, the allowable DSSIZE value is no longer dependent on the page size and number of table space partitions. The DSSIZE change can be an immediate change (no online REORG required to take effect) as long as the change does not decrease the DSSIZE value.

As you move your DB2 subsystems to Version 12, consider migrating your larger range-partitioned table spaces to RPN to take advantage of these new capabilities.


Friday, July 08, 2016

Good Old Partitioned Table Space Confusion

Partitioned table spaces have been with us in the world of DB2 forever, but even as we embark on moving to Universal table spaces, there still exists some confusion "out there" in terms of the behavior of partitioning... 

For example, consider this question:

If a table is in a partitioned table space, say four partitions, for example, then to process the table in batch can I run four instances of the batch program in parallel, one against each partition?

This type of question exhibits some confusion about how partitioned table spaces work. First of all, let's assume that we are talking about classic partitioned table spaces and/or Universal range-partitioned table spaces. 

Given that assumption, yes, you can run four instances of a batch program in parallel if you so desire. Of course, that is not necessary to get DB2 use parallel tasks to read the data. The best approach is to BIND the program (package) specifying the DEGREE(ANY) parameter. Then DB2 will choose the degree of parallelism for the batch program. This is simpler than any alternative approach you might choose to code up because DB2 handles it all for you.

If you still wish to run four instances of the batch program you, of course, can. In that case you would probably want to BIND using DEGREE(1). In order for this to work the way I infer that you intend it to work, however, you may have to modify the program somewhat. I assume that you wish each job to process only against one of the four partitions. To accomplish this, you must provide some way for the program to identify and process only the data from one of the four partitions based on the partitioning key range specified in the partitioning index. 

For example, you might choose to input parameters to the batch run specifying the key range for that program to process. As long as the program adheres to that key range you should only process data from the one partition that holds that data.

Of course, that means that your programmers must be diligent in applying the key range. Perhaps you could set up 4 views that apply the key range for each partition and then use only those views in each program. But that means you will have multiple versions of the same program. You could also specify the range as an input variable and then use only one program. But that means you have to make sure that you are sending the proper range to the input variables.

So just go with the first approach and let DB2 do it for you when it makes sense...


Monday, January 04, 2016

A Lot of Extraneous Data Sets?

In a recent blog post here I talked about a quick and dirty method of converting your partitioned table spaces from index-controlled to table-controlled. If you haven't read that post, take a moment to click over and read it here: Easily Convert to Table-Controlled Partitioning.

The reason I bring this up today is that I received an interesting e-mail from a long-time friend and DB2 DBA who read the post and had some information to share with me. He told me about how his organization used one of my tips to drop unused indexes as part of this process.

He told me that during the conversion process they dropped a lot of the clustering indexes because they weren't being used for access paths or for uniqueness. And they were able to release an "astonishing 4,100 data sets" by doing so!

Now I am not suggesting that every shop will be able to experience a similar savings, but if you have indexes that have no purpose other than enforcing index-controlled partitioning, it is time to bite the bullet and drop those indexes as you convert to table-controlled-partitioning (and then on to Universal table spaces).

And when you convert, please drop a note here on the blog to let us know how your conversion efforts went!

Tuesday, December 08, 2015

Easily Convert to Table-Controlled Partitioning

Up through DB2 V8 for z/OS, the only way to control partitioning of DB2 table spaces was by using a clustering index that specified the range of key values for each partition. With V8, though, DB2 adds the ability to specify the partitioning criteria in the CREATE TABLE specification. This is known as table-controlled partitioning and it is the preferred method for creating (non-Universal) partitioned table spaces. With table-controlled partitioning you can cluster on a different column (or set of columns) than you are partitioning on. Furthermore, you can make changes such as dropping a partitioning index or creating a table in a partitioned table space without defining any indexes at all.


But given the long history of DB2, many existing partitioned table spaces are index-controlled. 

Fortunately, there is a quick-and-dirty technique that you can use to easily convert from index-controlled to table-based partitioning. Simply follow these steps:

  • Identify the index-controlled partitioned table space you wish to convert
  • Convert the clustering index on the table to NOT CLUSTER using ALTER INDEX. (Alternately, you could drop the clustering index, but I wouldn’t recommend that unless you no longer need that index at all.)
  • Convert the index back to CLUSTER, again using ALTER INDEX
Voila! DB2 will have converted your table space to table-controlled partitioning.



Note: DB2 will also convert from index-controlled to table-controlled partitioning if you use ALTER TABLE to add a new partition, change a partition boundary, or rotate a partition to last on an index-controlled partitioned table space. But these are more intrusive methods than simply altering the index from clustering to non-clustering and back again.