Posted by: Peter Scott on: August 4, 2007
I see that Oracle have posted a series of white papers on the soon to arrive 11g release, the one on partitioning caught my eye.
But first an aside: Does anyone actually use slower storage for less-accessed data? The theory sounds good: put the-once-in-a-blue-moon stuff on slower, cheaper, disk and the “use all of the time” data on the high grade, expensive, disk or even solid-state disk… but in practice do people just slap the data on to their 100TB corporate SAN? When’s the last time you saw someone buying a couple of hundred GB JBOD array of SATA disk when they already the space on a corporate NAS /SAN?
I have always liked Oracle’s implementation of partitioning, it is flexible, and transparent to queries – that is we don’t need a special syntax to query partitioned data. I could do most things that I wanted to in the 9.2 and 10g releases but some things required a little creativity. But this time around there will be some nice features – and some should even appeal to the OLTP community. Significantly the number of options for partition key has significantly increased and for composite partitioning (sub partitions) almost any combination goes – I am not sure that HASH-HASH would be useful feature to many people though. New partitioning scheme types are:
Second aside: I am not a great lover of hash partitions. Partitioning should be a tool to improve both manageability and performance, but to my thinking hash partitions are not about boosting performance in DW systems where we have many ‘multi partition key’ queries and end up hitting almost all of the hash partitions, although we might still get partition-wise joins and some parallelism gains we just don’t get the partition pruning
I think it depends on what the rate of data growth is over the next few years. Early-adopter industries would probably be those like medical imaging for example. One of the things causing an explosive data growth rate right now is media; higher-resolution images and documents, audio, and video… hey – I was just looking at hand-held HD camcorders last week for only a couple hundred bucks. As that data starts making its way into DW systems I think we might see greater demand for large-scale ILM – perhaps something like what Oracle has architected. But really this is all just speculation and guesswork on my part – who knows!
This takes me back to circa 1980, I was working on PDP inventory systems and taking a UCLA class on “online systems.” The class went on a field trip to a mysterious multistory building in downtown LA which was a large oil companies data center, big IBM iron with cooling radiators on the roof. They had a tape robot that was about 150 feet long, which would zip back and forth pulling tapes from racks and mounting them on tape drives – there were a few dropped on the floor. On one end, it had hexagonal data cartridges in a beehive looking contraption that would be plugged by a mechanical arm into a reader.
Now, that was partitioning!
I just bought two 500G USB boxes about the size of videotapes, for backing up the home computers. $109.99 each.
August 6, 2007 at 3:36 pm
I’m not sure a lot of people are using slower storage to implement in-database ILM right now but I do think that it’s a good forward-thinking feature. As data volumes grow it might become much more important; maybe in 5 years a lot more people will be doing it.