Posted by: Peter Scott | February 20, 2008

Oracle OLAP, partitions and time

One of the reasons I am not writing too much on the blog at present is that I am busy working with a client just about to put a new (small) data warehouse into production. Couple that with the need to concentrate on submitting a white paper to Collaborate 08 by the weekend, and I am not getting too much free time. I write this whilst an OLAP cube organised materialized view is building on a VM on my laptop.

One side note from my research on 11g OLAP is about partitioning the OLAP cube along the time dimension.

Organisations often have multiple hierarchies for reporting time: financial reporting may be aligned to 13 x 4-week periods, with each week starting on a Saturday and the year starting in August, while other reporting goes against the civil calendar. Of course “the day” is common to both calendars, and you could build both calendars as part of a single time dimension.

But if you put both calendars into a single time dimension it may get interesting when you decide to build an OLAP cube. If both hierarchies are used in the same cube and you elect to partition on the time dimension, you will only be able to partition on one of the two hierarchies. You specify a granularity for the partition, say fiscal quarter, and the fiscal periods, weeks and days are divided into partitions based on the fiscal quarter they belong to. Fiscal years go into a sort of ‘cap’ partition, the one used to catch the members that don’t belong in any of the lower partitions. But what of the sales for June? June is not a member of any fiscal quarter, so that too goes into the cap partition… which could be very bad for performance.

So don’t partition on time, or build separate cubes for fiscal and calendar reporting.
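The cap-partition effect can be sketched in a few lines of Python. This is only a toy model of the assignment rule described above, not how Oracle OLAP implements it; the member names (FW44, FP11, FQ4, JUN-2007 and so on) and the two parent maps are invented for illustration.

```python
# Toy model: partition a time dimension at the fiscal-quarter level.
# All member names below are hypothetical.

# Parent of each member in the fiscal hierarchy (day -> week -> period -> quarter -> year).
FISCAL_PARENT = {
    "2007-06-02": "FW44",
    "2007-06-30": "FW48",
    "FW44": "FP11",
    "FW48": "FP12",
    "FP11": "FQ4",
    "FP12": "FQ4",
    "FQ4": "FY2007",
    "FY2007": None,
}

# Parent of each member in the calendar hierarchy (day -> month -> year).
# It is listed only to show that the months exist in the same dimension;
# the partitioning rule never consults it.
CALENDAR_PARENT = {
    "2007-06-02": "JUN-2007",
    "2007-06-30": "JUN-2007",
    "JUN-2007": "CY2007",
    "CY2007": None,
}

FISCAL_QUARTERS = {"FQ4"}
CAP = "CAP"


def partition_for(member):
    """Walk up the fiscal hierarchy; return the quarter reached, else the cap."""
    current = member
    while current is not None:
        if current in FISCAL_QUARTERS:
            return current
        current = FISCAL_PARENT.get(current)
    return CAP


all_members = set(FISCAL_PARENT) | set(CALENDAR_PARENT)
assignment = {m: partition_for(m) for m in sorted(all_members)}
for member, partition in assignment.items():
    print(f"{member:12} -> {partition}")
```

In this model the days, fiscal weeks and fiscal periods land in the FQ4 partition, while FY2007, JUN-2007 and CY2007 all fall through to the cap, which is where every calendar aggregate would have to be stored and computed.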

I have noticed that the partition advisor in OLAP 11g AWM can give some odd partitioning advice. I had a two-hierarchy time dimension for some tests I am doing for a presentation, but elected to aggregate only on the civil calendar. The partition advice, however, was to partition on the fiscal hierarchy (the default hierarchy, which did not have any aggregates specified for the cube).

Posted by: Peter Scott | February 16, 2008

Almost started OLAP 11g…

The proposal deadline for the upcoming Collaborate 08 was shortly before I was to leave my old employer - it was already known that I would be leaving them to join Mark and Jon (and possibly, it was known before Rittman Mead Consulting existed!) so it came as no real surprise for Mark to ask me to consider putting forward a presentation.

Moving employers meant that I could not really present some sort of “been there, done it” case study on some exciting piece of technology that I had put together for an old customer - it is just not right to pass things off as fully your own when you had a team to back you up (just as it is not right to not credit the people who have helped you). So I took the brave approach: I looked at what was new in Oracle 11g for BI, found a topic that was “now, I think I could have put that to good use” and based my proposal on that. So I chose Cube Organized Materialized Views, and when my paper was accepted I started to work… slowly, as I still needed to be out there earning money.

To be honest it is scary stuff to make a proposal based on something you have not even used yet, but it does focus the mind no end. I started off with my 64-bit Oracle 11g database on Oracle Linux and decided to go with the SH schema as it was quite big. Then I ran into a few problems and bugs; for example, it did not seem to like building a cube for data belonging to another owner. Then the ‘A’ OLAP patch came out, but not (at that time) for 64-bit Linux, so I decided to build a 32-bit VM and apply the new patch on the server side, along with the new AWM client. Of course this takes a while to do. And then I still had a few problems (new ones!) which needed to be worked through. I decided to have a look at the stuff Mark wrote about in his initial glance at Oracle 11g OLAP and repeat his work with the global schema - this too was not straightforward, as the schema download that I found needed a bit of 11g-erising to get the schema load scripting to work. After finding the last few remaining privileges to grant, I succeeded in building my first cube organised m-view. But query rewrite eluded me, perhaps because the ‘A’ patch has fixed the CBO so that it does not favour cubes so much when straight relational is faster.

So, I have gone back to SH and managed to get the cube built, and rewrite to work too! - so more on that another time.

Posted by: Peter Scott | February 16, 2008

OWB map testing

I am still overseas working on an OWB project for a client. We are now into that final phase of checking everything before we promote the first release to the production system. Each OWB map for the first release is redeployed on the development system and test data is then run through the map - we then check that the right number of rows (and content too!) appears at the right target. This is painstaking but necessary work. Where we have problems we track them down, resolve them (redeploying if we need to alter the mapping) and then repeat the test until the problem goes. When we get all of the code in the first release working, we go through the loop again for all of the maps to see the effects of using the SCD2 functionality in the dimension load plug-in (and yes, we have applied that patch to remove the horrendous “compare effective dates as text strings” coding that somehow crept into OWB). We are looking to make sure that execution times don’t go off on the update operations compared with the pure insert operations of the first round of testing.
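The “right number of rows (and content too!)” check can be illustrated with a small Python sketch. It assumes the expected and the loaded rows have already been fetched as tuples; the sample rows and the `reconcile` helper are hypothetical and not part of OWB.

```python
from collections import Counter


def reconcile(expected_rows, actual_rows):
    """Return (missing, unexpected) as multisets of rows."""
    expected = Counter(expected_rows)
    actual = Counter(actual_rows)
    missing = expected - actual      # expected, but absent or loaded too few times
    unexpected = actual - expected   # loaded, but not expected or duplicated
    return missing, unexpected


# Hypothetical test data: same row count on both sides, different content.
expected = [(1, "Widget", 10.0), (2, "Gadget", 5.5), (3, "Gizmo", 7.25)]
actual = [(1, "Widget", 10.0), (3, "Gizmo", 7.25), (3, "Gizmo", 7.25)]

missing, unexpected = reconcile(expected, actual)
print("row counts match:", len(expected) == len(actual))
print("missing:", dict(missing))
print("unexpected:", dict(unexpected))
```

Here a count-only comparison would pass, because a dropped row is masked by a duplicated one, which is why the content has to be compared as well.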

Which brings me to a nice OWB 10.2 feature for troubleshooting complex maps: the ability to generate code for individual mapped object input or output groups. Click on the code generator icon in the map editor, use the drop-down on the code window to change the mode, then click on a mapped object’s grouping bar. If I lose one or two rows in a mapping (or even all of them) I just cut and paste the generated code into SQLDeveloper and check that each output is doing the right thing.

Posted by: Peter Scott | February 5, 2008

Oracle Data Integration Suite

Keeping my head down building Oracle 11g OLAP cubes for research and self-education meant that I missed yesterday’s product announcement from Oracle, but with the wonders of blog aggregators (and in particular Beth’s) I spotted a mention on Vincent McBurney’s blog of the newly announced (and available) Oracle Data Integration Suite.

This is one of the fruits of the recent purchases by Oracle of Hyperion (Data Relationship Manager), Tangosol (Coherence *) and Sunopsis (Data Integrator), with a bit of Application Server, BPEL and Enterprise Service Bus thrown in, plus the ability to use an embedded data quality and profiling product from Trillium *.

* Coherence and the data quality options are add-ons to the base ODI Suite.

This looks interesting; I might write more on this later.

Posted by: Peter Scott | February 2, 2008

Home is the sailor, home from the sea

Well, not the sea and not sailing, but I am back in the UK for a week or so as a sort of mid-assignment break. In this case break does not mean vacation, but more of a pause in the assignment. This is very welcome as it will allow me a few days to break the back of the paper I will be presenting at Collaborate 08 in Denver. Bravely, I decided on a completely new topic and completely new material, so I have a bit to do to get the paper submitted by the end of this month and the slide deck a couple of weeks later. I am quite excited by the prospect of speaking to a new audience and already I am working on the jokes for the presentation. Of course, the other thing about making the trek across the Atlantic is that I will get to hear speakers that I have come across here on the Internet but have never had the opportunity to listen to; maybe I’ll get the chance to speak to them…

The agenda is looking very good and there are a lot of sessions that really appeal to me. As usual with a multi-track event there are a few clashes, but with some careful planning with my colleagues (Mark Rittman is presenting on the Monday, and Borkur hopes to get along too) we should be able to arrange not to miss too much.

But it is not all playing about with Oracle 11g OLAP in the next few days, Mark has other plans for me…

So what’s been going on in the past few weeks?

Well for me, not blogging a great deal. I have been working with an overseas customer, and the hotel I have been staying in has unreliable Internet connectivity. That, coupled with the need to complete an article for a web newsletter (more about that when it is published later this month), the need to write some slides for a data warehouse design course and simple things like getting food (my hotel was not too near the restaurant quarter and going out at night meant wrapping up very warmly), did not leave much time to blog. There are some more data warehouse and data quality posts in the pipeline, but I suspect that the Collaborate presentation will jump to the top of the queue for a while.

Speaking about the blog, I am thinking that the time has come to split the random ramblings from the more technical posts (not least so that serious feed aggregators don’t get swamped with the so-far-off-topic stuff that I am renowned for). I have a good idea of what I will do, but need to road-test the idea first with a few people, so watch this space.

Posted by: Peter Scott | January 27, 2008

Working together

Half of November, all of December and most of January. My time with Mark and Jon at Rittman Mead Consulting has certainly flown, but that’s probably because I am enjoying the job.

In my old role, I was regarded as the company BI expert (or at least as far as the UK went) and in some ways this was quite isolating. Here things are different.

All of us at Rittman Mead are actually quite good at what we do, but we are different people with different experiences; sometimes these experiences overlap, sometimes they don’t. So one of the things that really is good about working here is that it is not just one consultant and one consultant’s ideas; we bounce ideas off each other, ask each other about best practice and separate the “hands-on, done it” experience from the “read it in the manual” kind. This makes work really enjoyable, as you have the comfort that your colleagues are there for you, which in turn allows you to deliver a better service to your client.

Posted by: Peter Scott | January 23, 2008

Quality thoughts as I continue to chill out

Over at another blog, Beth notes that she is editor of the month for the Carnival of Data Quality. When it comes out I urge you to take a look. With those strange quirks of global IT companies, Beth and I once worked for the same employer, but we have never met - I did see her photo once on the internet and if she donates to charity I’ll tell her where!

So what has data quality got to do with BI? Everything. For a BI system to be successful you need to fulfil three objectives:
•    It must tell people what they need to know - by that I mean it must encompass enough detail to have real use
•    It must tell people what they need to know rapidly enough
•    And it must tell people the truth
The last item is something that has to be designed in from the beginning of a BI project; performance and scope can be enhanced later but quality can’t, or not without having to replace already loaded data.
When I get involved in an ETL project I probably spend less than a quarter of my time on building ETL code and getting it to run; the majority of my time goes on data quality and on finding and explaining anomalies. Where we can, we get data fixed at source, but sometimes that is not going to be possible. Data profiling then allows us to formulate rules to handle the expected exceptions (whether these are auto-fix procedures or a “park it to one side and let a human make the call” rule). And then there is the unexpected exception, which of course must be handled too.
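As a minimal sketch of those three outcomes (auto-fix, park for a human, and the unexpected exception), here is some Python. The column names, the rules themselves and the `route` function are all made up for illustration; real rules would come out of profiling the actual source data.

```python
def route(row):
    """Return (destination, cleaned_row) for one incoming record."""
    row = dict(row)  # work on a copy so the caller's record is untouched

    # Expected exception with an auto-fix: untidy country codes.
    if isinstance(row.get("country"), str):
        row["country"] = row["country"].strip().upper()

    # Expected exception that needs a human decision: no customer key.
    if row.get("customer_id") is None:
        return "parked", row

    # Anything else that still fails basic validation is an unexpected exception.
    amount = row.get("amount")
    if not isinstance(amount, (int, float)) or amount < 0:
        return "unexpected", row

    return "load", row


# Hypothetical incoming records.
incoming = [
    {"customer_id": 1, "country": " gb ", "amount": 10},
    {"customer_id": None, "country": "GB", "amount": 10},
    {"customer_id": 2, "country": "GB", "amount": -5},
]

for record in incoming:
    destination, cleaned = route(record)
    print(destination, cleaned)
```

The point of the separate "unexpected" destination is that such rows are neither silently loaded nor silently dropped; they are kept for investigation and may lead to a new rule.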

Posted by: Peter Scott | January 14, 2008

OWB quirkiness

Recently, I mentioned a quirk in OWB 10.2 with a dimension table target configured as a type 2 slowly changing dimension. We have been working on some load performance problems with straightforward OWB maps. As part of this investigation we wanted to change some of the hints being used in the mapping; for a start we wanted to remove the parallel hint. This is relatively simple to do (in principle): go into the mapping configuration, choose the dimension operator, then go into the load hint window and make your choices. However, the choices don’t stick and a few moments later they are back to the default APPEND & PARALLEL. On the other hand, OWB 11g has no default load hints on the dimension operator.

Oh, one final thing: if you are part of a team developing OWB maps, check that all of your development workstations are using the same JAR file versions, lest the code you deploy works (and performs) differently to that of your colleagues.

Posted by: Peter Scott | January 14, 2008

Last Man Standing

Seeing all of those tag blogs going around triggered a sense of anxiety in me. The vain part was ever hopeful that I actually am well known enough to have someone single me out to ask those questions that bare my soul. The shy part was just glad that no one had found me yet. But the fact that this is published indicates that someone hates / likes me enough to call me out. And with a sense of inevitability it had to be Lisa who chose to tag an F-list celebrity like me (a much lower level of celebrity than the great APC, who was singularly tagged by Mr OTN, Justin Kestelyn).

So what is not known about me? Well, a lot of obscure stuff has already appeared in the back pages of this blog or even on the about me tab. So that leaves me with two main options: make it up or rehash the old material. Or I could just write 16 ‘facts’ and leave you to work out the wheat from the chaff. The other point is that my ‘uniques’ aren’t that unique - I sympathise with Niall about arguing over the meaning of words - my wife has been known to force me out of bed to confirm or deny a word derivation in our handy bedroom copy of the Shorter Oxford Dictionary. Fortunately this two-volume edition is able to resolve most disputes - the tricky stuff comes when she insists we go to the whole dictionary.

  • I left University (for the first time) when I was 12. My father worked in one of the University of London colleges and had a ‘grace and favour’ apartment in return for supervising one of the college student houses. This was not a great deal of trouble as in those days the college was for women only and men had not been invented. We moved on after my father died, which of course is the downside to tied accommodation
  • My aunt taught me to ride a bicycle one summer when I stayed on her farm, she did not teach me to get off though, this was achieved by controlled (?) falling into ditches
  • I went to primary (elementary) school with Emma Thompson (but if you ask her she won’t remember me, perhaps because I was a year or so ahead of her or she was the creative type and I was the numbers type)
  • Between being a research scientist and working for an IT consultancy I spent a while working in pharmaceutical marketing consultancy doing odd things like writing award-winning computer games (Doug is not the only one paid to code games, but I never used a Spectrum) for exhibition stands and touring places such as Pakistan running medical education road-shows for doctors.
  • I used to be a quite good (club level) long distance runner, but over-flexible ankle joints put an end to serious running
  • I used to be a bell ringer (but was never as good as my wife)
  • I have been told that I am a good cook and should open a restaurant, and a humorous writer and should become an author. Neither actually sounds that feasible as a career choice. But I could combine both, cook books and become an accountant
  • I talked blindfolded colleagues into stepping off cliffs! - this is called leadership training. (they were wearing safety gear and no executives were (very) hurt in this training process)

Now who to tag in return? Well, being the Johnny-come-lately, there are few people that have not been shamed into writing - I choose…

no one in particular (or at all) as I must be last man standing.

Posted by: Peter Scott | January 10, 2008

Almost a technical post… almost

Well, I had this technical post about a quirky feature of deploying OWB 10.2 maps for slowly changing dimension loads, perhaps mentioning how OWB 11g has a different default behaviour; then I remembered a discussion I had with my colleague, Borkur, a couple of days back about hints. This of course does nothing to change the fact that the default behaviours are different, but it may influence the steps I need to take to fix the problem. I think I will need to investigate more, so that post will be delayed a little.

I also had this post about “Numbers: number or VARCHAR2” and the dogma wars about “if you are not going to do maths on it shouldn’t it be a character string” and “numbers are better for between predicates than strings”, with that side argument of whatever you do, make sure the users do the same thing or the indexing won’t work. But I thought again and decided that there was enough controversy without stirring it again.
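The “between predicates” side of that argument is easy to demonstrate. The Python below is only an analogy: it uses Python’s character-by-character string ordering, not Oracle’s VARCHAR2 comparison semantics, and the sample codes are invented.

```python
def between(value, low, high):
    """Inclusive range test, in the spirit of a SQL BETWEEN predicate."""
    return low <= value <= high


codes = [5, 9, 10, 50, 100]  # hypothetical "numbers you never do maths on"

# Stored and compared as numbers: the range behaves as a person would expect.
numeric_hits = [c for c in codes if between(c, 10, 50)]

# Stored and compared as strings: ordering is character by character,
# so "5" and "100" both sort between "10" and "50".
string_hits = [c for c in codes if between(str(c), "10", "50")]

print("numeric:", numeric_hits)  # [10, 50]
print("string: ", string_hits)   # [5, 10, 50, 100]
```

Zero-padding the strings to a fixed width would make the two orderings agree, but only if every writer and every query pads in the same way, which is the “make sure the users do the same thing” part of the argument.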

Although I have not been posting much recently, I have tried to keep up to date with the blog world. I have free internet access from the lobby of my hotel, and although it is not a good place to post anything longer than a few lines, it does allow me to at least read my favoured writers - and I do read a lot more widely than many people think. I noticed that blog tagging has rampaged through the community, and that eventually I would be tagged (and late in the day, as I am not mainstream). So, thanks to Lisa, I have to decide whether I should list obscure facts about myself, or just point out that this blog already contains enough facts about me if you know where to find them.
