Pete Scott’s random notes

Database problems

Posted by: Peter Scott on: January 20, 2007

One of our customers is on Oracle 9.2, we would dearly love to move them to 10g but for certification reasons with their chosen query tool we can’t upgrade them until the tool is also upgraded which in turn is linked to the version of JVM associated with their terminal server environment. We upgraded the development database environment to 9.2.0.8 back in November, but as the customer is in retail and the holiday season is the peak sales period we were not allowed to upgrade the live system until this week.

We have been looking forward to moving to the final patch set of 9.2, it fixed a few problems that afflicted some data warehouse users and was also claimed to resolve a horrible core dumping problem with certain star transformation queries.

The next day problem calls started to roll in; some, but by no means all existing reports, started to fail with errors relating to missing arguments to functions. So my team went in to fault finding mode.

When you provide support as a service to customers the first priority is to restore service, that is to identify a work around to minimise impact on users. Finding the root cause and fixing can come later if people are able to work.

Finding example queries that fail is not trivial, it is not like slow queries or wrong results problems where we can capture the query easily in the database, these queries fail to parse. Fortunately, the user tool has an option to show display the SQL being generated. The SQL captured is not very elegant, it uses lots of to_char conversions and a ugly looking in list. As the data warehouse uses materialized view query rewrite we decide to disable query rewrite in an SQL session and capture the query plan. As this gave a plan we conclude that the query is rewritten before it fails. We turn query rewrite on again and replace the in list with a single value, another successful plan captured, this time rewriting against a suitable table and using a temporary table star transformation. The next stage is to replace the single item with a one item in list and then keep adding items until the query fails to parse. Strangely, we get failures as soon as we have three values in the in list.

In the past, temporary table star transformation have had a bad press, there is even an option not to use them in the setting of the STAR_TRANSFORMATION_ENABLED database setting. So perhaps this is the problem happening here. Switching of the transformation completely we now get queries that complete and still in acceptable time; we now have a work around that will stop the failures whilst we continue to investigate the cause and prepare a simple test case for support

3 Responses to "Database problems"

Did you completely disable star transformations Pete, or just the temp table bit? I remember that causing problems before.

For now completely disabled. We are working up a test case that will check whether we can use star transforms without the temporary tables.

I have a feeling that we are pushing star transformations a bit too hard – we are using the transformations on snowflake schemas three or four levels deep (don’t ask!)

By taking a set of failing queries we managed to establish that the problem was associated with star transformations that used temporary disk storage (the in-memory ones worked OK). Oracle support are working with us on filing a bug report.

The work around of no star transforms caused severe performance problems, and the more acceptable work around of disabling the use of disk based transforms highlighted a parallel query bug, for a small minority of problem queries.

We’ll get this one cracked… soon???

Leave a Reply