Posted by: Peter Scott | February 7, 2007

Working with materialized view summaries

Materialized views (mviews) have been part of Oracle for along time, before Oracle 8 they went under the name snapshots and indeed some of the current dictionary tables refer to the old name. Common uses of mviews are to replicate data and to summarise data. It’s this second use I’ll write about here.

Firstly a small rant about words associated with materialized views.

  • They are not views, they are physical tables that take storage, and there is no such statement as create or replace materialized view foo as …
  • The word “fast” as is fast refresh is not always (possibly, never) true.

A good source of notes on mviews in a DW or BI context David Aldridge’s OracleSponge blog. In my opinion you could stop reading here and go straight to him.

Oh, you are still here.

Mviews for summary management is a still evolving subject. The things that can be accomplished with mviews has increased version on version since those 8i days at the end of the last century; query rewrite has become more and more sophisticated and degree of complexity that can be accommodated by fast refresh as also increased.

Creating MVIEWS

Oracle gives two fundamental options to create a mview; create the whole object in the create (including allocating its storage) or registering an existing table as a mview by providing just the query definition and refresh clauses. I favour using pre-built tables and registering them as it allows me to separate thinking about storage (and partitioning) from the query definition used for maintenance and query rewrite, it allows me to be more exotic in the way I structure tables – I have even used Index organised materialized views in the past. But perhaps the most important thing is that it allows you to drop a mview with out loosing its content, and for big summary this could be important. I hear that 11g will have an option to drop a mview without the loss of the underlying table, I do hope this feature makes production

Maintaining MVIEWS

The other area to consider is how the mview is to be refreshed. This basically falls into three options on how it is done: NEVER which means that you have to do it yourself, COMPLETE which as the name implies replaces all of the content of the mview using the defining query as the SQL, and FAST which adds new information and updates changed information (in reality, there is fourth option, FORCE which is also the default – it is “do fast if possible otherwise complete”). You also have options on when it refreshes: ON DEMAND is completely in your control, ON COMMIT refresh is triggered when a change is committed to the defining tables and scheduled using the START WITH and NEXT clauses. I am firmly convinced that the only sensible option is on demand, on-commit is plain silly with fast changing OLTP sources or even batch processes with multiple commits for the various source tables. Scheduled refresh only uses time to control the activity – this could be a problem if the refresh is fired off before the underlying data has been loaded

As I mentioned before fast is not always quick as we have to go through several hoops to implement it and at the end of the day we are applying updates to many records. We need to implement mview logging on the source tables to detect changes, which will add an overhead to maintenance of those tables. We need to add extra columns to materialized view to contain various counts and sums to allow the speedy calculation of changes and we need to mark that the logged change has been applied. And of course this is a row-by-row change which will contraindicate the use of features such as compression and readonly tablespaces. Also we have no real concept of how much change is to be applied in the update is this just going to affect ten rows or are we going to hit every row in the table. Finally, fast refresh can not cope with complex SQL, it just will not work in every case.

I also do not like complete refreshes. In a DW most data is historic and unchanged so why rebuild it? The complete refresh is all or nothing – it will delete the existing rows (truncate in a limited set of circumstances) then build the whole thing as a single statement with of course massive impact on the storage needs for sorting the intermediate result set, it does not know about the expedience of disabling bitmap indexes until the build is complete (and even if did it could get caught out by a quirk of truncate table and disabled indexes) So for me the only way to go is DIY

About these ads

Responses

  1. Pete – Thanks for the great post!! exactly what I needed to know about refresh options with mviews.

    Thanks again!!!

  2. Great stuff, Peter. One question: for the DIY, do you prefer using SQL scripts or do you package the whole operation into PL/SQL and fire it off when needed?

  3. Noons:

    I’m going to say neither! (or both)

    We store the SQL to generate the summary, or more properly a slice of the summary in a table in the database and call a package that for a given materialized view name and time slice execute the stored code. The time slice is the partition to summarise for partitioned tables. If the table is partitioned the first step is truncate partition otherwise it is delete from.. where date = time slice

    The code we use to build is not always the same code used in the mview definition and we also store other bits of metadata with the query code such as build order (useful when we nest summaries).

    Using this approach we only refresh the dates where there has been change.

  4. Hi Pete

    Had been a silent reader (dont speel loud ) of your blog for over a year now and have come to fall in love with your writing style (afterall feb is month of love ) wundering if you ever want to write a book on Datawarehousing if not for the content i would certainly buy on for its humor :-).

    on a non serious note just wundering does your shop extensively rely on pre -built summaries that are registered as materialzed views

    Any plans for bloggin on netezza(sounds like pizza or net pizza to me) and i read from various blogs that they dont use materilaized views and companies like carphonewarehouse vouch for that.But surprisingly when i enter netpizza (or was that netezza) website they mention in the courseoutline use of materilaized views again ..hmm..what is the true story..

  5. Hrishy:

    I had thought about a book on DW – We (my co-author and I) were talking to a publisher about 18 months ago, but our editor left and we decided with the new editor that the book was a little bit too ‘broad’ to sell. I might still write something on design and architecture though or maybe tuning. The problem is that my style is sometime too off the wall for serious technical readers ;-)

    I prefer to register mviews on pre-built tables as it gives me a bit more control of how things are set out and built; building a 100 milion row materialized view in one hit needs a somewhat large temporary tablespace to handle the inevitable sort. But I have used both sorts of mview

    Netezza Materialized Views are not the same as Oracle’s. They are a way of reducing IO by ordering data within a table and using a ‘zone map’ to find the zone(s) that contain the data of interest. Of course this approach is best for queries that aggregate the raw data on just one or two columns, the more columns you look at the less likely the rows will be clustered together.


Categories

Follow

Get every new post delivered to your Inbox.

%d bloggers like this: