Materialized views (mviews) have been part of Oracle for along time, before Oracle 8 they went under the name snapshots and indeed some of the current dictionary tables refer to the old name. Common uses of mviews are to replicate data and to summarise data. It’s this second use I’ll write about here.
Firstly a small rant about words associated with materialized views.
- They are not views, they are physical tables that take storage, and there is no such statement as create or replace materialized view foo as …
- The word “fast” as is fast refresh is not always (possibly, never) true.
A good source of notes on mviews in a DW or BI context David Aldridge’s OracleSponge blog. In my opinion you could stop reading here and go straight to him.
Oh, you are still here.
Mviews for summary management is a still evolving subject. The things that can be accomplished with mviews has increased version on version since those 8i days at the end of the last century; query rewrite has become more and more sophisticated and degree of complexity that can be accommodated by fast refresh as also increased.
Oracle gives two fundamental options to create a mview; create the whole object in the create (including allocating its storage) or registering an existing table as a mview by providing just the query definition and refresh clauses. I favour using pre-built tables and registering them as it allows me to separate thinking about storage (and partitioning) from the query definition used for maintenance and query rewrite, it allows me to be more exotic in the way I structure tables – I have even used Index organised materialized views in the past. But perhaps the most important thing is that it allows you to drop a mview with out loosing its content, and for big summary this could be important. I hear that 11g will have an option to drop a mview without the loss of the underlying table, I do hope this feature makes production
The other area to consider is how the mview is to be refreshed. This basically falls into three options on how it is done: NEVER which means that you have to do it yourself, COMPLETE which as the name implies replaces all of the content of the mview using the defining query as the SQL, and FAST which adds new information and updates changed information (in reality, there is fourth option, FORCE which is also the default – it is “do fast if possible otherwise complete”). You also have options on when it refreshes: ON DEMAND is completely in your control, ON COMMIT refresh is triggered when a change is committed to the defining tables and scheduled using the START WITH and NEXT clauses. I am firmly convinced that the only sensible option is on demand, on-commit is plain silly with fast changing OLTP sources or even batch processes with multiple commits for the various source tables. Scheduled refresh only uses time to control the activity – this could be a problem if the refresh is fired off before the underlying data has been loaded
As I mentioned before fast is not always quick as we have to go through several hoops to implement it and at the end of the day we are applying updates to many records. We need to implement mview logging on the source tables to detect changes, which will add an overhead to maintenance of those tables. We need to add extra columns to materialized view to contain various counts and sums to allow the speedy calculation of changes and we need to mark that the logged change has been applied. And of course this is a row-by-row change which will contraindicate the use of features such as compression and readonly tablespaces. Also we have no real concept of how much change is to be applied in the update is this just going to affect ten rows or are we going to hit every row in the table. Finally, fast refresh can not cope with complex SQL, it just will not work in every case.
I also do not like complete refreshes. In a DW most data is historic and unchanged so why rebuild it? The complete refresh is all or nothing – it will delete the existing rows (truncate in a limited set of circumstances) then build the whole thing as a single statement with of course massive impact on the storage needs for sorting the intermediate result set, it does not know about the expedience of disabling bitmap indexes until the build is complete (and even if did it could get caught out by a quirk of truncate table and disabled indexes) So for me the only way to go is DIY