A while back it was quite feasible to draw circles around discrete databases in an organisation’s IT structure and say ‘this is the data warehouse, here is the billing system and that is the blinkity-boo system’. But now those circles are pretty diffuse. It is harder to tell where document storage ends and data warehousing, ERP systems, electronic content management and workflow begin. Federated databases and messaging (both XML and older queue technologies) mean that not everything is in one place; the challenge of providing a unified view of information becomes so much harder. Reporting is no longer just about finding the sum of past activity; we are increasingly looking for predictive measures and trying to find patterns. This is happening everywhere: in commerce, finance and especially government.
Before I started out with data warehouses I did a lot of work on analyzing data, that is, searching masses of information for links between data items – seeking out new and unknown connections, often connections separated by considerable time delays. To make this work I had to impose structure on the unstructured, and this meant borrowing techniques from all over: artificial intelligence and fuzzy logic, network theory and even astrophysics (not all of my data sources were textual!). The borrowing continues apace, and now there seems little to delineate the technologies. What I used to do then has become common, and not just in the traditional DW monoliths of the past. Real-time credit card fraud detection is being pushed out to the point of sale by tapping into the XML streams back to the centre. Events recorded on different systems can be linked through messaging, rather than relying on them being transported to a single data repository and found by some batch process running long after the event has ceased to matter. As I said in an earlier post, the number of executive choices after an event diminishes as time progresses – there will be a stage where there is no longer a choice.
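To make the idea of linking events through messaging a little more concrete, here is a minimal sketch in Python. It is only an illustration under my own assumptions, not a description of any particular product: the `Event` fields, the `StreamCorrelator` class, the system names and the five-minute window are all hypothetical, and it assumes messages arrive roughly in time order. The point is simply that each message is checked against recent ones as it arrives, instead of waiting for a batch job.

```python
from collections import defaultdict, deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    source: str       # originating system (hypothetical names below)
    key: str          # shared identifier, e.g. a card or customer number
    timestamp: float  # seconds since some agreed epoch


class StreamCorrelator:
    """Link events that share a key but come from different systems within a time window."""

    def __init__(self, window_seconds: float) -> None:
        self.window = window_seconds
        self._recent = defaultdict(deque)  # key -> recent events, oldest first

    def observe(self, event: Event) -> list[tuple[Event, Event]]:
        """Process one message as it arrives and return any links it completes."""
        recent = self._recent[event.key]
        # Forget events too old to link (assumes roughly time-ordered arrival).
        while recent and event.timestamp - recent[0].timestamp > self.window:
            recent.popleft()
        # Pair the new event with every remembered event from another system.
        links = [(old, event) for old in recent if old.source != event.source]
        recent.append(event)
        return links


correlator = StreamCorrelator(window_seconds=300)
messages = [
    Event("point_of_sale", "card-1", 0.0),
    Event("web_shop", "card-1", 120.0),
    Event("point_of_sale", "card-2", 130.0),
    Event("web_shop", "card-1", 1000.0),
]
for message in messages:
    for earlier, later in correlator.observe(message):
        print(f"{earlier.source} -> {later.source} on {later.key}")
```

Only the second message produces a link here; the last one arrives long after the window has closed, which is rather the point about choices diminishing with time. A real system would also have to cope with out-of-order delivery and far fuzzier matching than a shared key.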
Maybe we are heading to a future where the disciplines are no longer ERP and OLAP, or transactional and decision support, but become just Data Storage and Data Retrieval.