Saturday, November 12, 2005

Master Data Management

There is a new term in enterprise software - Master Data Management. While the term is new, the concept it not quite so new. I view it as the end point of a change that has been going on for some time.

The concept of an Enterprise Data Warehouse emerged during the 1990's. One of the compelling reasons for creating an Data Warehouse is to create a single version of the truth. An enterprise has many different IT systems, each with its own database. The problem is that each database has its own version of the truth.

So for example, consider a typical enterprise that has many customers. It will have marketing databases and several sales databases, each associated with the IT system for that sales channel. There are also service systems with their associated databases. The same customer may appear in several of these databases that support the business operations and in each one the customer information is different. There are variations in the name, different addresses and phone numbers or in some cases no contact information at all. While this example is about customers all other enterprise information also exists in many databases, and anywhere that information is multiplied, there are bound to be contradictions, failures and gaps.

Rather than try to resolve this mess, the idea of the Enterprise Data Warehouse is to create a new database that contains the clean and corrected version of the data from all the operational databases. So the Enterprise Data Warehouse is the one data repository for a single version of the truth about the enterprise.

In the early days, the Data Warehouse was conceived as a place for business analysts to do their work. Business analysts like to ask difficult questions and another reason for creating a separate database is to give them a place where they can run complicated queries to answer these questions without disturbing the operational systems.

In practice, building the Enterprise Data Warehouse is difficult and expensive, and the result is an immensely valuable resource. Far too valuable so be left to the business analysts. So from the earliest days of data warehouses, they were connected to operational systems and used to help run the business.

The problem with this is timing. The original data warehouse was conceived as something that you could load at night with the days data and then the business analysts would query it during the day. However if you are running a call center off the information in a data warehouse because it has the best version of the data, the data warehouse has to contain the latest information.

For example, when a customer calls to complain that the product they bought that day does not work, the data warehouse needs to have been updated with the purchase so that the call center can verify it. This has led to the notion of a real time data warehouse. I explained this when I gave a presentation on "Real Time Business Intelligence" to the SDForum BI SIG in 2003.

So what does all this have to do with Master Data Management? Well I view Master Data Management as the legitimate name for this trend that I have described as real-time data warehousing. After all, real time data warehousing is not a very good name for the concept. It is a description of how we are getting there, while Master Data Management is a statement about the end point.

Of course the real story is not as clean as I have described. Enterprise Information Integration (EII) is about building a virtual data warehouse without bringing all the data together in a single physical database. Master Data Management does not necessarily imply that all the data is integrated into a single database, all it means that there is a single version of the truth about all the enterprise data and that this is available to all enterprise applications.

It is also worth noting that as with any new term there is still a lot of discussion and debate with different points of view about what Master Data Management really means.

No comments: