Can We Master Data Management?

About 20 years ago, there was a movement afoot in the IT industry called “Information Engineering.” Promulgated by vendors and consultants—at first those trying to sell to the government, and then those seeking corporate bucks—it was one of those massive data planning exercises that never got off the ground in most organizations.

The U.S. military, for example, spent hundreds of millions and perhaps billions of dollars on information engineering, without ever really accomplishing anything. One Xerox executive told me, “We tried information engineering for 20 years without ever getting close to succeeding. We always thought we were doing it wrong. Eventually I came to suspect that there was something wrong with the approach.” So did I. I wrote the book Information Ecology: Mastering the Information and Knowledge Environment to counter the ideas in information engineering. It wasn’t a huge seller, but I have to say it was a pretty good tome.

Information engineering seems very similar in scope and objectives to Master Data Management (MDM). According to Wikipedia (the only dictionary source, because Encyclopaedia Britannica and Webster’s don’t weigh in on such an arcane subject), MDM “has the objective of providing processes for collecting, aggregating, matching, consolidating, quality-assuring, persisting and distributing such data throughout an organization in such a way as to ensure consistency and control in the ongoing maintenance and application use of this information.”

That gerund-rich definition suggests that MDM is a big hassle, which would be a correct assumption. Even worse are the steps Wikipedia suggests to address MDM: “Processes commonly seen in MDM solutions include source identification, data collection, data transformation, normalization, rule administration, error detection and correction, data consolidation, data storage, data distribution, and data governance.” Fortunately, they don’t recommend “data modeling,” which was perhaps the biggest and least productive time sink in the information engineering movement.

I am sure that there are some benefits to MDM, but they are not readily apparent. Clearly there is a need for data integration and governance, but doing it on such a broad scale makes me very uncomfortable. Aggressively pursuing MDM leaves an organization open to massive confusion among senior executives about the purpose and value of the exercise. Let me also point out that if you dedicate all your efforts to MDM-derived data integration, you won’t have a lot of time left for running chi-square or logistic regression analyses. Like everything else, data management should be consumed in moderation.

What has your experience been? Do you have an MDM success (or pitfall) story to share?