Data Management: Data Warehousing and Data Mining

Data are newly fashionable. In fact, they are so fashionable that they have spun off their own application subcultures, including the domains of “data warehousing” and “data mining”. This pair of seemingly odd metaphors makes up a growing share of contemporary business intelligence. Understanding the implications of these new capabilities for organizations is a major challenge. Just because something is technically possible doesn’t make it mandatory, or does it?

There is no question that new data processing and analysis capabilities have fundamentally changed the way that managers understand and use the data resources of their organizations. Not since the 1870s, when the file folder and the vertical filing cabinet replaced the ledger book as the record-keeping medium of choice, have we seen such changes in the data structures and processing arrangements of large organizations. As the graphic suggests, these changes are made possible by the convergence of three distinct (although not wholly independent) technical and organizational developments. These new systems are also expensive, controversial, esoteric, and sometimes hard to explain: all the characteristics of the leading (not to say bleeding) edge of information technology management.

Is “data warehousing” only a fancy name for never being able to throw anything away? Well, there are elements of that. Data management and data storage capacities have grown exponentially in recent years. When information storage was a high-cost item, there were incentives to cull one’s store of retained things; today, storage costs are often, if not always, lower than the costs of culling. Interestingly, smaller organizations have always kept a closer rein on their data, simply because they had less of it and their people could stay closer to it. Now, the largest organizations can potentially cuddle up to their databases just like Mom and Pop. Whether they actually do so, or just allow their data warehouses to become the equivalent of Grandma’s attic, is a strategic decision (or non-decision) made (or not made) in many ways and at many points in time.

Under these circumstances, it makes more sense to invest your resources in storage media and archivists than in intake clerks and gatekeepers. But then, of course, you’re stuck with all the data, and the temptation to use it can become almost overwhelming. How it’s used is frequently “data mining”: the down-to-earth, not to say down-IN-the-earth, term for a collection of approaches and techniques used to extract useful interpretations from often intimidatingly large and previously unmanageable quantities of organizational data. Kurt Thearling describes it as “The automated extraction of hidden predictive information from databases”.

While these areas are technical, they clearly aren’t just technical, and are in fact major socio-technical challenges to the organization. In this Module we examine some of the key technologies and tools used in these processes, consider their costs, benefits, and effects, and contemplate their place in the organization’s overall information technology strategy.

