Plan For Big Data Like It's 2000

Remember 2000? Everybody was happy that Y2K had passed with nary a glitch, and the e-commerce revolution was in full flower. Most of the attention went to e-commerce startups, but IT-oriented people my age and older also recall that many larger firms were attempting to embrace e-commerce. While I was only on the fringes of that movement, in many cases a consultant or academic like me would be brought in to help a company strategize about this important new capability.

It feels like that again now, and this time the focus is on big data. A lot of companies are aware that big data and analytics offer them considerable opportunity for competitive advantage or parity. They glance sideways at Google or eBay or Facebook and say, “Are they going to take over my business?” They peek at GE, and say, “Should I be making a similarly bold investment in data and analytics?” Many organizations are now trying to familiarize their executives with the possibilities of big data, and to develop a strategy for pursuing them.

What kinds of activities and decisions should a company pursue as it wrestles with its big data strategy? I see two major decisions at first, and then several others that follow from them. I’ll use Monsanto as an example, since it is a company that is clearly moving from being a provider of seeds and herbicides to one that provides data and analytics-based products and services.

The first two decisions involve data and business opportunity; at their intersection lie the activities that turn data into money. Whenever I ask successful big data users whether they start with the data they have on hand or with the business needs they have, they always say “yes.” So you need to evaluate the two issues jointly, and hope that they meet in the middle.

The data question involves a quick inventory of what data resources a company has on hand, and what other related, important ones it could get access to. At Monsanto, for example, the company already had lots of data on plant hybrids and their growth under various conditions; it had been accumulating that data for decades, and it was well-stored and well-understood. What the company lacked was detailed, highly granular data on soil and weather—the other factors necessary for plant growth. Monsanto found a good source of field-level weather data in a Bay Area startup called The Climate Corporation. Managers bit the bullet and bought the company for $930 million last year. You might guess that if there were a similar company in the soil space, Monsanto would probably buy it too. Indeed, it did that in purchasing the soil monitoring business of Solum, Inc. this past February.

The business need involves the key issues that a company or its customers are facing that might be addressed with data and analysis. The business need at Monsanto is to help its traditional customers, large farmers, become more productive. The company saw the opportunity to help growers with “predictive planting”: the exact combination of moisture, seed, soil composition, seed depth and density, and planting and harvesting time that optimizes crop yields. Agriculture has become a complex and precise business, and farmers need help with it. The strategy seems like a great idea, although it’s too early to know whether and how much farmers will pay for such advice.

Many other aspects of a big data strategy flow from the answers to those two questions. How should you organize the resources to meet those business needs with the identified data? Monsanto felt that it would be easier and more effective to move all existing data and analytical products into the Climate Corp. organization than to try to build the new capabilities within its St. Louis headquarters. It’s also probably easier to find new data scientist talent in the Bay Area. The soil composition acquisition was also moved into Climate Corp.

The right technologies to adopt also often flow from data and business need decisions. Perhaps not surprisingly, Monsanto felt that the volume and speed of data processing necessary to succeed would come primarily through Hadoop. One crop in one season in one country generates 20 billion data records, and yield monitoring sensors generate another 14 billion, so considerable parallel processing horsepower was necessary. Climate Corp. already had expertise in Hadoop, so continuing to use it was not a tough decision.

There will be many other ongoing decisions that companies like Monsanto have to make in fleshing out and executing on their big data strategy. But many of the key components seem to be in place. Other companies might decide to build the necessary capabilities internally rather than acquiring them, but that would take much more time.

It will be interesting to see how big data strategy decisions work out over time in comparison to e-commerce programs. I think most established companies got value from putting an e-commerce strategy in place. In e-commerce and in big data, the only big mistake seems to be to do nothing.

Originally published in WSJ’s CIO Journal.