Approach Big Data Analytics Like a Lego Kit

A few months back I was having a conversation with a colleague of mine, Brad Elo. We were discussing the importance of operationalizing analytic processes and the need for the use of repeatable and standardized components to enable success. As part of the discussion, Brad brought up a terrific parallel between Legos and analytics that clearly illustrates the importance of approaching the analysis of big data correctly.


The Lego brand is a powerful one today, behind not just popular building kits but also television shows, movies, video games, and theme parks. It is hard to believe that such an iconic brand, one that has successfully innovated over the years, came very close to bankruptcy a few years ago.

When it comes to Lego’s pre-packaged building kits, there is one aspect that ties closely to the needs of big data analytics. Namely, Lego provides consumers with kits that use a combination of both custom pieces and standard pieces to create awesome models that children (and some adults!) can’t wait to get their hands on.

I’d like to focus on two areas where analytics processes can borrow from the Lego model: reusability and increased adoption.


When I was young, I had a fairly large collection of Lego pieces. However, today’s fancy, pre-packaged kits didn’t exist and so my time was spent either building my own designs or building a suggested design from a Lego booklet. What I loved most was that the same base pieces could be used to build an incredibly wide range of things from forts to boats to spaceships. At the same time, I had a fairly limited number of unique piece shapes which meant that all of my creations had a similar, boxy look to them.

Today, Lego kits still utilize a lot of standard pieces. However, they also include pieces customized for each specific kit: cannons for a fort or a flag for a pirate ship or a special curved piece for a spaceship. By mixing standard and custom pieces, Lego enables us to build things that look more realistic and unique while still keeping their manufacturing process manageable. Perhaps most important to Lego, the kits entice people to purchase many of the same base pieces again and again. Even if my collection has virtually all the components of a new model, I’ll likely buy the full kit to keep things simple.


This is what organizations need to strive to do with analytics. Instead of making every analytics process a totally unique one, focus on building reusable components that can be snapped together with some additional pieces customized for the new model. For example, a wide variety of analytic processes might need a customer lifetime value (LTV) score to be incorporated. Don’t build a new LTV model for each process. Rather, build a single LTV model and let all the other processes access and incorporate it.
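To make the idea concrete, here is a minimal sketch of what a reusable analytic component might look like in code. The LTV formula, function names, and data fields below are all hypothetical illustrations, not a real implementation; the point is simply that one shared component is consumed by multiple downstream processes rather than being rebuilt in each.

```python
# Hypothetical sketch: one shared LTV component reused by several analytic
# processes. The formula and field names are illustrative assumptions only.

def ltv_score(avg_order_value, orders_per_year, expected_years):
    """The single, shared customer lifetime value model.

    Build it once; every downstream process calls this function instead of
    re-implementing its own LTV logic.
    """
    return avg_order_value * orders_per_year * expected_years


# Downstream process 1: prioritize churn outreach by value at risk.
def churn_priority(customer):
    ltv = ltv_score(customer["aov"], customer["orders"], customer["years"])
    return ltv * customer["churn_risk"]


# Downstream process 2: size a campaign budget as a fraction of total LTV.
def campaign_budget(customers, spend_rate=0.1):
    total_ltv = sum(
        ltv_score(c["aov"], c["orders"], c["years"]) for c in customers
    )
    return spend_rate * total_ltv
```

If the LTV formula is later improved, only `ltv_score` changes, and both processes pick up the improvement automatically, which is exactly the reuse benefit described above.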

By focusing on reusable analytic components, it becomes possible to build a larger portfolio of analytics faster. In addition, upgrades to a given component will automatically flow through to all the processes using it. In our example, an improved LTV model would flow automatically into downstream processes utilizing it. This saves time and reduces risk. In addition, because many pieces of the process are readily available, effort can be focused on the customized logic each new process needs in order to build on the available components.


Lego kits also vastly increase the number of customers who will build any given model. Even if kids have the building instructions and know all the right pieces are lying around somewhere in a collection of thousands, it is a painful process to locate them all before starting to build the model. By purchasing a complete kit, kids know they’ll have everything they need and can focus on the fun of building instead of the tedium of finding all the component pieces.

In this way, Lego sells more kits and total pieces to customers. This is good for Lego but also good for the customers, who get to build more models than they otherwise would. It isn’t nearly as intimidating to build a pirate ship from detailed directions and a complete set of components as it is to try to build one from scratch.

Similarly, creating reusable analytic components will increase the adoption of analytics. If business sponsors know that a new process can be assembled largely from existing components, they will be more likely to sponsor additional efforts. Those efforts, in turn, may yield additional components for future projects to build upon. This virtuous cycle will lead to more analytic processes being built than otherwise would have been, all of them then used to drive business value.


In the era of big data, we have ever larger data sets, increasingly complex analytic requirements, and constantly increasing demand for new analytic processes. Without applying discipline and focusing on increased reusability and adoption, organizations won’t be able to reach the scale required for their analytics.

We can’t afford to process data more than we have to and we can’t afford to create (or duplicate) more analytic components than necessary. Creating a base of standardized analytic components will keep costs and risks down while driving increased adoption and breadth of analytics over time. That is a winning combination!