Big Data Isn’t Analytics
In the 20th century, progress was driven by the principles of automation. The more you could streamline and control a process, the cheaper you could make it. Repeatability was the key.
Monitoring, in the form of process control charts, spreadsheet-generated graphics, and staff-meeting gotcha reports, was the way an enlightened executive ran his business. Progress was a matter of making the line move up and to the right. Cost cutting involved moving it down and to the right. Statistical process control (SPC) minimized variation.
Here’s a reasonable definition of SPC.
Statistical process control (SPC) is the application of statistical methods to the monitoring and control of a process to ensure that it operates at its full potential to produce conforming product. Under SPC, a process behaves predictably to produce as much conforming product as possible with the least possible waste. While SPC has been applied most frequently to controlling manufacturing lines, it applies equally well to any process with a measurable output.
Key tools in SPC are control charts, a focus on continuous improvement and designed experiments. Much of the power of SPC lies in the ability to examine a process and the sources of variation in that process using tools that give weight to objective analysis over subjective opinions and that allow the strength of each source to be determined numerically. Variations in the process that may affect the quality of the end product or service can be detected and corrected, thus reducing waste as well as the likelihood that problems will be passed on to the customer.
With its emphasis on early detection and prevention of problems, SPC has a distinct advantage over other quality methods, such as inspection, that apply resources to detecting and correcting problems after they have occurred. In addition to reducing waste, SPC can lead to a reduction in the time required to produce the product or service from end to end. This is partially due to a diminished likelihood that the final product will have to be reworked, but it may also result from using SPC data to identify bottlenecks, wait times, and other sources of delays within the process.
Process cycle time reductions coupled with improvements in yield have made SPC a valuable tool from both a cost reduction and a customer satisfaction standpoint.
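To make the control chart idea concrete, here is a minimal sketch in Python of an individuals (X) chart, one common kind of control chart. It estimates limits from the average moving range of a baseline run and flags later readings that fall outside them. The function names and the measurements are hypothetical, made up purely for illustration; a real SPC program would also choose the chart type and run rules to suit the process.

```python
from statistics import mean


def individuals_chart_limits(baseline):
    """Return (lower, center, upper) limits for an individuals control chart."""
    if len(baseline) < 2:
        raise ValueError("need at least two baseline measurements")
    center = mean(baseline)
    # Moving range: absolute difference between consecutive measurements.
    moving_ranges = [abs(b - a) for a, b in zip(baseline, baseline[1:])]
    mr_bar = mean(moving_ranges)
    # 2.66 is 3 / 1.128, where 1.128 is the d2 constant for ranges of size 2,
    # so this spread approximates three standard deviations.
    spread = 2.66 * mr_bar
    return center - spread, center, center + spread


def out_of_control(values, limits):
    """Return (index, value) pairs that fall outside the control limits."""
    lower, _, upper = limits
    return [(i, v) for i, v in enumerate(values) if v < lower or v > upper]


# Hypothetical measurements from a stable baseline period.
baseline = [10.1, 9.9, 10.0, 10.2, 9.8, 10.0, 10.1, 9.9]
limits = individuals_chart_limits(baseline)

# Hypothetical new readings to check against the limits.
new_readings = [10.0, 10.3, 11.5, 9.9]
flagged = out_of_control(new_readings, limits)
print(limits, flagged)
```

The point of the sketch is the one the definition makes: the limits come from the process's own measured variation rather than from anyone's opinion about what counts as a problem.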
Big data, on the other hand, is usually described in terms of three challenges:
- Increasing Volume (amount of data)
Generally speaking, the first big data problem is dealing with the amount of data. The important point about the volume of data is that it's bigger than current toolsets can handle.
- Velocity (speed of data in/out)
Historically, data has been processed when possible, not in real time. Companies like Google, Facebook, and Twitter (and the rest of the high-volume, real-time data processors) are perfecting the art of handling information as it emerges.
- Variety (range of data types, sources)
In some ways, this is the biggest piece of the puzzle. Big data is a way to make novel correlations, and so create insight that wasn't possible before, by integrating disparate data sets that used to be seen as unrelated.
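As a rough illustration of what integrating disparate data sets can look like, the Python sketch below joins two sources on a shared key and measures how strongly they move together. Everything in it is an assumption made for the example: the two data sets (daily temperatures and daily sales), their values, and the function names are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient for two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 long")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant data")
    return cov / sqrt(var_x * var_y)


# Hypothetical data set 1: daily temperature from a weather feed.
temperature_by_date = {
    "2024-06-01": 10, "2024-06-02": 20, "2024-06-03": 30, "2024-06-04": 40,
}
# Hypothetical data set 2: daily sales from a point-of-sale system.
sales_by_date = {
    "2024-06-01": 100, "2024-06-02": 200, "2024-06-03": 300, "2024-06-05": 999,
}

# Join the two sources on the dates they have in common.
shared_dates = sorted(temperature_by_date.keys() & sales_by_date.keys())
temps = [temperature_by_date[d] for d in shared_dates]
sales = [sales_by_date[d] for d in shared_dates]

r = pearson(temps, sales)
print(shared_dates, r)
```

The join on a shared key is the part that matters. Once two previously unrelated sources are lined up this way, any pattern between them becomes something you can measure, although a strong correlation is only a starting point for insight, not proof of a cause.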
In the end, though, what matters is the ability to see patterns in the data. Big Data is really a way of talking about the challenges and opportunities that emerge from the data that is drowning us all. It's not about building new and larger process management tools. It's about mining the insights that can explode productivity in systems that have fewer and fewer repeatable processes.