It’s generally accepted that by the beginning of the next decade the machine-to-machine market, or, if you prefer, the larger-scope “Internet of things,” will have more connected devices than all other Internet connections combined. Even if that tipping point proves to be off by a few years or, as some argue, we’re already there (depending on how you define M2M devices), the takeaway remains the same: very soon the M2M space will have an insane number of connections, each of which will be incessantly churning out data. In many ways this data is the real power of M2M. Big data is so integral to M2M success, in fact, that I recently read an opinion by another analyst that the “Internet of things” may very well become big data’s killer app.
As awesome as this sounds to the inner nerd in each of us, this massive data volume poses considerable challenges – so much so that the term “big data” doesn’t quite convey the workload ahead. I’ve seen it called “really big data” and “huge data.” Maybe it needs a more trendy term like “ginormous data.” Whatever you call it, the forthcoming explosion in M2M data has led to many friendly arguments about methods for managing it all, with experts from all quadrants advocating their favorite solutions. (Especially those with skin in the game.)
From 10,000 feet, you’ve got all the pieces associated with the exploding market for big data – the hardware/infrastructure, software and services components that are growing collectively and separately at double-digit compound annual growth rates. Within these segments we have lots of “this is where the magic happens” components – data centers and cloud computing and all the “stuff” we associate with plain old big data for today’s enterprise and M2M applications.
The reason M2M (and the “Internet of things”) will soon be big data on steroids isn’t just the number of connected devices but the types of connections and what they do – especially sensors. Sensors will monitor food and water for bacteria; cardiac patients’ heart rhythms; insulin levels in diabetics; employee locations throughout the day or how often a waiter tends to his customers; traffic flow on highways; inventory in warehouses; filament levels in 3D printers; and more than we can possibly list here. M2M’s future is all about sensors, sensors everywhere – for applications already developed and many more we haven’t even conceived of yet. These sensors will produce massive data streams.
It’s difficult to fully appreciate the strain this will put on data management. Simply storing the heaps of information that come out of M2M data feeds may not be the problem some anticipate, since volumes are ramping up at a reasonable rate. A high-definition cat video of any length takes more storage space than ten conservative M2M devices might generate in a year. But unlike a good treadmill accident caught in 1080p, M2M data needs to be readily accessible to be valuable. Furthermore, if developers don’t exercise discretion in defining what data must be collected and stored, independent applications may fill disk space extremely quickly.
In this vein, stream management (or stream processing, or mining and storage, or data flow management, or however you label it) is where arguments become interesting. While viewpoints differ somewhat on the nuts and bolts, it’s loosely accepted that in many cases successful stream management will include: a) the ability to sort relevant data from junk or meaningless data; and b) the ability to store the remaining data, junk included, in case the filtering is wrong or needs to be tweaked, in case analysis of the mined portions of the stream calls for deeper analysis of larger portions, or for other deep dives that become necessary at some future date.
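To make that two-part idea concrete, here is a minimal sketch in Python. It is an illustration only, not anyone’s actual product: the `Reading` type, the `split_stream` and `out_of_range` helpers, the “fridge-1” device and the temperature thresholds are all hypothetical names and values made up for the example. Every reading goes into a raw archive, and only the ones that pass a filter are handed on for analysis, so the filter can be changed later and replayed over the archive.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Tuple


@dataclass(frozen=True)
class Reading:
    """One sensor sample (hypothetical shape, for illustration only)."""
    device_id: str
    timestamp: float
    value: float


def split_stream(
    readings: Iterable[Reading],
    is_relevant: Callable[[Reading], bool],
) -> Tuple[List[Reading], List[Reading]]:
    """Return (relevant, archive).

    Every reading is archived, junk included; only readings that pass
    the filter are returned as relevant.
    """
    relevant: List[Reading] = []
    archive: List[Reading] = []
    for reading in readings:
        archive.append(reading)          # part b: keep everything
        if is_relevant(reading):         # part a: sort relevant from junk
            relevant.append(reading)
    return relevant, archive


def out_of_range(low: float, high: float) -> Callable[[Reading], bool]:
    """Build a filter that flags readings outside [low, high]."""
    def predicate(reading: Reading) -> bool:
        return reading.value < low or reading.value > high
    return predicate


# Assumed sample data: a refrigerator temperature sensor, in degrees C.
readings = [
    Reading("fridge-1", 0.0, 3.9),
    Reading("fridge-1", 60.0, 4.1),
    Reading("fridge-1", 120.0, 9.5),
]

relevant, archive = split_stream(readings, out_of_range(2.0, 8.0))

# Later the filter turns out to be too loose, so replay a tighter one
# over the archive instead of losing the readings it first ignored.
tighter = out_of_range(2.0, 4.0)
reanalyzed = [r for r in archive if tighter(r)]
```

In this toy run the first filter flags only the 9.5 reading, while the tighter filter replayed over the archive also picks up the 4.1 reading that was initially treated as junk. A real system would write the archive to cheap storage rather than hold it in memory.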
How to make all of this work is where we see lots of positioning. The infrastructure side gets lots of attention, but since all those sensors will ping out oodles and oodles of repeat data, we know that de-duplication and other data compression methods will play heavy roles as well. So will meta-data management solutions. And tiered storage. And analytics. And lots of other “ands.” The reality is that M2M will be an all-tech-on-deck exercise, right up to and including specialized solutions for industry verticals. There will of course be times of reckoning, with Darwin’s hammer stomping out weak or aging solutions. With enormous swaths of the space being software driven, the M2M space promises rapid, exciting and, at times, brutal evolution. But, for now, when I read about how this or that is the “key” to managing M2M data, I think the author may be right, but there will be many keys to making it all work.
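As one example of what de-duplicating repeat sensor data can look like, here is a small Python sketch that collapses consecutive identical readings from the same device into a single record with a repeat count. It is only one of many possible approaches, and the `collapse_repeats` function, the tuple layout and the sample values are assumptions made up for illustration. It also assumes that exact equality is a sensible test for “repeat,” which may not hold for noisy sensors.

```python
from typing import Dict, Iterable, List, Tuple


def collapse_repeats(
    readings: Iterable[Tuple[str, float, float]],
) -> List[List]:
    """Collapse per-device runs of identical values.

    Input items are (device_id, timestamp, value).
    Output items are [device_id, first_timestamp, value, count].
    """
    runs: List[List] = []
    current_run: Dict[str, int] = {}  # device_id -> index of its open run
    for device_id, timestamp, value in readings:
        idx = current_run.get(device_id)
        if idx is not None and runs[idx][2] == value:
            runs[idx][3] += 1             # same value again: bump the count
        else:
            current_run[device_id] = len(runs)
            runs.append([device_id, timestamp, value, 1])
    return runs


# Assumed sample data: two devices reporting interleaved readings.
samples = [
    ("a", 0.0, 20.0),
    ("a", 1.0, 20.0),
    ("b", 1.0, 5.0),
    ("a", 2.0, 20.0),
    ("a", 3.0, 21.0),
    ("b", 2.0, 5.0),
]

runs = collapse_repeats(samples)
```

Here six readings shrink to three records. The trade-off is that the timestamps of the repeated readings are dropped, so this only works as-is when the sampling interval is regular or the exact repeat times don’t matter.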
The really interesting debate is in what to do when we have this processing down pat, and which method is more effective in leveraging that data for decision-making – machine logic or human intuition. But I’ll save that can of worms for another day.
This analysis originally appeared at RCR Wireless.