Big Data is a hot topic in the IT industry right now. Is it marketing hype, or does it have real foundational importance? Like Virtualization and Cloud, two other categories that have dominated the IT airwaves over the last five years, Big Data isn’t actually all that new, and in its broadest sense it can trace its heritage back to the mainframe era.
The relational database vendors really created the concept of storing data in a structured and easily accessible way back in the 1980s, and for most transactions, using Oracle, Sybase, DB2 and the like was the way to go. But the RDBMS approach wasn’t well suited to really large objects or massive data sets, so managing vast quantities of data became the preserve of companies like Teradata and the BI vendors for many, many years. They do a great job at it. Specialized hardware platforms like Engenio (now NetApp E-Series) appeared to complement the software and deliver the right environment for complex analysis of massive data sets. This was (and is) Big Data. But Big Data has now moved into the mainstream: Hadoop, MapReduce and other techniques for analyzing massive data sets are the topics du jour.
Unstructured data is surely growing faster than structured data and will become the primary use case for storage. IDC predicted that back in 2008. But the exponential growth of Facebook, Google+ and Twitter on the one hand, and the use of more and more rich media such as video surveillance feeds and user-generated video from flip cameras and the like on the other, is surely what is driving this focus on big data even more acutely. There is a wealth of potential information tied up in social media data, and it is gold dust for advertisers, economists, futurologists and the purveyors of a smorgasbord of services. Cher Aira, writing for SiliconAngle, opines that combining big data with Twitter data leads to better customer service and better, more targeted advertising, which she, as a Twitter user, actually appreciates. Read her article here. She is probably right.
What big business now understands is that the vast quantity of data it stores actually contains real nuggets of information that can help drive differentiation. It’s the old adage about turning data into information. Data on its own is only so useful; turning it into information upon which to act, now that is powerful. The fact is, the pendulum of where that useful information can be mined is rapidly swinging from the structured to the unstructured. Storing, managing and mining that data is a different proposition from the traditional approach, and that is why there is so much focus on big data.