Google Trends shows no organic searches for the term “big data” until 2011, an approximately 7-fold increase over the next two years, and then a doubling of searches from 2013 to 2015. Google projects a further increase in the following years, albeit at a lower rate. Google searches for Hadoop, the most popular software for handling big data, have roughly doubled every year since 2007. These facts clearly show the hype behind big data, which unfortunately is not reflected in actual deployments at companies. Less than 20% of companies have big data deployments in production, while half of the companies are experimenting with this new technology.
In 2015 it will be hard to find a convention or conference that does not pair ‘big’ with the term ‘data’ in its title. Recently, at a round table about the labor market in the US, the CEO of a software company discussed the company’s offering and business plan. It struck me that not a single time did he use the word ‘data’ without the adjective ‘big,’ and he was not using these terms sporadically. Really? Have today’s companies ditched all relational systems in favor of big data solutions and transitioned from EDWs (enterprise data warehouses) to data lakes? And this company is not one of the big data solution providers, which are perhaps the only companies that can equate data with big data. It was clear from the presentation that perhaps only a tiny piece of the business is actually built on big data technologies, with everything else built on relational databases.
As another example of this euphoria, I received a phone call from an IT manager at a law firm who was googling and learning about big data technologies in order to analyze data stored in a single spreadsheet. Really? Apparently we need big data to analyze structured data with fewer than one million records and a hundred attributes.
Upper management in many firms hears the words “big data” and knocks on the doors of IT and other groups with the question, “Are we doing something in big data? If not, we have to.” Since when are business decisions made by seeing which way the wind is blowing?
It is clear that we no longer use the term “data”; instead, a new single word, “bigdata,” should be added to the dictionary. All existing relational databases and EDWs, together with true big data solutions such as Hadoop, should be included in this term. On the other hand, the truth is that it does not matter what color the facade is, because the inside of the house drives most of the value of a property. The business value of big data (in the original meaning of the two words, not “bigdata”), as reflected by production deployments, clearly indicates that the big data house has a huge, artistic facade but is pretty empty inside. The interior decoration has barely started: a few chairs and tables have been added to some rooms.
EDWs based on relational databases have a long history and a bright future ahead. They will remain the tool of choice for handling transactional data and for providing a single version of the truth for an entire organization, with strong governance and security. They should be augmented with new technologies built for unstructured data (which should really be called multi-structured data). In most business cases, big data is used to integrate various data sources, which yields data that is both big and multi-structured. A typical use case is the 360-degree view of customers.
The journey to artistic paintings on the walls is a long one, and it can begin only after all of the chairs and tables are in place. We can call the house “big data,” or use “bigdata” as a synonym for data, in which case it is a mansion; all that matters is what is inside the house.