Big Data and its analysis techniques are at the center of modern science and business. Every day, millions of transactions, emails, photos, video files, posts and search queries are generated, resulting in terabytes of data (see figure 1). All that data is stored in databases in various places across the planet.
All that data potentially contains a wealth of information. By analyzing the data that is generated every day, governments, researchers and companies might discover knowledge that they can use to their benefit. For governments, this might mean preventing tax fraud or advancing the economic interests of the nation. For researchers, discoveries in data might help to develop new medicines. And for companies, data analysis might help determine the best location to open a new store in order to obtain a competitive advantage. Although these are different examples, and the value of the information differs for every type of organization, the process of extracting insights from data is very similar.
Extracting valuable knowledge out of massive quantities of data is, however, more difficult than it sounds. Due to the sheer volume of data that is generated every day, databases grow massively, and it becomes difficult to capture, form, store, manage, share, analyze and visualize meaningful insights out of the data. For that reason, knowledge about how to deduce valuable information out of large sets of data has become an area of great interest. This domain of knowledge is collectively described as “Big Data.”
Although the importance of Big Data has been recognized over the last decade, people and organizations still hold different opinions on its definition. In general, the term Big Data is used when data sets cannot be managed or processed by traditional commercial software (and its underlying IT infrastructure) within a tolerable amount of time. The domain of Big Data is, however, more all-encompassing than the speed of data transfer or its required technology, tools or processes. Over time, Big Data has gradually evolved into an entire domain of study that interfaces with data science, machine learning and artificial intelligence.
Although there are many good definitions of Big Data, the Enterprise Big Data Framework utilises a definition that focuses on Big Data as a knowledge domain. In all the knowledge base articles you will find on our website, we will therefore adhere to the following definition of Big Data:
Big Data is the knowledge domain that explores the techniques, skills and technology to deduce valuable insights out of massive quantities of data.
The objective of the Enterprise Big Data Framework is to discuss these techniques, skills and technologies in a structured approach. With the Enterprise Big Data Framework, we aim to equip every reader with the knowledge and skills to deduce valuable insights out of massive quantities of data. These skills will empower you to obtain fact-based, data-driven information in order to support your future decisions. To accomplish this goal, our materials will introduce some fundamental concepts and terminology about data, data structures and the characteristics of Big Data.
In subsequent articles, we will introduce the Enterprise Big Data Framework, a holistic model of six capabilities to increase Big Data proficiency in enterprises. Each of the remaining sections will subsequently build upon the capabilities of the Enterprise Big Data Framework.