What actually is Big Data?
New technology and innovation often bring new terminology with them, and with Big Data this is exactly the case. But what does Big Data really mean? It appears that, so far, there is no standard definition of the term. A search reveals that various explanations have evolved over time.
■ In 2009, Adam Jacobs described Big Data as "data whose size forces us to look beyond the tried-and-true methods that are prevalent at that time" in his interesting article "The Pathologies of Big Data" (http://queue.acm.org/detail.cfm?id=1563874). Jacobs argues that getting data into databases is easy, but getting it out in a useful form is hard: the bottleneck lies in the analysis rather than in the raw data manipulation.
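To make Jacobs's point a little more concrete, here is a minimal Python sketch of the kind of shift he describes: rather than pulling an oversized table into memory and analyzing it afterwards, the aggregation is pushed into a single streaming pass over the data. The file name events.csv and its country/amount columns are purely hypothetical.

```python
import csv
from collections import defaultdict

# Hypothetical example: summarize a file too large to load at once.
# Instead of read-everything-then-analyze, we stream the rows and keep
# only a small running aggregate, so memory use stays roughly constant.
def totals_by_country(path):
    totals = defaultdict(float)
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            # Only one row is held in memory at a time.
            totals[row["country"]] += float(row["amount"])
    return totals

if __name__ == "__main__":
    for country, total in sorted(totals_by_country("events.csv").items()):
        print(country, total)
```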
■ In 2011, IBM, which has the "Big" already in its nickname "Big Blue", in turn focused on the three V's in its definition of Big Data:
- Volume – Big Data comes in one size: large. Enterprises are awash with data, easily amassing terabytes and even petabytes of information.
- Velocity – Often time-sensitive, Big Data must be used as it streams into the enterprise in order to maximize its value to the business.
- Variety – Big Data extends beyond structured data and includes unstructured data of all varieties: text, audio, video, click streams, log files and more. (http://www-01.ibm.com/software/data/bigdata/)
IBM is one of the pioneers in bringing Big Data analytics to its customers. I highly recommend taking a look at their eBook titled "Understanding Big Data". A toy sketch of the three V's in action follows below.
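The following Python sketch is only an illustration of the three V's, not IBM's actual tooling: records of varying shapes (Variety) are handled one at a time as they arrive (Velocity), so no attempt is made to hold the whole data set in memory (Volume). The sample records and field names are invented.

```python
import json
import time

def record_stream():
    """Simulates heterogeneous records arriving over time (Velocity)."""
    samples = [
        '{"type": "click", "page": "/home", "ts": 1}',           # structured
        'ERROR 2011-06-01 disk full on node-7',                  # raw log line
        '{"type": "tweet", "text": "big data is big", "ts": 2}'  # free text
    ]
    for raw in samples:
        yield raw
        time.sleep(0.1)  # data keeps flowing; we cannot wait for "all of it"

def handle(raw):
    """Dispatches on shape, since the data is not uniformly structured (Variety)."""
    try:
        rec = json.loads(raw)
        print("structured record:", rec.get("type"))
    except json.JSONDecodeError:
        print("unstructured log line:", raw[:40])

# Process each record as it arrives instead of batching everything (Volume):
for raw in record_stream():
    handle(raw)
```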
■ Recently, the McKinsey Global Institute, the research arm of McKinsey & Company, pointed out that no specific threshold can be set for the amount of data that qualifies as Big Data: "'Big Data' refers to data sets whose size is beyond the ability of typical database software tools to capture, store, manage, and analyze. This definition is intentionally subjective and incorporates a moving definition of how big a data set needs to be in order to be considered Big Data - i.e., we don't define Big Data in terms of being larger than a certain number of terabytes (thousands of gigabytes). We assume that, as technology advances over time, the size of data sets that qualify as Big Data will also increase. Also note that the definition can vary by sector, depending on what kinds of software tools are commonly available and what sizes of data sets are common in a particular industry. With those caveats, Big Data in many sectors today will range from a few dozen terabytes to multiple petabytes (thousands of terabytes)." The consultancy also provides insights into the financial opportunities associated with the topic; check out their report.
What do all these definitions have in common? They highlight that existing approaches to collecting, handling and analyzing data no longer help companies gain a competitive advantage. Instead, new approaches are needed that take the exponential speed of change into account. It seems that Big Data calls for
a) radical thinking, and
b) a willingness to deal with uncertainty.
We will investigate these points further and keep you posted!