Big Data Analytics - Techniques and Trends (continued)
Welcome back! We continue with a few more techniques and trends for analyzing
Big Data. Our aim is not to make you an expert in all of them, but to plant a
seed of curiosity and, along the way, touch on the most prevalent concepts.
Here are a couple more widely used techniques for tapping the potential of Big Data:
Sentiment Analysis: A technique to identify and extract
subjective information from source text material. Key aspects of these analyses
include identifying the feature, aspect, or product about which a sentiment is
being expressed, and determining the type, “polarity” (i.e., positive,
negative, or neutral) and the degree and strength of the sentiment. Examples of
applications include companies applying sentiment analysis to analyze social
media (e.g., blogs, microblogs, and social networks) to determine how different
customer segments and stakeholders are reacting to their products and actions.
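To make the ideas of aspect, polarity, and strength concrete, here is a minimal sketch of a lexicon-based scorer. The word list, its weights, and the function names are made up for illustration; real sentiment analysis tools use far larger lexicons or trained models and handle negation, sarcasm, and context.

```python
import re

# Hypothetical mini-lexicon: word -> sentiment weight (illustrative values only).
LEXICON = {"love": 2, "great": 2, "good": 1, "slow": -1, "bad": -1, "terrible": -2}

def score_sentiment(text):
    """Return (polarity, strength) for a piece of text."""
    words = re.findall(r"[a-z']+", text.lower())
    total = sum(LEXICON.get(word, 0) for word in words)
    if total > 0:
        polarity = "positive"
    elif total < 0:
        polarity = "negative"
    else:
        polarity = "neutral"
    return polarity, abs(total)

def sentiment_by_aspect(posts, aspects):
    """Collect the sentiment of every post that mentions a given aspect."""
    results = {aspect: [] for aspect in aspects}
    for post in posts:
        lowered = post.lower()
        for aspect in aspects:
            if aspect in lowered:
                results[aspect].append(score_sentiment(post))
    return results

# Made-up social media posts about a phone.
posts = [
    "I love the camera, great pictures!",
    "The battery is terrible and charging is slow.",
    "Camera arrived today.",
]
summary = sentiment_by_aspect(posts, ["camera", "battery"])
```

Here `summary` maps each aspect ("camera", "battery") to a list of polarity and strength pairs, which is roughly the kind of per-feature view a company would want across customer segments.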
Predictive Analysis: A set of techniques in
which a mathematical model is created or chosen to best predict the probability
of an outcome. It extracts information from data and uses it to predict future
trends and behavior patterns. At its core, predictive analytics captures the
relationships between explanatory variables and predicted variables in past
occurrences and exploits them to predict future outcomes. An example of an
application in customer relationship management is the use of predictive models
to estimate the likelihood that a customer will “churn” (i.e., change providers)
or the likelihood that a customer can be cross-sold another product. It is often
used in conjunction with data analysis techniques described earlier, such as
data mining. The following video is a short and sweet illustration by a
predictive analytics company: http://goo.gl/9k0sP
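As a rough sketch of the churn example, the code below fits a tiny logistic regression model to past customer records and then estimates the churn probability for a new customer. The data, the two explanatory variables (support calls and years as a customer), and the function names are all invented for illustration; logistic regression is just one of many possible model choices.

```python
import math

def sigmoid(z):
    """Squash any real number into a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic(rows, labels, lr=0.1, epochs=2000):
    """Fit weights with plain stochastic gradient descent on the log loss."""
    weights = [0.0] * len(rows[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(rows, labels):
            p = sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))
            error = p - y  # gradient of the log loss with respect to z
            bias -= lr * error
            weights = [w - lr * error * xi for w, xi in zip(weights, x)]
    return weights, bias

def churn_probability(x, weights, bias):
    """Estimated likelihood that a customer with features x will churn."""
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))

# Made-up history: (support calls last quarter, years as customer) -> churned?
history = [(5, 1), (4, 1), (6, 2), (0, 5), (1, 4), (0, 6), (3, 1), (1, 5)]
churned = [1, 1, 1, 0, 0, 0, 1, 0]

weights, bias = train_logistic(history, churned)
risky = churn_probability((5, 1), weights, bias)  # many calls, new customer
loyal = churn_probability((0, 6), weights, bias)  # no calls, long-time customer
```

The model learns from past occurrences that frequent support calls go with churn, so `risky` comes out well above `loyal`. With only eight invented records this is a toy, not a usable model.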
Now, as promised, let us look at some buzzwords in Big Data analytics. A growing
number of technologies are used to aggregate, manipulate, manage, and analyze
Big Data. Most of them are built on a distributed computing platform, that is:
- Massively parallel computing in which a problem
is divided into multiple tasks, each of which is solved by one or more
computers working in parallel.
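The divide-and-solve-in-parallel idea can be sketched on a single machine. In this illustration, worker threads stand in for the separate computers of a real cluster, and the problem (summing squares) and the function names are arbitrary choices made for the example.

```python
from concurrent.futures import ThreadPoolExecutor

def split_into_tasks(data, n_tasks):
    """Divide the problem into roughly equal chunks, one per task."""
    size = -(-len(data) // n_tasks)  # ceiling division
    return [data[i:i + size] for i in range(0, len(data), size)]

def solve_task(chunk):
    """Solve one piece of the problem independently of the others."""
    return sum(x * x for x in chunk)

data = list(range(1, 1001))
tasks = split_into_tasks(data, 4)

# Each worker thread plays the role of one computer in the cluster.
with ThreadPoolExecutor(max_workers=4) as pool:
    partial_results = list(pool.map(solve_task, tasks))

# Combine the partial answers into the final result.
total = sum(partial_results)
```

A real distributed platform also has to move data between machines and cope with failures, which this sketch leaves out entirely.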
Here are some trendy technologies:
MapReduce: A software framework
introduced by Google for processing huge data sets on certain kinds of problems
on a distributed system. This online presentation gives a simple introduction:
http://goo.gl/Qz5PP
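The classic way to picture the MapReduce programming model is a word count. The sketch below runs the map, shuffle, and reduce phases in a single Python process; the function names are our own and this is not Google's implementation, where each phase would be spread across many machines.

```python
from collections import defaultdict

def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one document."""
    return [(word, 1) for word in document.lower().split()]

def shuffle_phase(pairs):
    """Shuffle: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single result."""
    return key, sum(values)

documents = ["big data is big", "data about data"]

mapped = [pair for doc in documents for pair in map_phase(doc)]
grouped = shuffle_phase(mapped)
word_counts = dict(reduce_phase(key, values) for key, values in grouped.items())
```

Because each document can be mapped on its own and each word can be reduced on its own, both phases are easy to spread over many machines.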
Mashup: An application that uses and combines data, presentation, or functionality from two or more sources to create new services. These applications are often made available on the Web, and frequently use data accessed through open application programming interfaces (APIs) or from open data sources.
Hadoop: An open source (free)
software framework for processing huge data sets on certain kinds of problems on
a distributed system. Its development was inspired by Google’s MapReduce and
Google File System. It grew out of the Apache Nutch project, was developed
largely at Yahoo!, and is now managed as a project of the Apache Software
Foundation.
The scope of these technologies is vast and hard to cover in a single post, but
we have tried to familiarize you with the basic concepts. Do let us know your
views. See you soon!
References:
McKinsey report: http://goo.gl/ycvef
TDWI library reports: www.Tdwi.org
Wikipedia