Friday, July 27, 2012

Big Data: Approaches, Myths & Skills


Last month, my 18-year-old daughter asked me about Big Data. This is my first sure sign that a technology has reached a fever pitch in the hype cycle. Ironically, I found that as I explained this enterprise IT topic to my daughter, our conversation and the questions she asked did not vary greatly from many conversations I’ve had with other CEOs, journalists, financial analysts and industry colleagues. Despite how widely Big Data is being covered these days, it appears to me that Big Data is a big mystery to many.


Trying not to be labeled a cynic, I have three big worries about Big Data:

1. My biggest worry is the low percentage of Big Data projects that will succeed as we too quickly throw these new technologies at a wide variety of prospective projects in the enterprise,
2. The low success rate of Big Data projects will be amplified by the current hype and subsequent misconceptions about Big Data technologies, and
3. This low project success rate could persist over time because of the relative dearth of knowledgeable, data-savvy technology and business professionals ready for a world where data are plentiful and analytic skills are not.

Successful Big Data Projects

As organizations race to evaluate and pilot Big Data tools and technologies, in search of an answer to a Big Data opportunity, I’ve seen evidence that architectural steps are being skipped in favor of speed. Sometimes, speed is good. In the case of Big Data, however, building the right data and platform architecture is critical to actually solving the business problem, which means the right amount of thoughtful planning should occur in advance. Many missteps could be avoided by simply being clear up front on the business problem (or opportunity) to be solved and how quickly the data must be used to enable a solution (i.e., how much latency is acceptable?).

Recently, I’ve tried to do my part to help explain successful Big Data (technical) architectures by starting with three simple, latency-driven approaches.  The specifics, including an architectural diagram, are described in my recent E-Commerce Times article, entitled “Match the Big Data Job to the Big Data Solution.” We’ve also posted additional graphics and explanation to the Big Data section of the Jaspersoft website.

Big Data Misconceptions (or Myths)
To reduce the hype, first we must overcome the misconceptions. My many conversations on the topic of Big Data yield equally many misconceptions and misunderstandings. Some examples of the most common myths: Big Data is all unstructured, Big Data means Hadoop, and Big Data is just for sentiment analysis. Of course, each of these myths is only partially true and requires a deeper understanding of the technologies and their potential uses to gain real clarity.


I’ve recently offered a brief article, published last month on Mashable, that seeks to dispel the “Top 5 Myths About Big Data.” The article has garnered some great comments, with the most complete written by IBM’s James Kobielus. James improves and amplifies several of my major points. I hope you’ll join the conversation.

Analytic Skills Shortage
Worldwide digital content will grow 48% in 2012 (according to IDC), reaching 2.7 zettabytes by the end of the year. As a result, big data expertise is fast becoming the “must-have” expertise in every organization. At the same time, in its 2011 research report, titled “Big data: The Next Frontier for Innovation, Competition, and Productivity,” McKinsey offered the following grim statistic:

“By 2018, the United States alone could face a shortage of 140,000 to 190,000 people with deep analytical skills as well as 1.5 million managers and analysts with the know-how to use the analysis of big data to make effective decisions.”

Without the solid analytic skills needed to support a growing array of Big Data projects, the risk of failure grows rapidly. Anyone in or near data science should take the coming skills shortage as a call to arms. Every college and university should be building data analytics coursework into compulsory classes across a wide variety of disciplines and subject areas. Because of its importance, I’ll save this Big Data skills topic as the thesis for a future post.

Despite these primary worries, I remain hopeful about (even energized by) the enormous Big Data opportunity ahead of us. My hope is that, armed with good information and good technology, more Big Data customers and projects will succeed more quickly.

Brian Gentile
Chief Executive Officer
Jaspersoft Corporation