Wednesday, March 21, 2012

Cloud BI Progress & Pitfalls

In my ongoing effort to uncover and discuss key BI industry trends, I recently wrote a new article for my TDWI column (called “The BI Revolution”), under the same headline as this post. In that article, I focused on the big market that will emerge for BI in the cloud. Even more importantly, I shed light on the definitional and technological pitfalls that are confusing this market as it seeks to deliver more efficient cloud-based business intelligence.

Rather than address my main points here, I encourage you to read my post at the TDWI website and then add your comments and thoughts here.

Cloud BI = BI for SaaS + BI for PaaS
I note that the cloud as a transformational infrastructure will drive heavy use of BI for SaaS (on-demand analytical applications) and BI for PaaS (application development and deployment in the cloud). I am less bullish on SaaS BI (on-demand, general-purpose BI in the cloud) because I believe growth will continue to be fueled by BI embedded in data-driven applications, rather than by BI delivered on a standalone basis.

We’re constantly tuning the Jaspersoft website on this topic, building out content that seeks to explain, educate and amplify the technological and business benefits of BI in the Cloud. One important point left out of my TDWI article is Jaspersoft’s focus on and success in BI for PaaS (platform-as-a-service).

Recently, Jaspersoft has been very active in BI for PaaS. We are working with all the major PaaS providers to ensure our BI platform is available within these new cloud-based development and deployment environments. Just last month, Jaspersoft announced an important partnership with Red Hat, making our BI server available immediately in the OpenShift (public cloud) and CloudForms (private cloud) environments. Then, Jaspersoft produced a blog post and video to highlight its support of VMware’s Cloud Foundry PaaS environment, with a more formal announcement pending. Overall, our head of Product & Alliances summed it up best:

“Jaspersoft’s intention is to be the de facto standard in BI for PaaS, enabling the broadest community of software developers to use our tools in their favorite cloud environment,” said Karl Van den Bergh, Vice President of Product & Alliances at Jaspersoft. “We are uniquely positioned to capitalize on this shift of application development to the cloud with our modern architecture, the world’s largest BI community building data-driven applications, and our open source model.”

Through my recent TDWI article and this post, my goal is to clarify the cloudy definitions around Cloud BI, call out the important pitfalls already witnessed, and point to the progress that gives reason for optimism about a bright Cloud BI future.

Brian Gentile
Chief Executive Officer

Thursday, March 1, 2012

Got Big Data?

If competing based on time and information really will drive the next major economic era, then Big Data is real and represents a huge opportunity. If you’re a business analyst or technologist responsible for mapping data to decisions, then the variety, velocity, and volume of data available to you today have never been richer. And your responsibility has never been greater.

I’ve previously discussed the different classes of data source technologies that can legitimately be used to harness (or tame) Big Data. Hadoop, the most popular software framework associated with this rising trend, is one of those technologies. Others include NoSQL databases, MPP data stores and even ETL/Data Integration approaches (for moving Big Data by the batch into some more usable format). Each of these technologies aligns with an appropriate use case, which helps make sense of the variety of products emerging in this world of Big Data.

For simplicity, I like to talk about three popular approaches to connecting to and making use of Big Data for business intelligence reporting and analysis.

Interactive Exploration – the most dynamic because it involves native connectivity directly from the BI tool to the Big Data source and can offer results in near-real-time. Hadoop HBase, Hadoop HDFS, and MongoDB are just three of the most popular data sources to which direct connection would be an advantage.

Direct Batch Reporting – an important and mainstream approach (especially in this early market of Big Data) that relies on tried-and-true SQL access to Big Data. Hadoop Hive is the best-known example, but Cassandra offers CQL access that delivers similar results and functionality.

Batch ETL – using extract, transform and load techniques to create a more usable subset of the Big Data is also popular, especially when the insight being sought is less urgent, probably on the order of hours or days after data capture. Nearly every ETL tool has now been improved to connect to and transform Big Data. Some even integrate nicely with underlying Hadoop technologies (like Pig), potentially making the data steward’s life simpler.
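For readers who think in code, the three approaches above can be summarized as a simple lookup. This is only an illustrative sketch: the groupings come straight from the descriptions above, while the names APPROACH_EXAMPLES and approaches_for are hypothetical and are not part of any Jaspersoft product or API.

```python
# Illustrative only: groups the technologies named in this post under the
# three approaches described above. APPROACH_EXAMPLES and approaches_for are
# hypothetical names, not a Jaspersoft API.
APPROACH_EXAMPLES = {
    "Interactive Exploration": ["Hadoop HBase", "Hadoop HDFS", "MongoDB"],
    "Direct Batch Reporting": ["Hadoop Hive", "Cassandra (CQL)"],
    "Batch ETL": ["ETL tools (some integrating with Pig)"],
}


def approaches_for(source):
    """Return the approaches whose example list mentions the given name."""
    needle = source.lower()
    return [
        approach
        for approach, examples in APPROACH_EXAMPLES.items()
        if any(needle in example.lower() for example in examples)
    ]


if __name__ == "__main__":
    for name in ("MongoDB", "Hive", "Hadoop"):
        print(name, "->", approaches_for(name))
```

A broad name such as "Hadoop" matches more than one approach, which reflects the point made above: the same underlying platform can be reached interactively or through batch-oriented SQL access, depending on how urgent the insight is.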

Sometime last year, it occurred to me that Jaspersoft is in a unique position with regard to Big Data. Because of Jaspersoft’s data-agnostic architecture, we’ve quickly offered a broad variety of native Big Data connectors, many of which have been available for more than one year (for free download). And because of our large, growing community of developers (we have more than 260,000 registered community members, growing at about 6,000 per month at the time of this writing), we have important data about Big Data. This realization led us to the Big Data Index.

Big Data Index

We’ve tracked the downloads of our Big Data connectors over the last year, charting the ups and downs with each, corresponding to the relative rise and fall of their popularity. Over this time, we’ve seen more than 15,000 downloads, so our view is pretty good. Here’s a static version of the latest data for the four most popular Big Data connector downloads:

Over the course of the past year, the Hadoop technologies (HBase & Hive combined) proved the most popular. The fastest-growing connector, and the leader at the moment, is MongoDB (from 10gen). Cassandra holds a solid and consistent fourth position (which should benefit DataStax, the commercial company behind Cassandra). Many other Big Data connectors are tracked as well, with a dynamic chart updated monthly.

As interest in Big Data grows, so will the potential uses for these technologies that are designed to map this data to decisions and insights. At the moment, I’m just content knowing I have a front-row seat via the Big Data Index.

We’re at the very beginning of this era, which will surely rely on more data than we could have fathomed just ten years ago. This is why your thoughts and comments on this topic are appreciated.

Brian Gentile
Chief Executive Officer