Is Hadoop ‘In Production’ Or Not? – Forbes Blog by John Webster

By John Webster, Friday, October 16th 2015

Categories: Analyst Blogs

Tags: Hadoop

“In production” is a phrase often used by enterprise data center administrators to distinguish applications and systems that support an organization’s day-to-day operations from those in some other status, such as in test. Reaching production status is now increasingly important for the leading commercial Hadoop distribution vendors (Cloudera, Hortonworks, MapR) and others who seek opportunity in the Hadoop ecosystem. The reason for sensitivity to this issue is simple. If an enterprise can’t transform its interest in Hadoop from shiny new toy to at least mission-sensitive, if not mission-critical, then why bother playing with it any longer?

The importance of this dividing line can be sensed when one reviews some recently reported Hadoop user surveys. Google “Hadoop User Survey” and you’ll get conflicting impressions of Hadoop usage, ranging from optimism to pessimism. One, conducted by AtScale, a Hadoop services organization, and published by siliconANGLE, indicates that Hadoop users are presently at a 50/50 “tipping point” between seeing and not seeing tangible value in the platform and, in glass-half-full fashion, states that Hadoop adoption is “growing and accelerating.” But another, gloomier one from Gartner states that future demand for Hadoop among its 284 Gartner Research Circle members looks “fairly anemic over the next 24 months.” Both surveys were conducted during the first half of this year.

I got a sense of this divide directly at the recent Strata + Hadoop World conference. The Hadoop distribution vendors I spoke to generally reported strong growth, as evidenced by an increasing number of Hadoop deployments that were now “in production.” This included a growing list of applications that Hadoop was supporting within existing customers. However, during a lunch table conversation, a user from the credit card services industry told me that, while his organization had multiple Hadoop clusters, not one was “in production.” Moreover, he said that he had yet to find other Hadoop users at the conference who were (again) “in production.” The problem for him was lack of consistency: running the same query multiple times against the same data set produced different results. As such, he couldn’t recommend production status even though his organization was running applications on (and likely paying for) at least one of the commercial Hadoop distributions.

In the case of Hadoop, the phrase “in production” means different things to different users as well as vendors. Hadoop has overcome, and is still overcoming, hurdles to adoption by enterprise users. Early on, the fact that Hadoop’s NameNode represented a single point of failure was a known problem that webscale users could deal with. Not so for others, in financial services and telecommunications for example, who needed to see a fix before proceeding further. Other criteria for moving Hadoop from test bed to production status include:

  • Readiness to support business-critical applications and their dependent applications
  • Data consistency and integrity
  • Performance and efficiency at scale
  • Conformance to security standards and regulatory requirements
  • Availability of qualified IT staff to manage the platform over the foreseeable future

Crossing the chasm between sandbox and production is becoming increasingly critical for Hadoop ecosystem vendors. It is generally understood that there are two types of Hadoop users: power users with multiple clusters and applications, and experimenters with single 10-20 node clusters. For vendors, significant growth will come from transforming experimenters into power users. Here I’m saying that getting the Hadoop experimenters to what they consider to be “in production” is key to crossing that chasm.

Click here to read the blog post on Forbes.Com
