May 2, 2013
I’ve been observing a rather distasteful trend in big data for the enterprise market over the past 18 months, and it has reached the point where I want to share some thoughts despite a growing mountain of other priorities.
As the big data hype grew over the past few years, much of it enabled by Hadoop and other FOSS stacks, internal and external teams serving large companies have perfected a ‘sweet spot’ model that is tailored to the environment and tech stack. Many vendors have also tailored their offerings to the model, backed by arguably too much funding from VCs, dozens too many analyst reports, and a half-dozen too many CIO publications attempting to extend reach and increase ad spend.
The ‘sweet spot’ goes something like this:
- Small teams consisting of data scientists and business analysts.
- Employing exclusively low-cost IT commodities and free and open source software (FOSS).
- Targeting $1 to $5 million projects within units of mid- to large-sized enterprises.
- Expensed in the current budget cycle (opex), with low risk and a high probability of quick, demonstrable ROI.
So what could possibly be wrong with this dessert? It looks perfect, right? Well, not necessarily, at least for those of us who have viewed a similar movie many times previously, with a similar foreshadowing, plot, cast of characters, and story line.
This service model and related technology have generally been a good thing, resulting in new efficiency, improved manufacturing processes, reduced inventories, and perhaps even a few saved lives, not to mention a lot of demand for data centers, cloud service providers and consultants. Still, we need to look at the bigger picture in the context of historical trends in tech adoption, and how such trends evolve, to see where this trail will lead. Highly paid data scientists, for example, may find that they have been frantically jumping from one project to the next inside a large bubble with a thin lining, rising high over an ecosystem with no safety net, and then suddenly find themselves a target of flying arrows from the very CFO who has been their client, and for good fundamental reasons.
As we’ve seen many times before at the confluence of technology and services, the beginning of an over-hyped trend creates demand for high-end talent that is often unsustainable even in the mid-term. Everyone from the largest vendors to emerging companies like Kyield to leading consulting firms and many independents is in general agreement that while service talent in big data analytics (and closely related fields) is capturing up to 90% of the budget in this ‘sweet spot’ model today, the trend is expected to reverse quickly as automated systems and tools mature. The reaction to such trends is often an attempt to create silos of various sorts, but even for those in global incumbents or in models protected by unions and laws, like K-12 in the U.S., it’s probably wise to seek a more sustainable model and ecosystem tailored for the future. Otherwise, I fear a great many talented people working with data will find in hindsight that they have been exploited for very short-term gain in a model that no longer has demand, and may well find themselves competing with a global flood of bubble chasers willing to work for less than the cost of living in their location allows.
What everyone should realize about the big data trend
While there will likely be strong demand for the very best mathematicians, data modelers, and statisticians far beyond the horizon, the supermajority of organizations today are at some point in the journey of developing mid- to long-term strategic plans for optimizing advanced analytics, including investments not just in small projects, but across the entire organization. This is not occurring in a vacuum, but rather in conjunction with consultants, vendors, labs and emerging companies like ours that intentionally provide a platform that automates many of the redundant processes, enables plug and play, and makes advanced analytics available to the workforce in a simple-to-use, substantially automated manner. While it took many years of R&D for all of the pieces to come together, the day has come when physics allows such systems to be deployed, and so this trend is inevitable and indeed underway.
The current and future environment is not like the past, when achieving a PhD in one decade would reliably provide job demand in the next. That now holds only if, like everyone else in society, one can continue to grow, evolve and find ways to add value in a hyper-competitive world. The challenges we face collectively in the future are great, so we cannot afford apathy or widespread cultures that protect the unsustainable past; we can afford only those attempting to optimize the future.
In our system design, we embrace the independent app, data scientist, and algorithm, and recommend to customers that they do so as well. There is no substitute for individual creativity, and we simply must have systems that reflect this reality. But it needs to occur in a rationally planned manner that attacks the biggest problems facing organizations and, more broadly, society and the global economy.
The majority seem frankly misguided about the direction in which we are headed: the combination of data physics, hardware, organizational dynamics and economics requires us to automate much of this process in order to prevent the most dangerous systemic crises and to optimize discovery. It’s the right thing to do. I strongly recommend that everyone in related fields plan accordingly, as these types of changes are occurring in half the time they took a generation ago, and the pace of change is still accelerating. At the end of the day, the job of analytics is to unleash the potential of customers and clients so that they can then create a sustainable economy and ecology.