What’s big for big data for 2018

David Mariani
February 15, 2019 Big Data, Cloud & DevOps

In his last book, “Thank You for Being Late,” Thomas Friedman highlights 2007 as one of the most pivotal years in the technology space: that year saw the birth of Hadoop, the iPhone and Amazon’s Kindle, to name a few.

I’m no Tom Friedman, but I’ve been in this industry for what seems like an eternity. I have lived through many tech transformation cycles and have heard endless predictions about what was supposed to “happen next.”

In my humble opinion, 2018 will unleash a major disruption for the analytics and data management space. It will upend decades’ worth of accepted practices and introduce new winners and losers. At the center of the storm is the public cloud and the many implications it will have for big data.

Prediction #1: The public Cloud becomes the new Data Lake.

Hadoop was the world’s first implementation of a data lake with its native “schema on read” architecture. A data lake is essentially a distributed file system with a catalog, giving users the flexibility to store data “as is” today and add a schema only when the data needs to be accessed.
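To make “schema on read” concrete, here is a minimal sketch in plain Python. It is not how Hadoop or any cloud service implements the idea; the records, the field names and the `read_with_schema` helper are all made up for illustration. The point is only that the raw data is stored untouched and each reader decides which fields and types to apply at query time.

```python
import json

# Raw records are stored "as is": nothing validates them at write time.
# (Sample data invented for this illustration.)
RAW_EVENTS = [
    '{"user": "ann", "amount": "19.99", "ts": "2018-01-05"}',
    '{"user": "bob", "amount": "5", "country": "US"}',
]

# A schema here is just field name -> converter, chosen by the reader.
SALES_SCHEMA = {"user": str, "amount": float}
GEO_SCHEMA = {"user": str, "country": str}


def read_with_schema(raw_lines, schema):
    """Parse each raw line and project it onto the reader's schema.

    Fields missing from a record come back as None; extra fields are ignored.
    """
    rows = []
    for line in raw_lines:
        record = json.loads(line)
        rows.append({
            field: (convert(record[field]) if field in record else None)
            for field, convert in schema.items()
        })
    return rows


# Two readers apply two different schemas to the same stored data.
sales = read_with_schema(RAW_EVENTS, SALES_SCHEMA)
geo = read_with_schema(RAW_EVENTS, GEO_SCHEMA)
total = sum(row["amount"] for row in sales)
```

The same two stored lines serve a sales view and a geography view without being rewritten, which is the flexibility the data lake trades against the up-front modeling of a warehouse.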

Cloud-based distributed storage services like Amazon S3, Microsoft ADLS and Google Cloud Storage have similar capabilities and will serve as an alternative data lake platform in the future. Ovum’s latest global survey of big data workloads showed that 27.5% of them are already deployed in the cloud. The industry’s latest Big Data Maturity survey predicted that 72% of respondents will do big data analytics in the cloud in the next four years. If you’re not convinced of this trend, look again!

Why this matters: Enterprises are looking to the cloud to offload infrastructure management, and they already land their data in their cloud provider’s data store. Now, with engines like Spark (backed by Databricks, which recently announced $140M in funding) and Presto able to query data in situ, cloud customers can harness the power of the data lake without the overhead and cost of managing a Hadoop cluster. I believe this will become a huge trend that will upend the notion of a data warehouse and bring schema on read to the masses.

Prediction #2: “Insight as a Service” and the Office of the CDO become the norm for the enterprise.

I am seeing more and more enterprises struggle to deal with the unintended consequences of the self-service BI revolution. According to Gartner, the self-service business intelligence space grew by 60 percent in 2015 but tapered off shortly thereafter. Why? Because when business users take on more data management tasks, enterprises notice that their employees spend too much time “data-wrangling” and not enough time analyzing data to drive revenue and lower costs.

Meanwhile, IT has been struggling to govern, secure and deliver the quality data the business needs. In response, an increasing number of enterprises are establishing “centers of excellence” (CoE) to produce “insights as a service.” According to Forrester, the market for “Insights as a Service” will double in 2018 and 80 percent of firms will be relying on such capabilities. The architects of the CoE are the chief data officer (CDO) and the chief analytics officer (CAO). In 2018, we expect enterprises to hire more of them and elevate them to the C-suite.

Why this matters: Self-service business intelligence is here to stay, no doubt. However, I predict that we will see self-service BI evolve from a “free for all” into a governed data access model managed by a central data group and the CoE. This means that the role of the data engineer will move back to IT, and the business will focus on creating insights instead of data marts. This movement will require a whole new set of tools to facilitate data governance, and the semantic layer will again become king.

Prediction #3: Multi-cloud strategy fails as a strategy.

We’ve seen this movie before. The data platform vendors do everything in their power to lock customers into their proprietary ecosystems. So, why would the cloud be any different?

I’ve spoken to a number of large enterprise customers and they all have the intention of using the Cloud vendors as “dumb pipes.” I can’t tell you how many CIOs have told me that they will invest in more than one cloud vendor. The truth is that this runs counter to the plans of the cloud vendors who are pushing their proprietary tools by making them so tantalizingly easy to use. Once any department chooses to leverage a cloud vendor’s tools to deliver a new capability, it’s game over – you’re locked in.

Why this matters: The goal of a multi-cloud strategy is to minimize switching costs. It’s inevitable that your teams will deploy applications that leverage proprietary cloud technologies. The key is to “firewall,” or insulate, downstream applications and users from those technology choices. That means leveraging cloud-independent interfaces and data semantic layers so that if you choose to switch cloud providers, you can minimize the amount of change required. Feel free to educate yourself on the concept of semantic layers, and don’t get locked in.
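One way to picture that “firewall” is a small provider-neutral interface that downstream code depends on, with one adapter per cloud behind it. The sketch below is only an illustration under stated assumptions: `ObjectStore`, `InMemoryStore`, `LocalDirStore` and `publish_report` are hypothetical names, and the two backends are local stand-ins rather than wrappers around any real vendor SDK.

```python
import tempfile
from abc import ABC, abstractmethod
from pathlib import Path


class ObjectStore(ABC):
    """Hypothetical provider-neutral interface; downstream code sees only this."""

    @abstractmethod
    def put(self, key: str, data: bytes) -> None:
        """Store data under key."""

    @abstractmethod
    def get(self, key: str) -> bytes:
        """Return the data stored under key."""


class InMemoryStore(ObjectStore):
    """Stand-in for one provider's adapter (no real SDK is called)."""

    def __init__(self) -> None:
        self._objects = {}

    def put(self, key: str, data: bytes) -> None:
        self._objects[key] = data

    def get(self, key: str) -> bytes:
        return self._objects[key]


class LocalDirStore(ObjectStore):
    """Stand-in for a second provider's adapter, backed by a local directory."""

    def __init__(self, root: str) -> None:
        self._root = Path(root)

    def put(self, key: str, data: bytes) -> None:
        path = self._root / key
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_bytes(data)

    def get(self, key: str) -> bytes:
        return (self._root / key).read_bytes()


def publish_report(store: ObjectStore, name: str, text: str) -> str:
    """Downstream code: written against ObjectStore, not against a vendor."""
    key = f"reports/{name}.txt"
    store.put(key, text.encode("utf-8"))
    return store.get(key).decode("utf-8")


# Switching "providers" changes one constructor call, not publish_report.
first = publish_report(InMemoryStore(), "q1", "revenue up")
second = publish_report(LocalDirStore(tempfile.mkdtemp()), "q1", "revenue up")
```

Teams can still use proprietary services inside an adapter; the switching cost stays contained because only the adapter, not every report or dashboard, would need to change.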
