Big Data, Hadoop & Cloud: Tackling a Chain of Emerging Challenges

Chandra Ambadipudi
February 5, 2018 Big Data, Cloud & DevOps
Data has often been heralded as the new “oil”, a commodity more precious than any natural resource in today’s digital economy. To be fair, comparing oil and data is not an apples-to-apples comparison. Data can “drive” an autonomous car, but you can’t fill the gas tank with ones and zeros. In line with the analogy, however, data has gone through phases similar to those of oil exploration and drilling.
First came the data “land grab” phase. About 10 years ago at the advent of big data hype, companies scrambled to ensure they didn’t miss out. Then came the delineation phase, where the industry more tightly defined big data boundaries and applications. We’re now in an efficiency phase. Just like with oil drilling, extracting maximum value from data is all about combining the right expertise with the right technology.
For all of big data’s promises, many challenges came to light during the delineation phase and continue today as companies implement big data projects. According to Gartner, many organizations that have invested in big data projects remain stuck in the pilot stage. So what are the main challenges causing these stalls?

Big Data Challenges

Traditional big data storage and analysis systems have buckled under the weight of large volumes of unstructured data. Due to cost and scalability issues, companies have shifted to more agile, cost-efficient open source solutions like Apache Hadoop and Spark, as well as Lumify, MongoDB, Elasticsearch, and many others. Navigating the sea of big data tools is its own challenge, but let’s focus on Hadoop, a solution at the center of the big data transformation.
For all of the difficulties many companies have experienced in their Hadoop journey over the years, Hadoop has now become mainstream, with significant ROI demonstrated across industries. Financial services and healthcare companies are augmenting, and in some cases completely replacing, traditional BI/DW-based data management systems with large-scale Hadoop deployments.
While Hadoop does solve many data problems, it has opened up new challenges too. Hadoop’s potential is real, but the hard truth is that Hadoop implementation and management (especially on premises) is difficult and can end up causing more problems than it solves. Hadoop’s learning curve, and the level of expertise required for each industry and use case, can challenge a company’s internal data professionals and strain available IT resources.
Additionally, scaling Hadoop on premises can be a challenge, requiring more investment in physical infrastructure, which many companies don’t have the resources for. This is why many enterprises are moving to cloud-based Hadoop solutions, including private, public, and hybrid cloud deployments.

Cloud Migration Challenges

Cloud-based Hadoop solutions allow companies to scale in a more agile fashion as their data needs increase. This can solve the problem of having to add more on-prem infrastructure over time, but as with any solution, migrating big data analytics to a cloud infrastructure begets its own set of challenges.
These challenges largely revolve around ensuring performance, reliability, accessibility, and scalability of data. With big data and cloud implementations, there is also the elephant in the room: data security. This concern has been put in the spotlight by numerous recent high-profile enterprise data breaches and an ever-growing list of regulations and standards such as HIPAA, PCI DSS, FERPA, and GDPR.
So, what can enterprises do to tackle some of these ongoing challenges?

1. Think 5-10 years out

Enterprises adopting Hadoop (or any other big data tool) need to think about what’s coming next. This is especially true for companies building their own big data platforms. Flexibility and scalability are essential, as emerging technologies like autonomous systems, AI, virtual reality, and IoT will generate new kinds of data faster than ever before. Big data solutions should be as future-proofed as possible: the last thing an enterprise wants is to implement new big data infrastructure and tools only to have to do the same thing again in a couple of years.

2. Hire or partner with the right experts

With oil exploration and drilling, the ability to maximize land investment and rig efficiency comes from combining up-to-date technology with specific expertise in the local geology. In the same way, big data projects involve more than just implementing a Hadoop solution. As mentioned earlier, Hadoop implementation and management can be difficult, and having access to experts who understand an individual enterprise’s specific “geology” is key to success. This means making sure you have the right internal talent, or partnering with the right experts.

3. Take a data-centric approach to security

Enterprises need to remember that when migrating big data to cloud infrastructure, whether public, private, or hybrid, the onus to secure data is theirs, not the cloud provider’s. No matter what perimeter security measures are taken, data stored in a cloud environment is especially susceptible to breach. Enterprises need to think beyond perimeter security: identify sensitive data, both structured and unstructured, secure it as it is ingested into Hadoop or the data lake, and continuously monitor cloud data sources for violations.
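To make the idea of securing data at ingestion a little more concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions, not a production approach: the two regex patterns, the `protect_record` function, and the `demo-salt` value are all hypothetical names invented for this example, and real sensitive-data discovery needs far broader coverage than two patterns.

```python
import hashlib
import re
from typing import Any, Dict, List, Tuple

# Hypothetical patterns, for illustration only.
SENSITIVE_PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
}


def mask_value(match: "re.Match", salt: str) -> str:
    """Replace a sensitive value with a salted, truncated SHA-256 token."""
    digest = hashlib.sha256((salt + match.group(0)).encode("utf-8")).hexdigest()
    return f"<masked:{digest[:12]}>"


def protect_record(
    record: Dict[str, Any], salt: str = "demo-salt"
) -> Tuple[Dict[str, Any], List[Tuple[str, str]]]:
    """Return a masked copy of the record and a list of (field, kind) findings.

    The findings list is what a monitoring job could log or alert on.
    """
    protected: Dict[str, Any] = {}
    findings: List[Tuple[str, str]] = []
    for field, value in record.items():
        if isinstance(value, str):
            for kind, pattern in SENSITIVE_PATTERNS.items():
                if pattern.search(value):
                    findings.append((field, kind))
                    value = pattern.sub(lambda m: mask_value(m, salt), value)
        protected[field] = value
    return protected, findings


if __name__ == "__main__":
    raw = {"note": "SSN 123-45-6789, mail jane@example.com", "age": 41}
    masked, found = protect_record(raw)
    print(masked)
    print(found)
```

The design choice here is that masking happens before the record is written anywhere, so the data store never holds the raw value, while the findings list gives a monitoring process something to track over time.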

Final thoughts

While big data has dropped off the hype cycle, it’s not going away; in fact it will only get “bigger”. Likewise, the Hadoop ecosystem has matured significantly and will continue to do so with all of the big distributions offering data science capabilities. AI is the next frontier for companies with an existing large Hadoop footprint. The desire amongst enterprises to migrate big data to the cloud will continue to increase with managed services gaining momentum. There are many challenges that come with all of these things – but with the right strategy and foresight, enterprises can truly maximize big data’s value.