The problem with data science job postings

Jeremie Harris
July 18, 2019 Big Data, Cloud & DevOps

Every once in a while, you notice something that you realize you probably should have noticed a long time ago. You start to see it everywhere. You wonder why more people aren’t talking about it.

For me, “every once in a while” was yesterday, when I was scrolling through the #jobs channel, and the “something” is a big problem in the data science industry that I really don’t think we’re taking seriously enough: the vast majority of data science job descriptions do not convey the actual requirements of the position they’re advertising.

In other words, when senior data scientists are called upon to recruit “for real”, their first move is often to throw away the job posting altogether.

This is not good, for several reasons. First, a misleading job description means that recruiters get a *ton* of irrelevant applications, and that candidates waste a *ton* of time applying to irrelevant positions. But there’s another problem: job descriptions are the training labels that any good aspiring data scientist will use to prioritize their personal and technical skills development.

Despite the obvious downsides of these mangled job postings, companies keep putting them out there, so a very natural question to ask is: why? Why are job postings so confusing (in that they fail to clearly specify the skills they expect from a candidate), or so outrageously over-reaching (“looking for a machine learning engineer with 10 years’ experience in deep learning…”)?

There are many reasons. For one, companies make hiring decisions based on a candidate’s (perceived) ability to solve a real problem that they actually have. Because there are many ways to solve any given data science problem, it can be hard to narrow down the job description to a specific set of technical skills or libraries. That’s why it usually makes sense to put in an application for a company if you think you can solve the problems they have, even if you don’t know the specific tools they ask for.

Another possible reason is that many companies don’t actually know what they want — especially companies with relatively new data science teams — either because the early stage of their data science effort forces everyone to be a jack of all trades, or because they lack the expertise they need to even know what problems they have, and who can help solve them. If you come across an oddly non-specific posting, it’s worth taking the time to figure out which bucket it belongs to, since the former can be a great experience builder, whereas the latter can be a recipe for disaster.

But perhaps the most important reason is that job postings are often written by recruiters who are not remotely technical. The unfortunate side effect is the occasional incoherent ask (“Must have 10+ years’ experience with deep learning…”, “…including natural language toolkits, such as OpenCV…”), or an ask that no human being could possibly satisfy.

The net result of this job qualifications circus is that I regularly get questions from our mentees about whether they’re qualified for an opening, despite their having read all the information available on the internet about that position. Those questions are actually surprisingly consistent — so much so that I think it’s worth listing the answers to the most common ones here, in the form of simple rules you can follow to make sure you’re applying to the right roles (and not being scared away by fake requirements):

  • If a company asks for more than 6 years of deep learning experience, then their posting was written by someone who has zero technical knowledge (AlexNet came out in 2012, so this basically narrows the field down to Geoff Hinton’s entourage). Unless you want to build a data science team from the ground up (which you shouldn’t if you’re new to the field), this should be a big red flag.
  • If you have no prior experience, don’t bother applying to jobs that ask for more than 2 years of it.
  • When they say “or equivalent experience”, they mean, “or about 1.5X that much experience working in an MSc or a PhD where you worked on something at least related to this”.
  • If you meet 50% of the requirements, that might be enough. If you meet 70%, you’re good to go. If you meet 100%, there’s a good chance you’re overqualified.
  • Companies *usually* care less about the languages you know than the problems you can solve. If they say PyTorch and you only know TensorFlow, you’re probably going to be OK (unless they stress the PyTorch part explicitly).
  • Don’t ignore lines like, “you should be detail-oriented and goal-driven, and thrive under pressure”. They sound like generic, cookie-cutter statements — and sometimes they are — but they’re usually written in a genuine attempt to tell you what kind of environment you’ll be getting yourself into. At the very least, you should use these as hints about what aspects of your personality you should emphasize to establish rapport with your interviewers.
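
The numeric rules above can be roughly sketched in Python. This is only an illustration under stated assumptions: the `Posting` and `Candidate` classes, their field names, the idea of adding converted academic years to industry years, and the "likely a stretch" label for matches under 50% are all hypothetical choices made for the example, not something the rules themselves specify.

```python
from dataclasses import dataclass
from typing import List, Set, Tuple

# Thresholds taken from the rules of thumb above.
DEEP_LEARNING_YEARS_RED_FLAG = 6   # asking for more than this is a red flag
MAX_ASK_WITH_NO_EXPERIENCE = 2     # with zero experience, skip asks above this
ACADEMIC_MULTIPLIER = 1.5          # ~1.5 academic years per "equivalent" year


@dataclass
class Posting:  # hypothetical structure for a job posting
    required_years: float
    requirements: Set[str]
    deep_learning_years: float = 0.0


@dataclass
class Candidate:  # hypothetical structure for an applicant
    industry_years: float
    academic_years: float  # MSc/PhD years on something related to the role
    skills: Set[str]


def equivalent_years(candidate: Candidate) -> float:
    """Convert academic years at the 1.5X rate and add them to industry
    years (summing the two is an assumption of this sketch)."""
    return candidate.industry_years + candidate.academic_years / ACADEMIC_MULTIPLIER


def match_fraction(posting: Posting, candidate: Candidate) -> float:
    """Fraction of the listed requirements the candidate meets."""
    if not posting.requirements:
        return 1.0
    return len(posting.requirements & candidate.skills) / len(posting.requirements)


def assess(posting: Posting, candidate: Candidate) -> Tuple[str, List[str]]:
    """Return a verdict on the requirements match plus any warning flags."""
    flags = []
    if posting.deep_learning_years > DEEP_LEARNING_YEARS_RED_FLAG:
        flags.append("red flag: posting likely written by a non-technical author")
    if equivalent_years(candidate) == 0 and posting.required_years > MAX_ASK_WITH_NO_EXPERIENCE:
        flags.append("skip: no prior experience and the posting asks for more than 2 years")

    fraction = match_fraction(posting, candidate)
    if fraction >= 1.0:
        verdict = "possibly overqualified"
    elif fraction >= 0.7:
        verdict = "good to go"
    elif fraction >= 0.5:
        verdict = "might be enough"
    else:
        verdict = "likely a stretch"  # assumption: the rules give no guidance below 50%
    return verdict, flags


posting = Posting(
    required_years=3,
    requirements={"python", "sql", "pytorch", "statistics"},
    deep_learning_years=10,
)
candidate = Candidate(
    industry_years=0,
    academic_years=3,
    skills={"python", "sql", "statistics", "tensorflow"},
)
verdict, flags = assess(posting, candidate)
print(verdict, flags)
```

A strict set intersection like this is harsher than the rules suggest: it gives the TensorFlow user no credit for the PyTorch requirement, and it ignores the soft-skill lines entirely, so treat the score as a floor rather than a verdict.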

None of these rules are universally applicable, of course: the odd company will insist on hiring only candidates who meet all their stated requirements, and others will be particularly interested in people who know framework X, and will disregard people who can solve similar problems, but with different tools. But because there’s no way to know that from job descriptions alone (unless they’re explicit about it), your best bet is almost always to bet on yourself and throw your hat in the ring.
