The cold start problem: how to break into machine learning

Edouard Harris
February 26, 2019 AI & Machine Learning

It’s hard to get hired into your first machine learning role. That means I get asked lots of questions that look like: right now, I do X. I want to become a machine learning engineer. How do I do it?

Depending on what X is, that ranges from something that’s pretty hard to do (physics, software engineering) up to something that’s extremely hard to do (UI design, marketing).

So to help everyone at the same time, I’ve put together a progression that you can follow from any starting point to actually become a machine learning engineer. Every ML engineer I’ve met (who didn’t go to grad school at SAIL or somewhere similar) went through some form of this.

Before you start learning ML, there’s a set of basics you need first.

1. Learn calculus

The first thing you need is multivariable calculus (up to second-year undergrad).

Where to learn it: Khan Academy’s differential calculus course is pretty good. Make sure to do the practice problems. Otherwise you’ll just nod along with the course and won’t learn anything.

2. Learn linear algebra

The second thing you need is linear algebra (up to first-year undergrad).

Where to learn it: Rachel Thomas’s mini-course on computational linear algebra is targeted at people who want to learn ML.

NOTE: I’ve heard convincing arguments that you can skip calculus and linear algebra. A few people I know have jumped right into ML and learned most of what they needed by trial and error and intuition, and they turned out okay. Your mileage will vary, but whatever you do, don’t skip this next step:

3. Learn to code

The last thing you need is programming experience in Python. You can do ML in other languages, but these days Python is the gold standard.

Where to learn it: Follow the advice in the top answer of this Reddit thread. You should also pay close attention to the numpy and scipy packages. Those come up a lot.
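To give a sense of why numpy comes up so often, here is a minimal sketch that fits a straight line to a few points with least squares. The data and variable names are made up for illustration; the point is only that a few lines of numpy cover the linear algebra you would otherwise write by hand.

```python
import numpy as np

# Hypothetical toy data: y is exactly 2*x + 1.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.0, 3.0, 5.0, 7.0, 9.0])

# Design matrix: one column for x, one column of ones for the intercept.
A = np.column_stack([x, np.ones_like(x)])

# Solve the least-squares problem A @ [slope, intercept] ~= y.
(slope, intercept), *_ = np.linalg.lstsq(A, y, rcond=None)

print(slope, intercept)
```

scipy builds on the same arrays, so getting comfortable with numpy first tends to make scipy easier to pick up.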

There’s more to say about good programming practice than I have room for here. In one sentence: make your code legible and modular, with good tests and error handling.
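As a rough illustration of that one sentence, here is a small sketch: one short function that does one thing, rejects bad input with a clear error, and ships with a test. The function name normalize and the test are hypothetical examples, not taken from any particular library.

```python
def normalize(values):
    """Scale a list of numbers linearly so they span the range [0, 1]."""
    if not values:
        raise ValueError("values must not be empty")
    lo, hi = min(values), max(values)
    if lo == hi:
        # All values equal: scaling would divide by zero.
        raise ValueError("values must not all be equal")
    return [(v - lo) / (hi - lo) for v in values]


def test_normalize():
    # Normal case: smallest maps to 0, largest to 1.
    assert normalize([2, 4, 6]) == [0.0, 0.5, 1.0]
    # Error case: empty input should fail loudly, not silently.
    try:
        normalize([])
    except ValueError:
        pass
    else:
        raise AssertionError("expected ValueError for empty input")


test_normalize()
```

In a real project you would likely run tests like this with a test runner instead of calling them by hand, but the shape is the same.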

Pro tip: If you’re learning to code from scratch, don’t bother memorizing every command. Just learn how to look up questions online fast. And yes, this is what the pros do.

Also: learn the basics of git. It pays off fast.

Learn machine learning

Now you get to learn machine learning itself. In 2018, one of the best places to do that is Jeremy Howard’s fast.ai course, which teaches ML at the state of the art with an approachable curriculum. Go through at least Course 1, and ideally Course 2, do all the exercises, and you’ll be ahead of most industry practitioners on model-building (really).

Most of the progress in machine learning over the past 6 years has been in deep learning, but there’s much more to the field. There are also decision trees, support vector machines, linear regression, and a bunch of other techniques. You’ll run into these as you progress, but you can probably learn them as they come up. A great centralized place to learn and use them is Python’s scikit-learn package.
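To show how uniform scikit-learn makes these techniques, here is a minimal sketch that fits a linear regression, a decision tree, and a support vector machine on the same synthetic data. The dataset is generated on the spot and the hyperparameters are arbitrary choices for illustration, not recommendations.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.svm import SVR
from sklearn.tree import DecisionTreeRegressor

# Synthetic regression data (assumed settings, chosen only for the demo).
X, y = make_regression(n_samples=200, n_features=5, noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Three different model families behind the same fit/score interface.
models = {
    "linear regression": LinearRegression(),
    "decision tree": DecisionTreeRegressor(max_depth=4, random_state=0),
    "support vector machine": SVR(kernel="linear"),
}

scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    # score() returns R^2 on the held-out data for regressors.
    scores[name] = model.score(X_test, y_test)

for name, r2 in scores.items():
    print(f"{name}: R^2 = {r2:.3f}")
```

Because every estimator exposes the same fit and score methods, swapping one technique for another is usually a one-line change.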

Build personal projects

Everyone who applies to their first ML position has done personal projects in machine learning and data science, so you should too. But it’s important to do it well, and I’ll cover exactly how in a future post. For now, the only thing I’ll say is: the most common mistake I see when people showcase personal projects is that they apply well-known algorithms to well-known datasets.

This is a mistake because (1) machine learning hiring managers already know all the well-known datasets, and (2) they also know that if you showcase a project where you apply a well-known algorithm to a well-known dataset, you might not know how to do much of anything else.

Some things are hard to learn by yourself

The truth is that a lot of the things that make you stand out from the crowd are hard to learn by yourself. In machine learning, the three biggest ones are (1) data prep, (2) ML devops, and (3) professional networking.

Data prep is the hacks you use when you work with realistic data. That means dealing with outliers and missing values. But it also means collecting data yourself when there isn’t already a dataset for the problem you want to solve. In real life, you’ll spend 80% of your time cleaning and collecting data. Model-building is an afterthought in the real world, and engineering managers know that.
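For a concrete taste of what dealing with outliers and missing values can look like, here is a minimal sketch using numpy. The readings are invented, and median filling plus clipping at 1.5 times the interquartile range is just one common rule of thumb, not the only reasonable choice.

```python
import numpy as np

# Hypothetical readings with one missing value (nan) and one obvious outlier.
readings = np.array([10.0, 12.0, np.nan, 11.0, 13.0, 250.0, 12.0])

# Missing values: fill with the median of the values that were observed.
median = np.nanmedian(readings)
filled = np.where(np.isnan(readings), median, readings)

# Outliers: clip anything further than 1.5 * IQR beyond the quartiles.
q1, q3 = np.percentile(filled, [25, 75])
iqr = q3 - q1
cleaned = np.clip(filled, q1 - 1.5 * iqr, q3 + 1.5 * iqr)

print(cleaned)
```

Whether to clip, drop, or keep an outlier depends on the problem; the sketch only shows the mechanics.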

ML devops is what you do to run your model on the cloud. It costs money to rent compute time, so people sometimes don’t do this in their personal projects. But if you can afford it, it’s worth it to get familiar with the basics. Start with Paperspace or Floyd for an intro to running ML on the cloud.

Honest engineers often ignore networking, because they think they should get hired on their skills alone. The real world doesn’t work like that, even though it should. So talk to people. I’ll write more about this part in a future post.

Ask for help

Some steps are hard to take on your own. Schools aren’t good at teaching data prep, ML devops, or networking. Most people learn those things on the job, or from a mentor if they’re lucky. Many people never learn them at all.

But how do you bridge that gap in the general case? How do you get a job without experience when you need a job to get experience?
