Michael Luk

Dr. Luk studied theoretical physics at Imperial College London and Mathematics at the University of Cambridge before completing his doctorate in Particle Physics, winning Brown University's graduate award for the Physical Sciences in 2013. After graduating, he worked at Intel Corp, where he developed machine learning algorithms to model yield metrics. He is currently the CTO of SFL Scientific, a data science consultancy, where he works on big data projects ranging from NLP to machine vision.

Introduction to Applications of Data Science in the Healthcare Industry

Instructor: Michael Luk

Learn about how data science is utilized in the Healthcare Industry

  • Learn how data science is applied in the Healthcare Industry 
  • Covers a wide range of fields from NLP to Image Recognition

Course Description

In the next 5 years, machine learning will play an increasingly important role in healthcare. Whether it's aggregating new results in medical journals using Natural Language Processing, predicting diseases using Time-Series Analysis, or detecting cancer from MRIs using Machine Vision, healthcare is on the verge of a big data revolution. The purpose of this course will be to introduce you to these topics and more. We start from the basics of machine learning and guide you through how to apply these techniques to real-world healthcare applications. Whilst this course uses healthcare use cases as examples, the techniques are general and apply to a wide range of industries and scientific fields.

What am I going to get from this course?
  • Understand the underlying concepts and algorithms utilized in the Healthcare domain
  • Be able to apply machine learning to real life Healthcare applications
  • Be able to apply machine learning techniques to general applications in industry using the ideas, concepts, and methods discussed


Prerequisites and Target Audience

What will students need to know or do before starting this course?
  • Working knowledge of how to program
  • Basic statistics and probability

Who should take this course? Who should not?
  • Anyone who is interested in learning about how data science is used in the industry


Module 1: Basic Concepts, Algorithms, and Validation Methods
Lecture 1 Introduction

Some information about the background of the instructor and his team

Lecture 2 Introduction Continued

Some information about the instructing team's experience in the medical field

Lecture 3 Motivation and Goals

Why you should learn data science, and what the goals are for this course

Lecture 4 Prerequisites and Course Overview

What the prerequisites for this course are, and an overview of what the course will cover

Lecture 5 Machine Learning Overview

Gives an overview of different types of high-level Machine Learning methods

Lecture 6 Unsupervised Learning

An introduction to what unsupervised learning is and an overview of the varieties of algorithms that are commonly used.

Lecture 7 Introduction to Supervised and Semi-supervised Learning

An overview of supervised and semi-supervised learning.

Lecture 8 Bias-Variance Trade-off

An explanation of the bias-variance trade-off and how you need to think about it when tackling any machine learning problems.

Lecture 9 Validation Methods

A look at how you can validate your model to determine whether you are in the high-bias or the high-variance regime.

Lecture 10 Model Complexity

Determining whether or not your model is too complex or too simple is a big issue in machine learning. In this brief video, we'll discuss how you can determine where your model is.

Lecture 11 Quantity of Data

A look into how the quantity of data is important, and how you can tell if you are data limited.

Lecture 12 Summary
Quiz 1 Module 1: Recap

Recap of all topics considered in Module 1.

Module 2: Clustering and Dimensionality Reduction
Lecture 13 Recap

Brief recap of Module 1 and introduction to clustering and dimensionality reduction techniques.

Lecture 14 Linear Regression

Linear regression is one of the simplest models to fit on data.

Lecture 15 Logistic Regression

Our first classification algorithm.

Lecture 16 K-Means Clustering

Simple clustering method using k clusters and their centres.

Lecture 17 Hierarchical Clustering

Common clustering method using a hierarchy structure.

Lecture 18 Anomaly Detection

Methods to detect anomalous data.

Lecture 19 K-Nearest Neighbours

A simple algorithm using k nearest neighbors.

Lecture 20 Forward-Backward Selection

A greedy algorithm for dimensionality reduction.

Lecture 21 Principal Component Analysis

Another useful dimensionality reduction technique.

Lecture 22 Summary
Quiz 2 Module 2: Recap

Quiz covering all Module 2 material.

Module 3: Time Series Analysis on EEG Readings
Lecture 23 Recap
Lecture 24 What is Time Series Data?

An introduction to what time-series data is.

Lecture 25 Validation

How to validate time-series data.

Lecture 26 Decomposition

Decomposing time-series into seasonal components and extracting the underlying trend.

Lecture 27 Stationary

The important concept of whether a time series is stationary and how to test for it.

Lecture 28 ACF and PCF

Autocorrelation and partial autocorrelation functions.

Lecture 29 ARIMA Models

Modeling time-series data with ARIMA models.

Lecture 30 EEG

Our first case-study with some real-world EEG data.

Lecture 31 Feature Generation

Generating features for supervised learning methods.

Lecture 32 Time Series Workflow

A walk-through of how to analyze time-series data.

Lecture 33 Time Series Classification

Classifying time-series data using machine learning methods.

Lecture 34 More Features

Additional, more complicated features to improve classification accuracy.

Lecture 35 Summary
Quiz 3 Module 3: Recap

Quiz for all material in Module 3.

Module 4: Machine Vision: Cancer Detection and Deep Learning
Lecture 36 Recap
Lecture 37 Overview of this Module

Lecture 38 Machine Vision

What does it mean for a computer to understand data from images?

Lecture 39 Convolutional Neural Networks

A state-of-the-art method to extract data from images.

Lecture 40 Convolution

What is a convolution?!

Lecture 41 Neural Networks

A brief overview of how neural networks work.

Lecture 42 Neural Networks Continued
Lecture 43 Putting it Together: CNNs

Combining the components to form a Convolutional Neural Network

Lecture 44 Case Study: Diabetic Retinopathy

Applying a CNN to a real-world case in the medical field.

Lecture 45 Validation

How do you validate images?

Lecture 46 Exploration

How to build a model, and a closer look at the data.

Lecture 47 Preprocessing

Cleaning the data is very important!

Lecture 48 Data Augmentation

Balancing the classes of your data for the best results.

Lecture 49 Extras
Lecture 50 Summary
Quiz 4 Module 4: Recap

Quiz for all material in Module 4.

Module 5: Natural Language Processing: Text Classification to Sort Patient Information
Lecture 51 Recap
Lecture 52 Overview
Lecture 53 Natural Language Processing

The different aspects of natural language processing

Lecture 54 Tokenization

A common step in many NLP problems is to tokenize the text data.

Lecture 55 N-grams

A very simple model based on Bayes' theorem and word sequence occurrences.

Lecture 56 Bigram

Looking at the simple N=2 gram case and building our own bigram model.

Lecture 57 Smoothing

Smoothing the data for words that don't occur in the training set. This process allows the modeling of text with words/tokens not in your corpus.
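
As a rough illustration of the idea, the sketch below builds a bigram model and applies add-one (Laplace) smoothing. The course description does not say which smoothing scheme the lecture uses, so add-one is an assumption here, and the toy corpus and function names are hypothetical.

```python
from collections import Counter

def train_bigram(tokens):
    """Count unigrams and adjacent-token pairs (bigrams) in a token list."""
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    return unigrams, bigrams

def bigram_prob(prev, word, unigrams, bigrams, vocab_size):
    """Add-one (Laplace) smoothed estimate of P(word | prev)."""
    return (bigrams[(prev, word)] + 1) / (unigrams[prev] + vocab_size)

# Toy corpus, invented purely for illustration.
tokens = "patient reports mild pain patient reports no fever".split()
unigrams, bigrams = train_bigram(tokens)
vocab_size = len(unigrams) + 1  # +1 reserves one slot for unseen tokens

p_seen = bigram_prob("patient", "reports", unigrams, bigrams, vocab_size)
p_unseen = bigram_prob("patient", "cough", unigrams, bigrams, vocab_size)
```

Without the added counts, `p_unseen` would be exactly zero because "cough" never appears in the corpus; with smoothing it receives a small but non-zero probability.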

Lecture 58 Information Extraction

Only the simplest method of information extraction, regular expressions, is covered here. Other approaches include supervised and semi-supervised methods. We can create features based on word shape and length (for example: does the word contain the letter X or V, how long is it, is it hyphenated?). These features may allow a standard classification algorithm to tag words as the information you wish to extract. More sophisticated sequence models also exist that try to model entire sentences as a sequence of word classes. These include Hidden Markov Models and Conditional Random Fields, which are considered state of the art but are difficult to construct.
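
The word-level features mentioned above might be computed along the following lines. This is only a sketch: the feature names and the shape encoding (uppercase to `X`, lowercase to `x`, digit to `d`) are assumptions made for illustration, not the course's own code.

```python
import re

def word_shape(token):
    """Coarse shape of a token: uppercase -> X, lowercase -> x, digit -> d."""
    shape = re.sub(r"[A-Z]", "X", token)
    shape = re.sub(r"[a-z]", "x", shape)
    return re.sub(r"[0-9]", "d", shape)

def token_features(token):
    """Simple per-token features that a standard classifier could consume."""
    return {
        "length": len(token),
        "is_hyphenated": "-" in token,
        "has_x_or_v": any(c in "xvXV" for c in token),
        "shape": word_shape(token),
    }

features = token_features("X-ray")
```

Each token becomes a small dictionary of features, which could then be vectorized and passed to any standard classification algorithm.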

Lecture 59 Regular Expressions

The simplest way to extract information from text is pattern matching.
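
A minimal sketch of pattern matching with Python's `re` module is shown here. The sentence, the dosage pattern, and the list of units are hypothetical examples chosen for illustration; a real pattern would need to be tailored to the data.

```python
import re

# Hypothetical pattern: a number (optionally with decimals) followed by a unit.
DOSAGE = re.compile(r"(\d+(?:\.\d+)?)\s*(mg|ml|g)\b", re.IGNORECASE)

# Invented example sentence.
text = "Patient was given 500 mg ibuprofen and 2.5 ml syrup."

# findall returns one (amount, unit) tuple per non-overlapping match.
dosages = DOSAGE.findall(text)
```

Here `dosages` holds the pairs `('500', 'mg')` and `('2.5', 'ml')`.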

Lecture 60 Bag of Words Representation

A common representation for text in NLP problems.

Lecture 61 Text Classification

Classifying text documents using machine learning.

Lecture 62 Preprocessing

Cleaning up the data!

Lecture 63 Classification

Classifying documents

Lecture 64 Summary

Quiz 5 Module 5: Recap

Quiz for all material in Module 5.


1 Review

Zhen W

December, 2016

This gave me a much better understanding of the possible applications for Healthcare! Great!