An industry-recognized certification lets you add this credential to your resume once you complete all courses.

Peter Chen, Instructor - Unsupervised Learning: Clustering

Peter Chen is an analytics and data science professional with an eclectic and deep background. He has held senior positions at companies such as Algebraix Data, Petco, Mitchell International, and Sempra Energy. He has been profiled in media articles (“A Confession of a Data Miner”) about his analytics and data mining experience, and he is widely published in industry trade magazines on analytics and data science. Peter received his B.S. in Management Science from the Massachusetts Institute of Technology/Sloan School of Management and his Master's in General Management from Harvard, and he is currently working on a second Master's in Software Engineering/Data Science at Harvard.

Practical Clustering Concepts & Applications

  • Understand the power of Gaussian Mixture Models (GMM) to go beyond simple clustering needs
  • Evaluate the quality of clustering using Silhouette plots
  • Instructor has a B.S. in Management Science from the Massachusetts Institute of Technology/Sloan School of Management and a Master's in General Management from Harvard.

Course Description

My name is Peter Chen and I am the instructor for this course. I want to introduce you to the wonderful world of Unsupervised Machine Learning. Specifically, we will focus on Clustering algorithms and methods through practical examples and code. More importantly, the course will get you up and running quickly with a clear conceptual understanding. It includes code and sample data for you to run and learn from, and it encourages you to explore your own datasets using Clustering algorithms. Prerequisites: beginner knowledge of Python, which is used mainly for exposition, so you do not need to be a Python expert. You should also be comfortable with basic math, probability, and statistics.

What am I going to get from this course?

* Understand the major types of clustering algorithms

* Know what k-means, GMM, and hierarchical clustering are, and how and when to apply them

* Understand the power of Gaussian Mixture Models (GMM) to go beyond simple clustering needs

* Determine the optimal number of clusters

* Gain an intuition for the math behind the underlying algorithms and be able to explain it

* Learn how to use the Python scikit-learn library to build clustering machine learning models

* Apply Python code to your own data sets to solve various clustering problems

* Evaluate the quality of clustering using Silhouette plots

* Learn about different industry applications of Clustering 
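
To give a flavor of two of the outcomes above, here is a minimal sketch of k-means (Lloyd's algorithm) and the mean silhouette score. It is not taken from the course materials: the course uses scikit-learn, while this sketch deliberately uses only the Python standard library (3.8+) so that the mechanics are visible. The function names, the toy data, and the random initialization are all assumptions made for illustration.

```python
import math
import random


def kmeans(points, k, n_iter=100, seed=0):
    """Lloyd's algorithm: alternate assignment and centroid-update steps."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initialise from k of the data points
    labels = None
    for _ in range(n_iter):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so we have converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels, centroids


def silhouette_score(points, labels):
    """Mean of (b - a) / max(a, b) over all points; needs at least 2 clusters."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    scores = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)  # convention: singleton clusters score 0
            continue
        # a: mean distance to the other members of the point's own cluster
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(math.dist(p, q) for q in other) / len(other)
            for other_lab, other in clusters.items()
            if other_lab != lab
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)


# Hypothetical toy data: two well-separated groups of 2-D points.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (8.5, 9.0), (9.0, 8.0)]
labels, centroids = kmeans(points, k=2)
score = silhouette_score(points, labels)
print(labels, centroids, round(score, 3))
```

A silhouette score close to 1 suggests compact, well-separated clusters, while values near 0 suggest overlapping ones. Repeating the run for several values of k and comparing the scores is one common way to choose the number of clusters.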

Prerequisites and Target Audience

What will students need to know or do before starting this course?

Basic Python. You do not need to be an expert programmer, since we use Python mainly for exposition. You should also know basic probability and math.

Who should take this course? Who should not?

This course is for students who are interested in a practical introduction to clustering, a kind of unsupervised machine learning, and who want an intuitive understanding of the theory behind it.

Students can use these methods and algorithms for popular applications such as marketing analytics, customer segmentation, anomaly detection, and fraud detection, as well as other practical applications in their respective fields. Students should enjoy playing with data and code.


Module 1: Welcome & Introductions

Lecture 1 Welcome to the Course
Lecture 2 Course Overview and Introductions

Module 2: K-Means Clustering

Lecture 3 K-Means Clustering
Lecture 4 How does K-means do that?
Lecture 5 Similarity Measures
Lecture 6 Issues with K-Means

Module 3: Gaussian Mixture Models

Lecture 7 GMM Introductions
Lecture 8 GMM: Code Examples
Lecture 9 GMM as Density Estimators
Lecture 10 GMM: Optimal Number of Components
Lecture 11 GMM - Generate New Data

Module 4: Hierarchical Clustering

Lecture 12 Introductions to Hierarchical Clustering
Lecture 13 Linkage Methods
Lecture 14 Hierarchical Clustering Walk-Through
Lecture 15 Divisive Algorithm
Lecture 16 Hierarchical Clustering - Code Examples

Module 5: Methods for Selecting Number of Clusters

Lecture 17 Methods for Selecting Number of Clusters

Module 6: Evaluating the Quality of the Clustering

Lecture 18 Evaluating the Quality of Clustering

Module 7: Industry Applications

Lecture 19 Industry Applications

Module 8: Mini-Project: Pulling It All Together

Lecture 20 Mini-Project

Module 9: Mini-Project Solution Preview

Lecture 21 Solution Preview