Data quality can make or break your analysis. Good techniques can't make up for bad data. This course will teach you how to assess the quality of your data and how well your data will serve to answer your questions.
What am I going to get from this course?
Get a better understanding of the context and limitations of your data. Understand how well suited your data are for generating meaningful analyses.
Prerequisites and Target Audience
What will students need to know or do before starting this course?
None required. Some experience working with data and some knowledge of statistics will be helpful.
Who should take this course? Who should not?
Anyone who works with data.
Why Do We Need to Understand Data?
Biggest Problems with Data
Data Science Is Both Science and Art
Module 2: Creating a Roadmap for Your Analysis
Why Do I Need an Analysis Roadmap?
Creating an Analysis Roadmap, Part 1
Creating an Analysis Roadmap, Part 2
Exercise 1 - Explanation
Module 3: Understanding Your Data-Generating System, Part I
Overview of the Data-Generating System
Data-Generating System - Content Provider
Data-Generating System - Data Collector
Data-Generating System - Customer
Data-Generating System - Client
Exercise 2: Understanding Data Collector, Customer, Client
Exercise 2 - Explanation: Understanding Data Collector, Customer, Client
Module 4: Understanding Your Data-Generating System, Part II
Understanding Your Data-Generating System, Part II
Context/Environment: The Situation
Context/Environment: Choice Architecture
Context/Environment: TED Talk Example
Exercise 3: Understanding Context/Environment
Exercise 3 - Explanation: Understanding Context/Environment
Module 5: Clearly Understanding Your Data
Why Do I Need to Clearly Understand My Data?
What Information Does Your Data Capture?
What Do the Variables Look Like?
Exercise 4: Understanding Your Data
Exercise 4 - Explanation: Understanding Your Data
Module 6: Understanding Limitations of Your Data
Why Do I Need to Understand Limitations of Data?
Are the Data Biased?
Are the Data Bad Proxies?
Are the Data Inconsistent?
Did Context/Environment Affect the Data?
Exercise 5: Understanding the Limitations of the Data
Exercise 5 - Explanation: Understanding the Limitations of the Data
Module 7: Summary and Conclusions