Foundations of Data Science: Inferential Thinking by Resampling

Course Description

Instructors:  Ani Adhikari, John DeNero, David Wagner
School:  BerkeleyX

Using real-world examples from law, medicine, and football, you'll discover how data scientists draw conclusions about unknowns from the data available. Often, the data we have is incomplete, yet we'd still like to draw inferences about the world and quantify the uncertainty in our conclusions. This is called statistical inference. In this course, you will learn methods for statistical inference and see how to apply them to real-world data sets.

First, the course will teach you estimation: given a random sample, estimating a quantity that we cannot observe directly. You will also learn how to quantify the uncertainty in your estimate.

Second, the course will teach you hypothesis testing, which lets us evaluate theories about how the world works. In hypothesis testing, we compare what a theory predicts to the observations we actually have, to determine whether the theory is consistent with the available data. You will also learn how to quantify the uncertainty in the conclusions you draw from a hypothesis test. This helps you assess whether a pattern that appears in the data reflects a true relationship in the world or might merely be a random fluctuation due to noise. You will learn multiple methods for estimation and hypothesis testing, based on simulation, the bootstrap method, and A/B testing for comparing two random samples.

Finally, you will learn about randomized controlled experiments and how to draw conclusions about causality.
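To make the resampling idea concrete, here is a minimal sketch of bootstrap estimation in plain NumPy (the course itself uses its own teaching library, so the names and the sample data below are illustrative assumptions, not course code). We treat the sample as a stand-in for the population, resample from it with replacement many times, and use the spread of the resampled medians to quantify the uncertainty in our estimate:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: measurements from one random sample drawn
# from a population whose median we cannot observe directly.
sample = np.array([12.1, 9.8, 11.4, 10.2, 13.0, 10.9, 9.5, 12.6, 11.1, 10.4])

def bootstrap_medians(sample, n_resamples=10_000):
    """Resample with replacement and record the median of each resample."""
    n = len(sample)
    return np.array([
        np.median(rng.choice(sample, size=n, replace=True))
        for _ in range(n_resamples)
    ])

medians = bootstrap_medians(sample)

# Percentile method: an approximate 95% confidence interval is the
# middle 95% of the bootstrapped medians.
lower, upper = np.percentile(medians, [2.5, 97.5])
print(f"Approximate 95% CI for the population median: [{lower:.2f}, {upper:.2f}]")
```

The same resampling idea underlies the simulation-based hypothesis tests in the course: instead of resampling to measure uncertainty, you shuffle group labels (as in an A/B test) to see how large a difference could arise by chance alone.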

The course emphasizes the conceptual basis of inference, the logic of the decision-making process, and the sound interpretation of results.