EC325 Lecture Notes

EC325: Econometrics | Colby College

Author

Prof. Caraher

Published

March 16, 2026

Preface

Welcome to EC 325: Econometrics at Colby College! This booklet contains the lecture notes, interactive exercises, and R tutorials that will guide you through your first course in econometrics.

What is Econometrics?

Econometrics is the application of statistical methods to economic data with a particular emphasis on causal inference—figuring out whether one thing actually causes another, rather than merely being correlated with it. This distinction matters enormously for policy: if we want to know whether raising the minimum wage reduces employment, whether health insurance improves health, or whether education increases earnings, we need tools that go beyond simple correlations.

This course will teach you those tools. By the end of the semester, you’ll be able to read and critically evaluate empirical research in economics, conduct your own quantitative analyses, and—most importantly—think carefully about when statistical evidence can and cannot support causal claims.

About These Notes

These notes are designed to be a companion to the course, not a replacement for the textbook or class participation. They aim to:

  • Present core concepts in accessible, narrative form
  • Provide extensive R code examples you can run yourself
  • Include interactive exercises to test your understanding
  • Emphasize intuition through simulations before introducing formulas

The content draws heavily from the following texts, which are required or recommended for this course:

  • Wooldridge (2019) — The primary textbook, providing rigorous coverage of econometric methods
  • Angrist and Pischke (2014) — An accessible introduction to causal inference and the “credibility revolution” in economics
  • Bailey (2020) — A practical guide to doing econometrics in R
  • Wickham, Çetinkaya-Rundel, and Grolemund (2023) — The definitive guide to data science in R with the tidyverse

Structure of the Course

The course is organized into three modules:

Module 1: Correlation, Causation, and Regression We begin by establishing why correlation doesn’t imply causation and what additional assumptions we need to make causal claims. We then develop the core tool of econometrics—Ordinary Least Squares (OLS) regression—first in its simple bivariate form, then with multiple control variables.

Module 2: Statistical Inference and Real-World Regression With the mechanics of OLS in hand, we turn to quantifying uncertainty. How confident should we be in our estimates? We develop the tools of hypothesis testing, confidence intervals, and statistical significance. We then explore practical issues like functional form, dummy variables, and interactions.

Module 3: Causal Inference Methods The final module covers the major identification strategies economists use to make causal claims from observational data: instrumental variables, difference-in-differences, and regression discontinuity designs.

Roadmap Through the Midterm

The first half of the course (through the midterm) covers the foundations:

  • 1  Introduction: We begin with the central question—does correlation imply causation?—using the example of health insurance and health outcomes. You’ll see why simple comparisons can be misleading and why we need econometric tools.

  • 2  Good Data Practices: Before moving on to econometric theory, we cover best practices for handling data and doing reproducible research in R.

  • 3  Causality and Randomized Control Trials: We introduce the “gold standard” of causal inference—Randomized Control Trials (RCTs)—and the Potential Outcomes Framework, which provides the mathematical foundation for thinking about causality.

  • 4  Bivariate Regression and Ordinary Least Squares: We develop bivariate OLS regression, deriving the estimator, understanding when it gives us causal estimates, and diagnosing when it fails due to omitted variable bias.

  • 5  Multivariate Regression: We extend OLS to multiple independent variables, learning how to “control for” confounders and interpret coefficients in the multivariate setting.

  • 6  Statistical Inference: We tackle statistical inference—hypothesis testing, t-statistics, p-values, and confidence intervals—learning to distinguish real effects from statistical noise.
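To preview the omitted variable bias mentioned in the chapter 4 summary above, here is a minimal simulation sketch. Everything in it is an illustrative assumption rather than course data: the variable names (ability, education, earnings), the coefficients, and the sample size are all made up. Because ability raises both education and earnings, leaving it out of the regression should push the education coefficient well above its true value of 1 (to around 2.2 in expectation under these made-up parameters), while controlling for it should recover something close to 1.

```r
# Hypothetical simulation: all names and numbers are illustrative assumptions.
set.seed(325)
n <- 10000

# Unobserved confounder that affects both education and earnings
ability <- rnorm(n)

# Education depends on ability, so the two are correlated
education <- 12 + 2 * ability + rnorm(n)

# True causal effect of education on earnings is 1
earnings <- 20 + 1 * education + 3 * ability + rnorm(n)

# "Short" regression omits ability: the education coefficient is biased upward
short_model <- lm(earnings ~ education)

# "Long" regression controls for ability: the coefficient is close to 1
long_model <- lm(earnings ~ education + ability)

# Compare the two estimates of the education coefficient
b_short <- coef(short_model)["education"]
b_long  <- coef(long_model)["education"]
print(c(short = unname(b_short), long = unname(b_long)))
```

Try changing the 3 in the earnings equation to 0 and rerunning: once ability no longer affects earnings, the two estimates should roughly agree, because there is nothing left to confound the comparison.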

Statistical Software

Modern econometrics is done with statistical software that you interact with through code. In this course, we use R, a free and open-source programming language widely used in academia, government, and industry.

Why R? Three reasons:

  1. It’s free. Unlike Stata or SAS, R can be installed on any computer and used forever at no cost.

  2. It’s powerful. R can handle everything from simple regressions to cutting-edge machine learning, beautiful data visualizations, and reproducible research documents (these notes were written entirely in R!).

  3. It’s in demand. R skills are valued by employers in consulting, tech, finance, public policy, and academic research. It regularly ranks among the most widely used programming languages.

Don’t worry if you’ve never programmed before. We’ll start from the basics. See Appendix A — Intro to R and RStudio for instructions on getting set up.

A Note on Learning Econometrics

Econometrics is challenging. It combines economic intuition, statistical theory, and programming skills, often all at once. If you find yourself struggling (and you will), that’s normal. Here’s my advice:

  • Run the code yourself. Don’t just read the examples—type them out, run them, break them, fix them. You learn programming by doing.

  • Focus on intuition first. Before memorizing formulas, make sure you understand why a method works.

  • Ask questions. Come to office hours, post on the discussion board, work with classmates.

  • Be patient with yourself. The skills you’re learning will serve you well beyond this course, whether you pursue graduate study, policy work, or a career in the private sector.

About the Instructor

I am a recent PhD graduate of UMass Amherst. My research focuses on the relationship between public policy and maternal, infant, and child health, with particular attention to racial and ethnic health disparities. In my work, I use many of the methods you’ll learn in this course.

You can find out more about me and my research at www.raymondcaraher.com.


These notes are a work in progress. If you spot errors or have suggestions for improvement, please let me know!