6.7930[6.871]/HST.956: Machine Learning for Healthcare

Graduate level; Units 4-0-8 (counts as an AUS2, AI+D_AUS, AAGS, and II subject; also an EECS AI TQE)
Instructors: Peter Szolovits, David Sontag
Teaching Assistants: Ilker Demirel, Sophie Guo
Lectures: Tuesdays & Thursdays, 2:30-4:00pm Eastern Time, 4-270
Recitations (required): Friday, 3:00-4:00pm, 4-270
Prerequisites: 6.3900[6.036] or 6.7900[6.867] or 9.520/6.7910[6.860] or 6.8611/6.8610[6.806/6.864] or 6.4102/6.4100[6.438/6.034] or equivalent machine learning class. (Bracketed numbers are the class numbers before the recent mass renumbering of all EECS classes.)

Office Hours:
When            Where   Who
Tues 4:30-5:30  26-314  Sophie
Wed 12:00-1:00  26-314  Ilker

Contact staff: mlhc25@mit.edu

Announcements

Course description

Introduces students to machine learning in healthcare, including the nature of clinical data and the use of machine learning for risk stratification, disease progression modeling, precision medicine, diagnosis, subtype discovery, and improving clinical workflows. Topics include large language models, causality, interpretability, algorithmic fairness, time-series analysis, graphical models, deep learning, transfer learning, genomics, and computational biology. Guest lectures by clinicians from the Boston area and course projects with real clinical data emphasize subtleties of working with clinical data and translating machine learning into clinical practice.


Schedule

The schedule for classes is under revision and is still in draft form.



Class  Date         Lecture & Materials                                                                   Assignments

Overview of Clinical Care & Data
1      Tue, Feb 4   Introduction: What makes healthcare unique?                                           PS 0 out
                    Reading (Due Fri, 2/7 1:00pm ET):
                    Helpful optional readings:
2      Thu, Feb 6   Overview of Clinical Care
                    Reading (Due Fri, 2/7 1:00pm ET):
3      Tue, Feb 11  Overview of Clinical Data Science                                                     PS 1 out
                    Reading:

ML with Clinical Text, Imaging, Physiological, and Administrative Data
4      Thu, Feb 13  ML for Risk Stratification: focus on structured EMR data
                    Reading (Due Fri 1pm ET):
       Tue, Feb 18  No class -- Monday schedule of classes
5      Thu, Feb 20  Risk Stratification and Physiological Time-Series
                    Reading:
6      Tue, Feb 25  LLMs 1: differential diagnosis, question answering, treatment planning
                    Reading:
7      Thu, Feb 27  LLMs 2: information extraction and summarization                                      PS 2 out
                    Required Reading:
                    Optional Reading:
8      Tue, Mar 4
9      Thu, Mar 6

Causal Inference
10     Tue, Mar 11  Causal inference 1: Causal graphs, potential outcomes, covariate adjustment
                    Reading:
                    Optional Reading:
11     Thu, Mar 13  Causal inference 2: Assumptions for causal inference, inverse propensity weighting    PS 3 out
                    Reading:
                    Optional Reading:
12     Tue, Mar 18  Causal inference 3: Policy learning and dynamic treatment regimes
                    Reading:
                    Optional Reading:
13     Thu, Mar 20  Dataset and temporal shift: detection and mitigation
                    Reading:

Real World Deployment Challenges
       Mar 24-28    Spring Break -- No classes
14     Tue, Apr 1   Multi-modal modeling of text and imaging data
15     Thu, Apr 3   Guest lecture: Faisal Mahmood (Computational Pathology)
16     Tue, Apr 8   Interpretability & explainability                                                     PS 4 out
17     Thu, Apr 10  Regulation of AI in healthcare
18     Tue, Apr 15  Human-AI collaboration in decision making
19     Thu, Apr 17  TBD (industry perspective)
20     Tue, Apr 22  Privacy: differential privacy, federated learning, synthetic data
21     Thu, Apr 24  TBD (industry perspective)
22     Tue, Apr 29  Disease subtyping & progression modeling
23     Thu, May 1   Bias and its prevention
24     Tue, May 6   Guest lecture: Jim Collins
25     Thu, May 8   Omics: TBD
26     Tue, May 13  Student Project Presentations


Final Exam

Reading Assignments

Many of the lectures are associated with related papers that should help you think about the lecture topic. For each reading assignment, you are expected to submit a brief summary of the three most important ideas of the paper, as short bullet points.

Problem sets

The problem set PDFs will be available here (note that some data for the problem sets is not publicly available):

Projects

We are recruiting a group of doctors with interesting clinical problems to mentor teams of students who will work on them. We will form teams and match them to problems and mentors a few weeks into the class. We will release the project guidelines and detailed information on Canvas.

Grading

Late Policy (from pset1 onwards)

Scenarios:

Use of Generative AI

You may choose to use generative AI tools to help think through problems, but not to produce solutions to the problem sets. In your problem set solutions, you must indicate any external resources consulted, including Generative AI. The course staff reserves the right to ask you to explain the rationale for any answer if it appears that it is not original to you. Pedagogically, you will learn much more from working through any problem on your own, and this will be reflected in your score on the final exam, which will be closed-book and without access to any Generative AI.


Prior years of this course