6.S897/HST.956: Machine Learning for Healthcare

Instructors: David Sontag, Peter Szolovits
Teaching Assistants: Willie Boag, Irene Chen (Office Hours: Monday 1pm, 32-G 9th floor lounge)
Graduate level; Units 3-0-9 (counts as an AAGS subject)
Time: Tuesdays & Thursdays, 2:30-4pm
Location: 4-270
Prerequisite: 6.036/6.862 or 6.867 or 9.520/6.860 or 6.806/6.864 or 6.438 or 6.034
Recitations (optional): Fridays at 2pm (4-153)
Contact: Piazza
Stellar page: https://stellar.mit.edu/S/course/HST/sp19/HST.956/


Course description

Introduces students to machine learning in healthcare, including the nature of clinical data and the use of machine learning for risk stratification, disease progression modeling, precision medicine, diagnosis, subtype discovery, and improving clinical workflows. Topics include causality, interpretability, algorithmic fairness, time-series analysis, graphical models, deep learning and transfer learning. Guest lectures by clinicians from the Boston area and course projects with real clinical data emphasize subtleties of working with clinical data and translating machine learning into clinical practice.

Note that because of high demand, we do not have space for listeners.


Schedule

Schedule is subject to change.

Class | Date | Lecture | Materials | Assignments
1 | Tues Feb 05 | Introduction: What makes healthcare unique? | - | -
2 | Thurs Feb 07 | Overview of clinical care | - | -
3 | Tues Feb 12 | Deep dive into clinical data | - | -
4 | Thurs Feb 14 | Risk stratification using EHRs and insurance claims (Discussant: Leonard D'Avolio) | - | Reflection questions
- | Tues Feb 19 | Presidents' Day, Monday schedule | - | -
5 | Thurs Feb 21 | Survival modeling | - | -
6 | Tues Feb 26 | Physiological time-series | - | -
7 | Thurs Feb 28 | Clinical text part 1 (Discussant: Katherine Liao) | - | Reflection questions
8 | Tues Mar 05 | Clinical text part 2 | - | -
9 | Thurs Mar 07 | Translating technology into the clinic (Discussant: Adam Wright) | - | -
10 | Tues Mar 12 | Machine learning for cardiology (Guest lecture: Rahul Deo) | - | -
11 | Thurs Mar 14 | Machine learning for differential diagnosis | - | Reflection questions
12 | Tues Mar 19 | Machine learning for pathology (Guest lecture: Andy Beck) | - | -
13 | Thurs Mar 21 | Machine learning for mammography (Guest lecture: Connie Lehman, Adam Yala) | - | -
- | Tues Mar 26 & Thurs Mar 28 | Spring vacation | - | -
14 | Tues Apr 02 | Causal inference part 1 | - | -
15 | Thurs Apr 04 | Causal inference part 2 | - | Midsemester feedback
16 | Tues Apr 09 | Reinforcement learning part 1 (Guest lecture: Fredrik Johansson) | - | Reading questions
17 | Thurs Apr 11 | Reinforcement learning part 2 (Guest lecture: Barbra Dickerman) | - | -
- | Tues Apr 16 | Patriots' Day holiday | - | -
18 | Thurs Apr 18 | Disease progression & subtyping part 1 | - | Pset6 out
19 | Tues Apr 23 | Disease progression & subtyping part 2 | - | -
20 | Thurs Apr 25 | Precision medicine | - | Pset6 due
21 | Tues Apr 30 | Automating clinical workflows | - | -
22 | Thurs May 02 | Regulation of ML/AI in the US (Guest lecture: Andy Coravos, Mark Shervey) | - | Reading questions
23 | Tues May 07 | Fairness | - | -
24 | Thurs May 09 | Robustness to dataset shift | - | -
- | Tues May 14 | No class | - | Project poster presentations (evening)
25 | Thurs May 16 | Interpretability | [Slides] Lecture 25; [optional] "Why Should I Trust You?": Explaining the Predictions of Any Classifier; [optional] Falling Rule Lists | Projects due

    Prerequisite quiz

    This quiz will not count toward your grade, but will be used by the course staff to check prerequisites (6.036/6.862 or 6.867 or 9.520/6.860 or 6.806/6.864 or 6.438 or 6.034) and to assess students' preparation for this class.

    The prerequisite quiz is now closed, but you can view the questions here.


    Problem sets

    We expect there will be seven problem sets this year.

    Lecture scribes

    Each student is expected to either "scribe" for one lecture or "consult" for one MLHC community evening session (see below). Each lecture will have 1-2 scribes who are responsible for summarizing what was discussed in class. The first draft of the notes should be submitted to the TAs by 11:59pm on the day after class (i.e., roughly 32 hours after lecture ends). We will send you suggested revisions, and once the notes are finalized, we will post them on the course website. The goal is to get the notes out within one week of the corresponding class.

    We expect writing up lecture notes to take no more than 3 hours. If there are two scribes for one lecture, the two scribes should collaborate and submit one writeup. The notes you write should cover all the material covered during the relevant lecture, plus real references to the papers containing the covered material. Your notes should be understandable to someone who has not been to the lecture. You should write in full sentences where appropriate; point form is often too terse to follow without a sound track (though occasionally it is appropriate). Use numbered sections, subsections, etc. to organize the material hierarchically and with meaningful titles. Try to preserve the motivation, difficulties, solution ideas, failed attempts, and partial results obtained along the way in the actual lecture.

    Write your notes using LaTeX. Please use our template, either by downloading it or by copying the project on Overleaf (Menu -> Copy project).
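
    For orientation only, the organizational guidance above (numbered sections and subsections with meaningful titles, full sentences, and references to the papers covered) might translate into a LaTeX skeleton like the one below. This is a hypothetical sketch, not the official course template: the section titles, the package choices, and every placeholder name are assumptions, and the course template should be used for actual submissions.

    ```latex
    % Hypothetical skeleton for scribe notes; NOT the official course template.
    % All titles, names, and the bibliography entry are placeholders.
    \documentclass{article}
    \usepackage[margin=1in]{geometry}
    \usepackage{amsmath}

    \title{6.S897/HST.956 Lecture N: Lecture Title}
    \author{Scribes: First Scribe, Second Scribe}
    \date{Lecture date}

    \begin{document}
    \maketitle

    \section{Motivation}
    % Why the problem matters, written in full sentences.

    \section{Main Topic}
    \subsection{Setup and Notation}
    \subsection{Approaches Tried and Their Difficulties}
    \subsection{Solution Ideas and Partial Results}

    \section{Summary}

    % References to the papers containing the covered material.
    \begin{thebibliography}{9}
    \bibitem{placeholder}
    Author(s). \emph{Title of a paper covered in lecture}. Venue, Year.
    \end{thebibliography}

    \end{document}
    ```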

    MLHC Community Consulting

    Each student is expected to either "scribe" for one lecture (see above) or "consult" for one Machine Learning for Healthcare (MLHC) community evening session. Throughout the semester, we will organize four evening sessions to engage with the larger MLHC community. Clinicians and others in the Boston area who are interested in machine learning for healthcare will come to talk through their problems and ideas.

    MLHC Community Consulting for this semester will occur:

    Clinicians are welcome to sign up here for more information, or see our poster.

    Students who sign up for community consulting will be expected to attend the entire session and submit a write-up of their experiences shortly after the session. We expect one write-up per clinician, so students should coordinate if they talked to the same clinician. Write-ups are due one week after the consulting session.


    Final projects

    Projects will include a proposal, poster presentation, and final report. We will add more information here shortly.

    Collaboration Policy

    Students must write up their problem sets individually. Students should not share their code or solutions (i.e., the write-up) with anyone inside or outside of the class, nor should they post them publicly on GitHub or any other website. You are asked on problem sets to identify your collaborators. If you did not discuss the problem set with anyone, you should write "Collaborators: none." If in writing up your solution you make use of any external reference (e.g., a paper, Wikipedia, a website), both acknowledge your source and write up the solution in your own words. It is a violation of this policy to submit a problem solution that you cannot orally explain to a member of the course staff.

    Plagiarism and other dishonest behavior cannot be tolerated in any academic environment that prides itself on individual accomplishment. If you have any questions about the collaboration policy, or if you feel that you may have violated the policy, please talk to one of the course staff.

    Problem Set Late Policy

    (applies from pset2 onwards)