CME 296 - Diffusion & Large Vision Models

This course explores diffusion-based generative models for vision. You will study the foundations of diffusion, score matching, and flow matching; modern architectures such as U-Nets and Diffusion Transformers; and methods for controllable image generation and evaluation. The course combines theory with practical insights into state-of-the-art generative models. It is ideal for students with a background in linear algebra, probability, calculus, and machine learning.

Syllabus

Cheatsheet

Canvas

Course staff

Afshine Amidi
Instructor

Shervine Amidi
Instructor

General information

Class communication primarily happens on Canvas > Ed Discussion.
Course content is listed in the syllabus and will be updated as the quarter progresses.
A public-facing cheatsheet gives a concise, illustrated summary of the main concepts covered in the class.
For general inquiries, please contact cme296-spr2526-staff@lists.stanford.edu.

Course characteristics

In-person lectures on Fridays 3:30pm - 5:20pm in Thornton 110.
Class is recorded.
There is no homework, but there are two exams: a midterm and a final.
