Learn to build and understand language models from fundamental principles, covering modern architectures and practical applications.
Overview
This comprehensive Stanford CS336 course delves into modern language modeling from scratch. Students will learn the fundamental building blocks of large language models, including data preparation, tokenization, Transformer architectures, attention mechanisms, and various pre-training strategies like masked and causal language modeling. The curriculum also covers advanced topics such as fine-tuning, instruction tuning, Reinforcement Learning from Human Feedback (RLHF), model alignment, safety considerations, and robust evaluation techniques. Designed for advanced students with a strong background in machine learning, the course emphasizes both theoretical understanding and practical application, preparing participants to develop and critically analyze cutting-edge LLMs.
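As a rough illustration of the causal (next-token) language-modeling objective mentioned above, the sketch below shifts targets by one position and computes a cross-entropy loss. It is a minimal, hypothetical example written in PyTorch with random toy tensors, not code or material from the course itself.

```python
# Minimal sketch of the causal language-modeling objective (illustrative only;
# tensor shapes and values are hypothetical, not taken from the course).
import torch
import torch.nn.functional as F

vocab_size, seq_len, batch = 100, 8, 2

# Pretend these are logits produced by a Transformer decoder: (batch, seq_len, vocab)
logits = torch.randn(batch, seq_len, vocab_size)
# Pretend these are the input token ids: (batch, seq_len)
tokens = torch.randint(0, vocab_size, (batch, seq_len))

# Causal LM: predict token t+1 from tokens up to t, so drop the last prediction
# and shift the targets left by one position.
loss = F.cross_entropy(
    logits[:, :-1].reshape(-1, vocab_size),  # predictions for positions 0..T-2
    tokens[:, 1:].reshape(-1),               # targets are the next tokens
)
print(loss.item())
```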
Instructor
Danqi Chen
Assistant Professor of Computer Science at Stanford University
Her research focuses on natural language processing and machine learning.
Learning Outcomes
Build modern language models from fundamental principles.
Understand Transformer architectures and attention mechanisms (a minimal attention sketch follows this list).
Master pre-training techniques like masked and causal language modeling.
Apply fine-tuning, instruction tuning, and RLHF for model alignment.
Evaluate language models for performance, safety, and bias.
Critically analyze the scaling laws and future directions of LLMs.
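As a companion to the attention-mechanism outcome above, here is a minimal sketch of scaled dot-product attention with a causal mask. It assumes PyTorch and hypothetical toy query/key/value tensors; it is an illustrative sketch under those assumptions, not an implementation from the course.

```python
# Minimal sketch of scaled dot-product attention with a causal mask
# (illustrative only; shapes and inputs are hypothetical).
import math
import torch

def causal_attention(q, k, v):
    # q, k, v: (batch, seq_len, d)
    d = q.size(-1)
    scores = q @ k.transpose(-2, -1) / math.sqrt(d)  # (batch, T, T)
    T = scores.size(-1)
    # Upper-triangular mask blocks attention to future positions.
    future = torch.triu(torch.ones(T, T, dtype=torch.bool), diagonal=1)
    scores = scores.masked_fill(future, float("-inf"))
    weights = torch.softmax(scores, dim=-1)
    return weights @ v

q = k = v = torch.randn(2, 5, 16)
out = causal_attention(q, k, v)
print(out.shape)  # torch.Size([2, 5, 16])
```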