Teaching
Courses and seminars.
Courses
Summer Semester
Modern LLMs like ChatGPT and LLaMA show impressive performance in a range of real-world applications. Much of this performance is enabled by scaling up model sizes: state-of-the-art models consist of hundreds of billions of parameters. This massive scale makes training, fine-tuning and inference with LLMs much more challenging than with traditional deep learning models like CNNs and LSTMs. In this course, you will learn about the technical advances that power modern LLMs. You will learn how to operate massive LLMs that do not fit on a single GPU. You will also learn about techniques like caching, quantization, pruning and parameter-efficient fine-tuning that enable efficient fine-tuning and inference. At the end of the course, you will be able to work with frontier LLMs in local (laptop or PC), cloud and API-based settings. You will also be able to understand the performance and efficiency tradeoffs offered by various training and inference optimization strategies.
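To give a flavour of one of the techniques covered, here is a minimal NumPy sketch of symmetric 8-bit weight quantization; the function names are illustrative, not from any particular library, and production systems use considerably more refined schemes:

```python
import numpy as np

def quantize_int8(weights):
    """Symmetric 8-bit quantization: float weights -> int8 values plus one scale."""
    scale = np.abs(weights).max() / 127.0    # largest magnitude maps to +/-127
    q = np.round(weights / scale).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from the int8 representation."""
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.standard_normal(1000).astype(np.float32)  # stand-in for a weight tensor
q, s = quantize_int8(w)
w_hat = dequantize(q, s)

# int8 storage is 4x smaller than float32; rounding error is at most scale/2
print(q.nbytes, w.nbytes)  # 1000 4000
```

The memory saving is exactly what makes large models fit on smaller devices, at the cost of a bounded per-weight rounding error.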
Summer Semester
ML models are increasingly being used in domains where their decisions directly impact people. For instance, ML models are used to diagnose medical conditions, match users with online advertisements, assess creditworthiness, and in some places, even make decisions about pretrial bail. Since these decisions affect real human beings, ensuring that the models' decisions are trustworthy is of utmost importance. However, plenty of investigations have shown that these models suffer from various issues. For instance, the models can unfairly disadvantage people from certain demographics, produce toxic outputs, change their outputs significantly as a result of unimportant changes in the input, and violate the privacy of their users. Given the complexity of these models, root-causing and alleviating these issues can be quite challenging. In this course, you will learn where these trustworthiness issues originate from, how to detect them, and what actions to take to alleviate them. You will also learn that these issues cannot be solved by technical interventions alone, and that one often needs an interdisciplinary approach to tackle them.
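As a small illustration of the kind of detection covered in the course, here is a minimal sketch of one common fairness check, the demographic-parity gap; the arrays are toy data, not from any real system:

```python
import numpy as np

# Toy audit: do positive decisions occur at similar rates across two groups?
group = np.array([0, 0, 0, 0, 1, 1, 1, 1])  # demographic group of each person
pred  = np.array([1, 1, 1, 0, 1, 0, 0, 0])  # model's binary decisions

rate_a = pred[group == 0].mean()  # positive-decision rate for group 0
rate_b = pred[group == 1].mean()  # positive-decision rate for group 1
gap = abs(rate_a - rate_b)        # demographic-parity gap; 0 means parity

print(rate_a, rate_b, gap)  # 0.75 0.25 0.5
```

A large gap like this flags a potential fairness issue, though, as the course emphasizes, deciding whether it reflects genuine unfairness requires context beyond the numbers.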
Winter Semester. Co-taught with Robert Schmidt.
Data science is a rapidly developing field with numerous application areas. In this course you will learn basic tools of data science. You will also become familiar with advanced methods involving deep learning and their practical applications. In the first part of the course you will get an introduction to fundamental statistical methods underpinning data science. You will also learn techniques for analyzing and visualizing datasets of different modalities such as text, images and tabular data. You will dive deep into data-driven prediction methods from machine learning and deep learning. In the final parts of the course we will introduce you to advanced topics, including recent progress in large language modelling and the trustworthy use of data-driven decision making. At the end of this course, you will be familiar with: (i) key contemporary methods for data-driven prediction; (ii) methods for processing, exploring and visualizing data of different modalities; and (iii) building proof-of-concept code bases for solving real-world data science problems.
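As a taste of the proof-of-concept code bases built in the course, here is a minimal end-to-end prediction sketch in scikit-learn; synthetic tabular data stands in for a real dataset:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic tabular dataset: 500 rows, 10 feature columns, binary label.
X, y = make_classification(n_samples=500, n_features=10, random_state=0)

# Hold out 20% of the data to estimate generalization performance.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
acc = model.score(X_test, y_test)  # accuracy on held-out data
print(f"held-out accuracy: {acc:.2f}")
```

The same split-fit-evaluate pattern carries over unchanged when the linear model is swapped for a deep network.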
Winter Semester. Co-taught with Nils Jansen.
The past few years have seen the confluence of two related trends: 1/ the rapid adoption of Artificial Intelligence (AI) and Machine Learning (ML) in a wide range of real-world applications across a variety of domains, e.g., healthcare and engineering; 2/ the development of specialized tooling and design patterns for AI/ML workloads. The goal of this course is to introduce the students to various AI/ML prediction paradigms, popular frameworks and design patterns. Specifically, we will build code bases involving (shallow) classification / regression models, CNNs and Transformers using frameworks like scikit-learn, PyTorch and Transformers. We will learn how to use data loaders to manage large-scale datasets and GPUs to speed up deep learning workloads. We will also learn about best practices like testing and reproducibility.
Seminars
Summer Semester
Large Language Models (LLMs) like ChatGPT show impressive performance in a variety of tasks like conversation, question answering and summarization. However, the outputs of these models cannot always be trusted. For instance, past studies have shown that the model outputs can exhibit stereotypes or biases against certain social groups. The models could also hallucinate facts about the real world. On top of these issues, the models are often unable to provide insights into why they generated a certain output and what the underlying reasoning was. In this seminar, we will aim to get an in-depth understanding of some of the trustworthiness issues surrounding LLMs and potential approaches to address them. We will read papers from AI and NLP conferences like ACL, ICML, NeurIPS and ICLR.
Winter Semester
AI models like ChatGPT and LLaMA show impressive performance in a variety of tasks, sometimes even beating humans. The outputs of these models are, however, quite difficult to interpret. Given an input to the AI model and the corresponding output, it is difficult for users to understand “why” the model generated this output. Interpretability is a key desideratum for the successful adoption of these models in the real world. As a result, a productive field of research has developed in recent years with the goal of trying to understand and explain the outputs of these models. In this seminar, we will read and discuss recent advances in the interpretability of AI models. We will read papers from AI conferences like ICML, NeurIPS and ACL.