Lecture, 2 SWS (semester hours per week)
In-person course, language of instruction: English
Time and place: Wed 12:15 - 13:45, HHP6 - R.EG.001; Fri 8:15 - 9:45, HHP6 - R.EG.001
from 17 April 2024 to 31 May 2024
Reinforcement learning refers to a family of machine learning methods in which future decisions are based on past successes and failures. The decision maker is assumed not to know the underlying environment (exactly). Such methods play a central role in many modern applications, for example in the training of Google DeepMind's "AlphaZero". The classic "bandit" problem illustrates the basic issues: You are in a casino and, in each round, choose one of many slot machines ("one-armed bandits") to play. However, you do not know the payoff distributions of the machines. In the beginning, you will probably just try out the machines ("exploration") and then, after some learning, play the (apparently) best ones ("exploitation"). The difficulty is that if you play one machine a lot, you learn little or nothing about the others, and you may never find the best machine (the "exploration-exploitation dilemma"). So what should you do? In this course, we introduce the basic mathematical ideas and notation and examine algorithms for solving such problems. We consider mathematical methods for describing important concepts in reinforcement learning, such as bandit problems, Markov decision processes, and deep learning.
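As a rough illustration of the exploration-exploitation trade-off in the bandit problem, the following Python sketch simulates an epsilon-greedy strategy. This is one common heuristic and not necessarily the approach taken in the lecture. It assumes each slot machine pays 1 with a fixed probability and 0 otherwise; the function name run_epsilon_greedy, the payout probabilities, and all parameter values are hypothetical choices made for this example.

```python
import random


def run_epsilon_greedy(payout_probs, rounds=5000, epsilon=0.1, seed=0):
    """Simulate an epsilon-greedy player on Bernoulli slot machines.

    payout_probs: assumed true win probability of each machine
                  (hidden from the player, used only to draw rewards).
    epsilon:      probability of exploring a random machine in a round.
    """
    rng = random.Random(seed)
    n_arms = len(payout_probs)
    pulls = [0] * n_arms        # how often each machine was played
    estimates = [0.0] * n_arms  # running mean reward per machine
    total_reward = 0

    for _ in range(rounds):
        if rng.random() < epsilon:
            # Exploration: try a machine chosen uniformly at random.
            arm = rng.randrange(n_arms)
        else:
            # Exploitation: play the machine with the best estimate so far.
            arm = max(range(n_arms), key=lambda a: estimates[a])

        # Draw a 0/1 reward from the machine's hidden payout probability.
        reward = 1 if rng.random() < payout_probs[arm] else 0

        # Update the running mean for the machine that was played.
        pulls[arm] += 1
        estimates[arm] += (reward - estimates[arm]) / pulls[arm]
        total_reward += reward

    return pulls, estimates, total_reward


if __name__ == "__main__":
    pulls, estimates, total = run_epsilon_greedy([0.2, 0.5, 0.8])
    print("pulls per machine:", pulls)
    print("estimated payoffs:", [round(e, 2) for e in estimates])
    print("total reward:", total)
```

Setting epsilon to 0 makes the player purely greedy, so it can lock onto the first machine that happens to pay out; setting it to 1 makes the player explore forever without using what it has learned. Values in between trade one against the other.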
Recommended literature
Lecture notes and additional literature will be provided.