Information System of Kiel University

Semester: SS 2024

Mathematics of Reinforcement Learning (MathReinf) (060329)

Lecturer

Prof. Dr. Sören Christensen

Details

Lecture, 2 SWS (semester hours per week)
In-person course, language of instruction: English
Time and place: Wed 12:15 - 13:45, HHP6 - R.EG.001; Fri 8:15 - 9:45, HHP6 - R.EG.001
from 17.4.2024 to 31.5.2024

Prerequisites / Organizational information

Apart from the basics of stochastics, no special prior knowledge is required.
Module handbook:
https://www.math.uni-kiel.de/de/studium_und_lehre/studienverlauf-module
Module code: mathAKdS5-01a:
https://www.math.uni-kiel.de/de/studium_und_lehre/studienverlauf-module/module/mathAKdS5-01a.pdf
Module code: mathAKdF-01a:
https://www.math.uni-kiel.de/de/studium_und_lehre/studienverlauf-module/module/mathAKdF-01a,.pdf
Module code: mathAKdW-01a:
https://www.math.uni-kiel.de/de/studium_und_lehre/studienverlauf-module/module/mathAKdW-01a.pdf
Module code: mathAKaNuF-01a, 5 LP (credit points):
https://www.math.uni-kiel.de/de/studium_und_lehre/studienverlauf-module/module/mathAKaNuF-01a.pdf
Target group:
Single-subject Master's in Mathematics
Two-subject Master's in Mathematics
Single-subject Master's in Financial Mathematics
Link to website:
https://lms.uni-kiel.de/url/RepositoryEntry/5434671278

Content

Reinforcement learning refers to a family of machine learning methods in which future decisions are based on past successes and failures. The decision maker is assumed not to know the underlying environment (exactly). Such methods play a central role in many modern applications, for example in the training of Google DeepMind's "AlphaZero". The classic "bandit" problem illustrates the basic issues: you are in a casino and, in each round, want to choose one of many slot machines ("one-armed bandits"), but you do not know the payoff distributions of the machines. At first you will probably just try the machines out ("exploration") and then, after some learning, play the ones that appear best ("exploitation"). The difficulty is that if you play one machine a lot, you learn little about the others and may never find the best machine (the "exploration-exploitation dilemma"). So what should you do? In this course, we introduce the basic mathematical ideas and notation and examine algorithms for solving such problems. We consider mathematical methods for describing important concepts in reinforcement learning, such as bandit problems, Markov decision processes, and deep learning.
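
The trade-off between trying machines and playing the apparently best one can be illustrated with a small simulation. The sketch below uses the epsilon-greedy rule, one simple and well-known heuristic. It is not taken from the course material, and the function name, parameters, and payout probabilities are hypothetical choices made for illustration only.

```python
import random


def run_epsilon_greedy(payout_probs, rounds=5000, epsilon=0.1, seed=0):
    """Play a Bernoulli bandit with the epsilon-greedy rule.

    payout_probs holds the true win probabilities (an assumption of this
    sketch). The player never reads them directly; they are only used to
    simulate the machines.
    """
    rng = random.Random(seed)
    n_arms = len(payout_probs)
    pulls = [0] * n_arms        # how often each machine was played
    estimates = [0.0] * n_arms  # running average payoff per machine
    total_reward = 0

    for _ in range(rounds):
        if rng.random() < epsilon:
            # exploration: try a machine chosen uniformly at random
            arm = rng.randrange(n_arms)
        else:
            # exploitation: play the machine that currently looks best
            arm = max(range(n_arms), key=lambda a: estimates[a])
        reward = 1 if rng.random() < payout_probs[arm] else 0
        pulls[arm] += 1
        # incremental update of the sample mean for the played machine
        estimates[arm] += (reward - estimates[arm]) / pulls[arm]
        total_reward += reward

    return pulls, estimates, total_reward


if __name__ == "__main__":
    pulls, estimates, total = run_epsilon_greedy([0.2, 0.5, 0.7])
    print("pulls per machine:", pulls)
    print("estimated payoffs:", [round(e, 3) for e in estimates])
    print("total reward:", total)
```

With epsilon set to 0 the player never explores deliberately and can get stuck on an inferior machine; with epsilon set to 1 the player never uses what has been learned. Values in between trade one against the other.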

Recommended literature

Lecture notes and additional literature will be provided.

Additional information

Expected number of participants: 15

Associated courses

UE: Exercise class for Mathematics of Reinforcement Learning (060063): Lecturer: Prof. Dr. Sören Christensen
Time and place: by arrangement


The current lecture notes are available only on OLAT, the central learning platform of CAU.
