Mohammad Bagher Naghibi Sistani

Course Topics
#	Topic	Lesson plan (by topic)
1	What is Reinforcement Learning?
2	Introduction
3	Evaluative Feedback
4	The n-Armed Bandit Problem
5	The Reinforcement Learning Problem
6	The Reward Hypothesis
7	The Markov Property
8	Markov Decision Process
9	Value Functions
10	Dynamic Programming
11	Policy Iteration
12	Value Iteration
13	Asynchronous DP
14	Monte Carlo Methods
15	Random Walk Problem
16	On-policy and Off-policy
17	Temporal Difference Learning
18	TD-Learning vs. Monte Carlo Learning
19	Sarsa: On-Policy TD Control
20	Q-Learning: Off-Policy TD Control
21	Eligibility Traces
22	TD-Lambda
23	Presentation

Dr. Mohammad Bagher Naghibi Sistani