Channel: Why does REINFORCE perform badly at first in Sutton and Barto Figure 13.1? - Artificial Intelligence Stack Exchange

Answer by gwtw14 for Why does REINFORCE perform badly at first in Sutton and...

I'm actually working on this example too, implemented the REINFORCE algorithm, and got the same result as you. My only guess is that the authors chose a different initial $\theta$ value to show the...
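One way to probe the initial-$\theta$ guess is to look at how strongly the start-state value depends on the probability of choosing "right". The sketch below is not taken from the answer; it assumes the dynamics of Example 13.1 (three non-terminal states that look identical to the agent, reward of -1 per step, actions reversed in the second state, "left" in the first state causing no movement) and a two-action softmax policy over preferences. The names `start_state_value` and `p_right_from_theta` are hypothetical, and the closed form comes from solving the Bellman equations under those assumptions.

```python
import math


def start_state_value(p_right: float) -> float:
    """Expected return from the start state of the short corridor when
    'right' is chosen with probability p_right in every state.

    Closed form from solving the three Bellman equations under the
    assumed dynamics: v(start) = (2p - 4) / (p * (1 - p)).
    """
    if not 0.0 < p_right < 1.0:
        raise ValueError("p_right must be strictly between 0 and 1")
    return (2.0 * p_right - 4.0) / (p_right * (1.0 - p_right))


def p_right_from_theta(theta_right: float, theta_left: float) -> float:
    """Softmax over two action preferences (assumed parameterization)."""
    return 1.0 / (1.0 + math.exp(theta_left - theta_right))


if __name__ == "__main__":
    # Compare a few action probabilities, including near-deterministic ones.
    for p in (0.05, 0.5, 0.59, 0.95):
        print(f"p_right={p:.2f}  v(start)={start_state_value(p):8.2f}")
```

Under these assumptions, equal preferences give a "right" probability of 0.5 and a start-state value of -12, while a policy that chooses "right" only 5% of the time gives roughly -82. An initial $\theta$ that strongly favors "left" would therefore start the learning curve far below one that begins with equal preferences.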

Why does REINFORCE perform badly at first in Sutton and Barto Figure 13.1?

In Sutton and Barto (PDF, page 265), 2nd edition, Figure 13.1 applies REINFORCE to the "short corridor with switched actions" environment from Example 13.1. The figure looks like this: My question is,...
