Search results: 2
No. | Article |
---|---|
1 | GLOBAL CONVERGENCE OF POLICY GRADIENT METHODS TO (ALMOST) LOCALLY OPTIMAL POLICIES. Zhang KQ, Koppel A, Zhu H, Basar T. SIAM Journal on Control and Optimization, 58(6), 3586, 2020 |
2 | Natural actor-critic algorithms. Bhatnagar S, Sutton RS, Ghavamzadeh M, Lee M. Automatica, 45(11), 2471, 2009 |