reinforcement-learning
Here are 7,958 public repositories matching this topic...
Bidirectional RNN
Is there a way to train a bidirectional RNN (like LSTM or GRU) on trax nowadays?
The following applies to DDPG and TD3, and possibly other models. The following libraries were installed in a virtual environment:
numpy==1.16.4
stable-baselines==2.10.0
gym==0.14.0
tensorflow==1.14.0
Episode rewards do not seem to be updated in model.learn() before callback.on_step(). Depending on which callback.locals variable is used, this means that:
- episode rewards may n
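The ordering problem described above can be illustrated with a toy loop. This is only a sketch in plain Python, not stable-baselines code: `toy_learn`, `RecordingCallback`, and the `episode_reward` key are hypothetical names, and the loop simply assumes that the callback is invoked before the running episode reward is incremented, as the report suggests.

```python
class RecordingCallback:
    """Hypothetical callback that records the episode reward it sees at each step."""

    def __init__(self):
        self.seen = []

    def on_step(self, local_vars):
        # Read the value exposed by the training loop at call time.
        self.seen.append(local_vars["episode_reward"])
        return True


def toy_learn(rewards, callback, update_before_callback):
    """Toy training loop (an assumption for illustration, not the library's loop)."""
    episode_reward = 0.0
    for reward in rewards:
        if update_before_callback:
            episode_reward += reward
        # The callback only sees whatever has been accumulated so far.
        callback.on_step({"episode_reward": episode_reward})
        if not update_before_callback:
            episode_reward += reward
    return episode_reward


# Callback invoked before the update: it sees values that lag one step behind.
stale = RecordingCallback()
toy_learn([1.0, 2.0, 3.0], stale, update_before_callback=False)

# Callback invoked after the update: it sees the current running total.
fresh = RecordingCallback()
toy_learn([1.0, 2.0, 3.0], fresh, update_before_callback=True)
```

In this sketch, `stale.seen` lags `fresh.seen` by one step, which is the kind of discrepancy a callback would observe if it read a reward variable that had not yet been updated.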
Description
There are multiple user requests to use GraphNN data (node and edge lists) as sample batches in a custom RLlib model.
https://discuss.ray.io/t/rllib-variable-length-observation-spaces-without-padding/726
https://discuss.ray.io/t/working-with-graph-neural-networks-varying-state-space/5730/2
The recommended method today is to use Repeated observation space and VariableVal