Prerequisites
UtilsRL version when proposing this request
0.3.13
What I am expecting
Full support for RNN backends. Currently (v0.3.13), actors and critics in UtilsRL take a `backend` typed as `nn.Module`, which in theory decouples feature extraction from decision making. However, when an RNN is used to extract features, actors and critics often need extra inputs in addition to the RNN's output. For example, in MAPLE, the RNN output $o_t$ is computed from previous state-action pairs, and the actor needs both $s_t$ and $o_t$ as input.
We need a unified design that works whether `backend` is an MLP, a CNN, or an RNN.
Possible solutions
- Check the output dimension of `backend` and compare it against the input dimension of `output_layer`. If the two don't match, concatenate the extra inputs (such as $s_t$) to the output of `backend` along dimension -1.
- Create separate classes such as `RecurrentActor` and `RecurrentCritic`, as Tianshou does.
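
To make the first option concrete, here is a minimal sketch in PyTorch. It is only an illustration of the idea, not UtilsRL's actual API: the class name `ConcatActor` and the parameters `backend_dim`, `extra_dim`, and `action_dim` are hypothetical, and I am assuming the RNN backend is something like `nn.GRU` that returns an `(output, hidden)` tuple.

```python
import torch
from torch import nn


class ConcatActor(nn.Module):
    """Hypothetical actor for illustration only; not part of UtilsRL."""

    def __init__(self, backend: nn.Module, backend_dim: int,
                 extra_dim: int, action_dim: int) -> None:
        super().__init__()
        self.backend = backend
        # output_layer expects the backend features plus any extra inputs.
        self.output_layer = nn.Linear(backend_dim + extra_dim, action_dim)

    def forward(self, backend_input, extra=None):
        feature = self.backend(backend_input)
        # nn.GRU / nn.LSTM return (output, hidden); keep only the output.
        if isinstance(feature, tuple):
            feature = feature[0]
        # If the dimensions don't match, concatenate the extra inputs
        # (e.g. s_t) to the backend output along the last dimension.
        if feature.shape[-1] != self.output_layer.in_features:
            if extra is None:
                raise ValueError("backend output does not match output_layer "
                                 "input and no extra input was given")
            feature = torch.cat([feature, extra], dim=-1)
        return self.output_layer(feature)


state_dim, action_dim, hidden_dim = 3, 2, 16

# RNN backend: o_t is computed from the state-action history, s_t is passed as extra.
rnn = nn.GRU(input_size=state_dim + action_dim, hidden_size=hidden_dim,
             batch_first=True)
actor = ConcatActor(rnn, backend_dim=hidden_dim, extra_dim=state_dim,
                    action_dim=action_dim)
history = torch.randn(4, 10, state_dim + action_dim)  # (batch, time, s + a)
states = torch.randn(4, 10, state_dim)                # (batch, time, s)
actions = actor(history, extra=states)                # (batch, time, action_dim)

# MLP backend: dimensions already match, so nothing is concatenated.
mlp = nn.Sequential(nn.Linear(state_dim, hidden_dim), nn.ReLU())
mlp_actor = ConcatActor(mlp, backend_dim=hidden_dim, extra_dim=0,
                        action_dim=action_dim)
mlp_actions = mlp_actor(torch.randn(4, state_dim))    # (batch, action_dim)
```

With this approach the same class covers both backends, at the cost of an implicit shape check inside `forward`. The second option (dedicated recurrent classes) would make the extra input explicit in the signature instead.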
Any additional messages which might help
No response
Urgency
Urgent: this will bring significant improvement and should be considered for the next main version.