Constructing a Temporal Reservoir Machine hierarchy

The code for this example can be found at examples/trm_hierarchy_demo.py

In this example, a hierarchy of Conditional Restricted Boltzmann Machines (or Temporal Reservoir Machines) is constructed and trained on a trivial toy task. Between layers, Principal Component Analysis (PCA) is applied to the data that is fed to the reservoir in the next layer. The example demonstrates how the MDP/Oger functionality can be used to construct complex hierarchical models for temporal data.

As shown below, the architecture consists of six layers. The first layer contains both a ShiftNode, which shifts the data to the left or to the right in time, and a ReservoirNode. The outputs of these two nodes are concatenated and fed to the CRBMNode, which uses the reservoir's output as context data to model the data that arrives through the shift node. The activation patterns of the hidden units of the CRBM are fed to both another ShiftNode and a PCA node. In the next layer, the IdentityNode receives the shifted data while the reservoir receives the PCA-compressed data. The outputs of these two nodes are again fed to a CRBM layer in the same way as before. Finally, the activation patterns of the hidden units of this second CRBM are fed to a node that performs ridge regression.

Architecture of the hierarchy (first layer at the top):

Layer 1: ShiftNode    | ReservoirNode
Layer 2: CRBMNode
Layer 3: ShiftNode    | PCANode
Layer 4: IdentityNode | ReservoirNode
Layer 5: CRBMNode
Layer 6: RidgeRegressionNode
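
To make the data flow concrete, the following sketch tracks how many columns leave each layer. Only the input and output size (20) and the first reservoir size (300) come from this example; the CRBM, PCA and second reservoir sizes are placeholder values chosen for illustration, and the sketch assumes that a shift node with a single shift keeps the dimensionality of its input.

```python
# Sizes taken from the example.
input_dim = 20
reservoir1_dim = 300

# Hypothetical sizes, chosen only for illustration.
crbm1_size = 100
pca_dim = 10
reservoir2_dim = 300
crbm2_size = 100

# Each entry: (nodes in the layer, number of output columns).
layers = [
    ("ShiftNode | ReservoirNode", input_dim + reservoir1_dim),
    ("CRBMNode", crbm1_size),
    ("ShiftNode | PCANode", crbm1_size + pca_dim),
    ("IdentityNode | ReservoirNode", crbm1_size + reservoir2_dim),
    ("CRBMNode", crbm2_size),
    ("RidgeRegressionNode", input_dim),
]

for depth, (name, width) in enumerate(layers, start=1):
    print("Layer %d: %-30s -> %d columns" % (depth, name, width))
```

With these numbers the first CRBM sees 320 columns (20 visible plus 300 context), which matches the visible_dim and context_dim used further down.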

 

Constructing and training this architecture is done using the following steps:

First, the components of the first layer are constructed and combined in a SameInputLayer, which feeds the same input to both nodes and concatenates their outputs.

reservoir1 = Oger.nodes.ReservoirNode(input_dim=20, output_dim=300)
shift1 = Oger.nodes.ShiftNode(input_dim=20, n_shifts=1)
ReservoirLayer1 = mdp.hinet.SameInputLayer([shift1, reservoir1]) 
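
What this first layer computes can be sketched in plain NumPy. This is not the Oger implementation: the helper functions shift_rows and run_reservoir are hypothetical stand-ins, the shift is assumed to delay the signal by one step with zero padding, and the reservoir is a basic tanh network whose weights are scaled to an assumed spectral radius of 0.9.

```python
import numpy as np

rng = np.random.RandomState(0)

def shift_rows(u, n_shifts=1):
    """Delay u by n_shifts time steps; the first rows are zero-padded (assumption)."""
    shifted = np.zeros_like(u)
    shifted[n_shifts:] = u[:len(u) - n_shifts]
    return shifted

def run_reservoir(u, w_in, w):
    """Drive a simple tanh reservoir with u and return the state at every step."""
    states = np.zeros((len(u), w.shape[0]))
    state = np.zeros(w.shape[0])
    for step in range(len(u)):
        state = np.tanh(w_in.dot(u[step]) + w.dot(state))
        states[step] = state
    return states

input_dim, reservoir_dim = 20, 300
w_in = rng.uniform(-0.1, 0.1, size=(reservoir_dim, input_dim))
w = rng.randn(reservoir_dim, reservoir_dim)
w *= 0.9 / np.max(np.abs(np.linalg.eigvals(w)))  # rescale to spectral radius 0.9

u = rng.randn(50, input_dim)  # toy input: 50 time steps, 20 channels

# Both parts receive the same u; their outputs are joined column-wise.
x = np.hstack([shift_rows(u), run_reservoir(u, w_in, w)])
print(x.shape)
```

The first 20 columns of x hold the shifted data and the remaining 300 the reservoir states, which is the layout the CRBM is given in the next step.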

Second, the data is passed through this first layer, and a CRBM is trained on the result, one time step at a time, and added to the architecture's flow.

x = ReservoirLayer1(u)
    
crbmnode1 = Oger.nodes.CRBMNode(hidden_dim=crbm1_size, visible_dim=20, context_dim=300)

for epoch in range(epochs):
    for i in range(len(x) - 1):
        crbmnode1.train(x[i:i + 1, :], epsilon=.001, decay=.0002)

theflow = ReservoirLayer1 + crbmnode1
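
Two details of the training loop are easy to miss, and the sketch below illustrates them with a dummy array in place of the real layer output. The slice x[i:i + 1, :] keeps the time axis, so each training call receives a two-dimensional array with a single row, and range(len(x) - 1) leaves out the last row. The split into visible and context columns assumes that the columns follow the node order [shift1, reservoir1]; treat that ordering as an assumption about CRBMNode rather than a documented guarantee.

```python
import numpy as np

visible_dim, context_dim = 20, 300  # sizes from the example

# Dummy stand-in for the output of the first layer: 5 time steps, 320 columns.
x = np.arange(5 * (visible_dim + context_dim), dtype=float).reshape(5, -1)

row = x[2:3, :]  # slicing keeps the time axis: shape (1, 320)
flat = x[2]      # plain indexing drops it: shape (320,)

# Assumed column layout: visible units first, context units after them.
visible = row[:, :visible_dim]
context = row[:, visible_dim:]

# Row indices visited by the training loop; the last row is skipped.
visited = list(range(len(x) - 1))

print(row.shape, flat.shape, visible.shape, context.shape, visited)
```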

The next layer is again a SameInputLayer, but it now contains a shift node and a PCA node. The layer that follows it, which holds the second reservoir, is constructed as a normal hinet Layer, so that the IdentityNode and the reservoir each receive a different part of the input: the IdentityNode gets the shifted data and the reservoir gets the PCA-compressed data.

ReservoirLayer2 = mdp.hinet.Layer([identity, reservoir2])
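
The difference between the two layer types can be sketched without MDP. The functions same_input_layer and split_layer below are hypothetical helpers that mimic the behaviour of SameInputLayer and Layer respectively; the two toy nodes are plain functions, not real MDP nodes.

```python
import numpy as np

def same_input_layer(nodes, x):
    """Every node sees all of x; the outputs are joined column-wise."""
    return np.hstack([node(x) for node in nodes])

def split_layer(nodes_with_dims, x):
    """Each node sees only its own consecutive block of columns of x."""
    outputs, start = [], 0
    for node, input_dim in nodes_with_dims:
        outputs.append(node(x[:, start:start + input_dim]))
        start += input_dim
    return np.hstack(outputs)

# Toy stand-ins for real nodes (hypothetical, for illustration only).
def identity(x):
    return x

def row_sum(x):
    return x.sum(axis=1, keepdims=True)

x = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])

# Both nodes receive all three columns.
same = same_input_layer([identity, row_sum], x)
# identity receives columns 0-1, row_sum receives column 2 only.
split = split_layer([(identity, 2), (row_sum, 1)], x)

print(same)   # rows: [1, 2, 3, 6] and [4, 5, 6, 15]
print(split)  # rows: [1, 2, 3] and [4, 5, 6]
```

In the hierarchy, the split variant is what lets the shifted data pass through unchanged while only the PCA-compressed columns drive the second reservoir.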

A second CRBM is trained on the output of this last layer and added to the architecture's flow as well. Finally, a RidgeRegressionNode is trained on the hidden activations of the second CRBM and added to the flow.

# Here x holds the hidden activations of the second CRBM and t the targets.
readout = Oger.nodes.RidgeRegressionNode(ridge_param=.0000001,
                                         input_dim=crbm2_size, output_dim=20)
readout.train(x, t)
readout.stop_training()

theflow += readout
theflow = mdp.hinet.FlowNode(theflow)
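
For readers unfamiliar with the readout, the sketch below shows the closed-form ridge regression solution in NumPy. It is a minimal illustration, not the RidgeRegressionNode implementation: the function ridge_fit is hypothetical, no bias term is included, and random data stands in for the CRBM hidden activations.

```python
import numpy as np

def ridge_fit(x, t, ridge_param):
    """Solve (X^T X + ridge_param * I) W = X^T T for the weight matrix W."""
    n_features = x.shape[1]
    gram = x.T.dot(x) + ridge_param * np.eye(n_features)
    return np.linalg.solve(gram, x.T.dot(t))

rng = np.random.RandomState(0)
x = rng.randn(200, 5)      # stand-in for the hidden activations
w_true = rng.randn(5, 2)   # weights used to generate the toy targets
t = x.dot(w_true)          # noise-free targets

w = ridge_fit(x, t, ridge_param=1e-7)
y = x.dot(w)               # predictions of the fitted readout

print(np.max(np.abs(y - t)))
```

A small ridge_param such as the one used in this example barely changes the least-squares solution; larger values shrink the weights and trade some fit for robustness to noise.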

Passing data (in this case the variable 'u') through the complete architecture to obtain predictions now takes a single call.

y = theflow(u)