Oger tutorial

Overview

Oger is a Python toolbox for rapidly building, training and evaluating hierarchical learning architectures on large datasets. It builds functionality on top of the Modular toolkit for Data Processing (MDP). Please read the MDP tutorial before continuing here. 

In MDP, the central concept is the node, an elementary machine-learning or signal processing block. MDP includes a wide variety of algorithms out of the box (see the full list here). Nodes can be trainable, either unsupervised (e.g. clustering) or supervised (e.g. classification or regression), or non-trainable (e.g. filters, preprocessing). Nodes can then be combined into graph-like architectures by concatenating them into what is called a flow.
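
As a quick illustration of these two concepts, here is a minimal MDP example (a sketch using standard MDP nodes; it assumes numpy is installed):

import mdp
import numpy as np

x = np.random.rand(1000, 20)  # 1000 samples, 20 features

# Concatenating nodes with + yields an mdp.Flow: here a trainable node
# (PCA) followed by a non-trainable one (additive noise).
flow = mdp.nodes.PCANode(output_dim=5) + mdp.nodes.NoiseNode()
flow.train(x)  # a single array is used as training data for all nodes
y = flow(x)    # execute the flow on the input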

Oger packages many machine learning algorithms, but its main focus is on sequence processing. In particular, many methods from the field of Reservoir Computing are available in the toolbox. Reservoir Computing is a general computational framework in which a non-linear dynamical network of nodes (such as a recurrent neural network), called the reservoir, is randomly created and left untrained. The response of the reservoir to the input is then used to train a simple algorithm (usually a linear method) to produce the desired output.
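
To make the idea concrete, here is a minimal plain-numpy sketch of a reservoir (illustration only, not Oger's implementation; the weight ranges and the spectral radius of 0.9 are arbitrary choices):

import numpy as np

n_in, n_res, T = 1, 100, 1000
rng = np.random.RandomState(0)
W_in = rng.uniform(-0.5, 0.5, (n_res, n_in))  # input weights, never trained
W = rng.uniform(-1, 1, (n_res, n_res))        # recurrent weights, never trained
W *= 0.9 / max(abs(np.linalg.eigvals(W)))     # rescale spectral radius to 0.9

u = rng.uniform(-1, 1, (T, n_in))             # white-noise input timeseries
states = np.zeros((T, n_res))
x = np.zeros(n_res)
for t in range(T):
    x = np.tanh(W_in.dot(u[t]) + W.dot(x))    # non-linear state update
    states[t] = x
# 'states' is the reservoir response; only a simple readout (e.g. linear
# regression from states to targets) is ever trained.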

Getting started

Using Oger in your code is as simple as: 

import Oger

This loads all methods and Oger nodes into the namespace Oger. The functions and classes are further split into the following subpackages:

  • Oger.datasets: several common benchmark datasets.
  • Oger.evaluation: evaluation and optimization of flows.
  • Oger.nodes: learning algorithms and signal processing nodes.
  • Oger.gradient: gradient descent training.
  • Oger.parallel: parallelization.
  • Oger.utils: utility functions and classes.

A typical experiment consists of generating the dataset, constructing your flow (the learning architecture) by concatenating nodes into a feedforward graph-like structure, and then training and applying it, or performing some optimization or parameter sweep.

A simple experiment

You can create a node by simply instantiating it. For example:

resnode = Oger.nodes.ReservoirNode(output_dim=100)

This creates a reservoir node of 100 neurons. Let's create a ridge regression readout node:

readoutnode = Oger.nodes.RidgeRegressionNode()
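
Ridge regression fits a linear readout with L2 regularization. Conceptually, the readout weights solve the regularized least-squares problem below (a plain-numpy sketch of the textbook closed form, not Oger's implementation):

import numpy as np

def ridge_fit_sketch(states, targets, ridge=1e-6):
    # closed-form ridge regression: W = (S^T S + ridge * I)^-1 S^T Y
    S, Y = states, targets
    return np.linalg.solve(S.T.dot(S) + ridge * np.eye(S.shape[1]), S.T.dot(Y))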

Flows can be easily created from nodes as follows:

flow = resnode + readoutnode 

For more examples of how to construct flows, including more complex architectures, please see the MDP tutorial. Next, we can generate data from one of the built-in datasets: a 30th-order nonlinear autoregressive moving average (NARMA) system. The input to the system is uniform white noise; the output is given by the NARMA system.

x, y = Oger.datasets.narma30()

By default, this returns two lists of ten one-dimensional timeseries of 1000 timesteps each: the inputs and the corresponding outputs. Please see the API documentation for the arguments to this and other functions.
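
For reference, a 30th-order NARMA system is commonly written as the recurrence sketched below in plain numpy. The coefficients shown are ones frequently used in the literature and may differ from the exact constants in Oger.datasets.narma30, so check its source:

import numpy as np

def narma30_sketch(T=1000, seed=0):
    rng = np.random.RandomState(seed)
    u = rng.uniform(0, 0.5, T)  # uniform white-noise input
    y = np.zeros(T)
    for k in range(30, T - 1):
        y[k + 1] = (0.2 * y[k]
                    + 0.004 * y[k] * np.sum(y[k - 29:k + 1])
                    + 1.5 * u[k - 29] * u[k]
                    + 0.001)
    # return 2D column vectors, the format MDP/Oger nodes expect
    return u.reshape(-1, 1), y.reshape(-1, 1)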

Flows are trained in a feedforward way, node by node, starting from the front. In our case, the first node (the reservoir) is not trainable, so we pass None instead of training data for it. To train the second node (the readout), we provide a list of input-output tuples, conveniently generated by the zip function. We will train on the first nine timeseries and keep the last one separate for testing later on. This gives the following dataset for training the flow:

data = [None, zip(x[0:-1], y[0:-1])]  # None: no training data for the reservoir

We can now train the flow to reproduce the output given the input like so:

flow.train(data)  

We can now see how our trained architecture performs on unseen data:

from pylab import plot, show  # plotting via matplotlib's pylab interface
plot(flow(x[-1]))
plot(y[-1])
show()

You should see a rough match, but not yet a very good one. This is because we used the default parameters for the reservoir. The scaling of the input weights (given by the argument input_scaling) is an important parameter, so let's see if we can optimize it. We define the range over which we want to scan this parameter as follows:

import mdp  # Oger builds on MDP, whose arange we use here

gridsearch_parameters = {resnode: {'input_scaling': mdp.numx.arange(0.1, 0.5, 0.1)}}

This is a dictionary where the keys are nodes in the flow, and the values are again dictionaries. These inner dictionaries have parameter names as keys and iterables as values. So in this case, we scan the input_scaling parameter of our resnode over the values 0.1 to 0.4 in steps of 0.1 (note that arange excludes its endpoint, 0.5).
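
You can verify which grid points will be visited:

print(mdp.numx.arange(0.1, 0.5, 0.1))  # [ 0.1  0.2  0.3  0.4]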

Let's try to optimize it using the Optimizer class. When creating an optimizer, we need to tell it which parameter space we want to explore (as defined above), as well as the loss function. For this problem the normalized RMSE (NRMSE) is suitable. This gives:

opt = Oger.evaluation.Optimizer(gridsearch_parameters, Oger.utils.nrmse)
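
For reference, the NRMSE is simply the RMSE normalized by the spread of the target. A minimal sketch (conventions vary between normalizing by the standard deviation or the variance of the target, so see Oger.utils.nrmse for the exact definition the toolbox uses):

import numpy as np

def nrmse_sketch(output, target):
    # RMSE of the prediction, normalized by the target's standard deviation
    return np.sqrt(np.mean((output - target) ** 2)) / np.std(target)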

We can now optimize this flow on the given dataset, using 5-fold cross-validation and a brute-force grid search of the (small) parameter space:

opt.grid_search(data, flow, cross_validate_function=Oger.evaluation.n_fold_random, n_folds=5)
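
Once the sweep has finished, the Optimizer stores the error for each point in the grid. The minimal error and the corresponding parameter values can then be retrieved, and the sweep visualized. The method names below are assumptions; if they are not present in your version of Oger, consult the API documentation:

min_error, parameters = opt.get_minimal_error()  # assumed API; see the docs
opt.plot_results()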