Model Setup & Training¶
Assuming your data is in the form of numpy.ndarray stored in the variables X_train and y_train, you can train a sknn.mlp.Regressor neural network. The input and output arrays are continuous values in this case, but it’s best if you normalize or standardize your inputs to the [-1..1] range. (See the sklearn Pipeline example below.)
from sknn.mlp import Regressor, Layer

nn = Regressor(
    layers=[
        Layer("Rectifier", units=100),
        Layer("Linear")],
    learning_rate=0.02,
    n_iter=10)
nn.fit(X_train, y_train)
This will train the regressor for 10 epochs (specified via the n_iter parameter). The layers parameter specifies how the neural network is structured; see the sknn.mlp.Layer documentation for supported layer types and parameters.
Then you can use the trained NN as follows:
y_example = nn.predict(X_example)
This will return a new numpy.ndarray containing the network’s feed-forward estimates for the given input features.
If your data, stored in a numpy.ndarray, contains integer labels as outputs and you want to train a neural network to classify the data, use the following snippet:
from sknn.mlp import Classifier, Layer

nn = Classifier(
    layers=[
        Layer("Maxout", units=100, pieces=2),
        Layer("Softmax")],
    learning_rate=0.001,
    n_iter=25)
nn.fit(X_train, y_train)
It’s a good idea to normalize or standardize your data in this case too, for example using a sklearn Pipeline. The code here will train for 25 iterations. Note that a Softmax output layer activation type is used here; it’s recommended as the default for classification problems.
If you want to do multi-label classification, simply fit using a y array of integers that has multiple dimensions, e.g. shape (N, 3) for three different classes. Then, make sure the last layer is a Sigmoid activation, so each label’s probability is estimated independently.
y_example = nn.predict(X_example)
This code will run the classification with the neural network and return a list of labels predicted for each of the example inputs. If you need to access the probabilities for the predictions, use predict_proba() and check the classes_ property, which provides the label corresponding to each probability column.
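As a sketch of how classes_ lines up with the probability columns, assuming predict_proba() returns an array of shape (N, n_classes) with one column per entry in classes_ (the arrays below are hypothetical values, not output from a real network):

```python
import numpy as np

# Hypothetical values standing in for nn.classes_ and nn.predict_proba(X).
classes_ = np.array([0, 1, 2])
proba = np.array([[0.1, 0.7, 0.2],   # probabilities for example 1
                  [0.8, 0.1, 0.1]])  # probabilities for example 2

# Column with the highest probability, mapped back to its class label.
labels = classes_[np.argmax(proba, axis=1)]
print(labels)  # [1 0]
```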
When working with images as inputs, stored as 2D (greyscale) or 3D (RGB) numpy.ndarray, you can use convolution to train a neural network with shared weights. Here’s an example of how classification would work:
from sknn.mlp import Classifier, Convolution, Layer

nn = Classifier(
    layers=[
        Convolution("Rectifier", channels=8, kernel_shape=(3,3)),
        Layer("Softmax")],
    learning_rate=0.02,
    n_iter=5)
nn.fit(X_train, y_train)
The neural network here is trained with eight kernels of shared weights in a 3x3 matrix, each outputting to its own channel. The rest of the code remains the same; see the sknn.mlp.Layer documentation for supported convolution layer types and parameters.
When training a classifier with data that has unbalanced labels, it’s useful to adjust the weight of the different training samples to prevent bias. This is achieved via a feature called masking. You can specify the weights of each training sample when calling the fit() function:
import numpy

w_train = numpy.ones(y_train.shape)
w_train[y_train == 0] = 1.2
w_train[y_train == 1] = 0.8

nn.fit(X_train, y_train, w_train)
In this case, there are two classes: 0 is given weight 1.2, and 1 is given weight 0.8. This feature also works for regressors.
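Rather than hand-picking the weights, one common sketch is to derive them from inverse class frequencies, so rarer classes get proportionally larger weights (the y_train labels below are made up for illustration):

```python
import numpy as np

# Hypothetical unbalanced labels: class 0 appears 3 times, class 1 twice.
y_train = np.array([0, 0, 0, 1, 1])

# Inverse-frequency weights, normalized so the average weight is 1.0.
counts = np.bincount(y_train)
w_train = (len(y_train) / (len(counts) * counts.astype(float)))[y_train]
print(w_train)  # the rarer class 1 gets the larger weight
```

The resulting w_train can then be passed as the third argument to fit(), exactly as in the snippet above.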
Native & Custom Layers¶
In case you want to use more advanced features not directly supported by scikit-neuralnetwork, you can use so-called sknn.nn.Native layers that are handled directly by the backend. This allows you to use all features from the Lasagne library, for example.
from lasagne import layers as lasagne, nonlinearities as nl
from sknn.mlp import Classifier, Layer, Native

nn = Classifier(layers=[
    Native(lasagne.DenseLayer, num_units=256, nonlinearity=nl.leaky_rectify),
    Layer("Linear")])
When you insert a Native specification into the layers list, the first parameter is a constructor or class type that builds an object to insert into the network. In the example above, it’s lasagne.layers.DenseLayer. The keyword parameters (e.g. nonlinearity) are passed to this constructor dynamically when the network is initialized.
You can use this feature to implement recurrent layers like LSTM or GRU, and any other features not directly supported. Keep in mind that this may affect compatibility in future releases, and also may expose edge cases in the code (e.g. serialization, determinism).