I recently ran into a situation where I had to initially train a neural network first on one dataset, save it and then load it up later to train it on a different dataset (or using a different training procedure). I implemented this in Tensorflow and thought I’d share a stripped down version of the script here as it could serve as an instructive example on the use of Tensorflow sessions. Note that this is not necessarily the best way of doing this, and it might indeed be simpler to load the original graph and train that graph itself by making its parameters trainable, or something else like that.
The script can be found here. In the first stage of this script (the pre-training stage) there is only a single graph which contains the randomly initialised and trained model. One might as well avoid explicitly defining a graph as Tensorflow’s default graph will be used for this purpose. This model (together with its parameters) is saved to a file and then loaded for the second re-training stage. In this second stage, there are two graphs. The first graph is loaded from the saved file and contains the pre-trained model whose parameters are the ones whose values we wish to assign to those of the second model before training the latter on a different dataset. The parameters of the second model are randomly initialised prior to this assignment step. In order for the assignment to work, I found it necessary to assign parameters across graphs and this could be done by saving the parameters of the first model as
numpy tensors and assigning the values of these
numpy tensors to the right parameters of the second model.