JDG Lab Notebook

Running TVB simulations on the grid

Overview

The following are notes on using the TVB scripting interface in conjunction with a cluster or grid computing setup. Note that this is independent of the cluster distribution solutions that are available as part of the TVB web interface (i.e. the GUI), and I am not aiming to make any direct comparisons between the two.

The following considerations are addressed:

  1. Installation of TVB on compute nodes
  2. Programmatic production of job files for a given set of varied parameters
  3. Job submission and recovery of outputs

Obviously, each of these represents a specific technical solution to the specific use cases and IT setup I am currently working with. Hopefully some of it will be useful and generalizable. Other people in other places (particularly Marseille and Berlin) have of course developed their own solutions to similar problems; I have not yet compared mine against these, as I don't think it's really necessary. There will certainly be room for improvement in terms of efficiency, performance, simplicity, and elegance. This is the first time I have tried to scale up the parallelization of my TVB simulations beyond multiple threads and cores on a single machine.

Setting up TVB on Centos 6.7

Summary

Normally, and by preference, I work in Ubuntu or Debian environments. However the SGE grid at Baycrest consists of ~90 compute nodes running Redhat or Centos. TVB isn't officially tested for Redhat or Centos, but things seem to be working ok so far.

The setup procedure I have come up with involves a fresh Anaconda (miniconda) Python 2.7 installation, followed by installing the requisite python libraries for TVB with a combination of conda and pip package managers. The TVB external geodesic library is then installed from source (this is optional unless one is running surface simulations), and then finally tvb-library and tvb-data simply need to be added to the user's PYTHONPATH; they don't need to be actually installed.

Install Anaconda (miniconda)

Download the installer script:

wget https://repo.continuum.io/miniconda/Miniconda-latest-Linux-x86_64.sh

Install to /usr/local

sh Miniconda-latest-Linux-x86_64.sh -b -p /usr/local/miniconda

Add to PATH to make this python installation the preferred one for the current session

export PATH=/usr/local/miniconda/bin:$PATH

Install pip

conda install pip

Install TVB Dependencies

The tvb dependencies listed in the mkenv file in the scripts folder of the tvb-pack github repo are:

cherrypy formencode sqlalchemy sqlalchemy-migrate genshi simplejson cfflib networkx nibabel apscheduler mod_pywebsocket psutil minixsv h5py BeautifulSoup mplh5canvas

Unfortunately we can't get all of these installed with either pip or conda alone. In general, conda copes better with core low-level numerical libraries, but it doesn't have as wide a range of packaged libraries as pip. Things are also less likely to break in an anaconda installation if conda is used rather than pip. So the approach here is to use conda as much as possible, and pip only when a library isn't available from conda.
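
The conda-first, pip-second rule can be sketched in Python as follows. This is only an illustration: split_install_commands and CONDA_AVAILABLE are hypothetical names, and the set simply hard-codes which of the mkenv packages conda could provide on my system, so it would need checking against whatever channels are in use elsewhere.

```python
# Sketch: split a dependency list into one conda command and one pip command.
# CONDA_AVAILABLE is an assumption (hard-coded from what worked on my system).

TVB_DEPS = ["cherrypy", "formencode", "sqlalchemy", "sqlalchemy-migrate",
            "genshi", "simplejson", "cfflib", "networkx", "nibabel",
            "apscheduler", "mod_pywebsocket", "psutil", "minixsv", "h5py",
            "BeautifulSoup", "mplh5canvas"]

CONDA_AVAILABLE = set(["cherrypy", "simplejson", "networkx", "psutil", "h5py"])


def split_install_commands(deps, conda_available):
    """Return install commands, preferring conda and falling back to pip."""
    conda_pkgs = [d for d in deps if d in conda_available]
    pip_pkgs = [d for d in deps if d not in conda_available]
    commands = []
    if conda_pkgs:
        commands.append("conda install " + " ".join(conda_pkgs))
    if pip_pkgs:
        commands.append("pip install " + " ".join(pip_pkgs))
    return commands


# Print the commands rather than running them, so they can be inspected first
for command in split_install_commands(TVB_DEPS, CONDA_AVAILABLE):
    print(command)
```

Printing rather than executing the commands is deliberate: it keeps the installation steps visible and easy to paste into a setup script for the compute nodes.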

These are the conda libraries:

conda install cherrypy simplejson networkx psutil h5py matplotlib

...to which I have the following additions:

conda install ipython pandas seaborn numexpr cython scikit-learn traits

Now the pip libraries:

pip install formencode sqlalchemy sqlalchemy-migrate genshi cfflib nibabel apscheduler mod_pywebsocket minixsv BeautifulSoup mplh5canvas

...to which I have the following additions:

pip install nilearn nipype

Note on mayavi:

There is, I think, an open question about whether to try to install and use mayavi. Mayavi is needed for some TVB visualization routines, and you get annoying error messages on some basic imports if it isn't installed. I have looked into this a little, in part because of TVB and in part because I would like to use Pysurfer.

In a nutshell: conda manages to install mayavi ok (no mean feat; in my experience on other systems, pip almost invariably fails at this without some help).

However if this is done after installing other libraries, it appears to want to change the numpy version, which then breaks the tvb installation. So I think the best way to do this, if desired, is to do conda install mayavi at the very start. This would actually obviate the need for a number of the installation calls listed above, as conda will install them along with mayavi.

However, since the above definitely works, and the mayavi option hasn't been fully tested, and moreover seeing as I'm not sure if we can actually do proper mayavi visualizations on the grid compute nodes anyway, I will leave this just as a note for now.

Download and install TVB

Now we grab the tvb repositories from github. The only one we need to actually install is tvb-geodesic; and that is only essential if we are doing surface simulations.

First, we might need to install git and gcc:

yum install git gcc gcc-c++

Now the TVB stuff:

mkdir /usr/local/tvb_scientific

cd /usr/local/tvb_scientific

git clone https://github.com/the-virtual-brain/tvb-geodesic 

git clone https://github.com/the-virtual-brain/tvb-library 

git clone https://github.com/the-virtual-brain/tvb-data 

git clone https://github.com/the-virtual-brain/tvb-framework 

git clone https://github.com/the-virtual-brain/tvb-documentation 

cd tvb-geodesic

python setup.py install

cd ~/

Test TVB

To use TVB we first need to add the libraries to the PYTHONPATH. This can be done either in the bash shell:

for d in  tvb-library tvb-framework tvb-data tvb-documentation;
do
  export PYTHONPATH=/usr/local/tvb_scientific/$d:$PYTHONPATH
done

Or alternatively do it from inside python:

import sys
tvb_path = '/usr/local/tvb_scientific/'
tvb_repos = ['tvb-library', 'tvb-framework', 'tvb-data', 'tvb-documentation']
addpaths = [tvb_path + r for r in tvb_repos]
# extend, not append: append would add the whole list as a single (useless) entry
sys.path.extend(addpaths)

Now TVB should be ready to use.
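
A slightly more defensive variant of the in-python approach is sketched below. It only adds directories that actually exist and reports the ones that don't, which helps when a compute node turns out not to see the same filesystem. The function name add_repo_paths is hypothetical; the paths are the ones used above.

```python
import os
import sys


def add_repo_paths(base_dir, repo_names, path_list=None):
    """Prepend existing repo directories to path_list (default: sys.path).

    Returns the list of directories that were not found.
    """
    if path_list is None:
        path_list = sys.path
    missing = []
    for name in repo_names:
        repo_dir = os.path.join(base_dir, name)
        if not os.path.isdir(repo_dir):
            missing.append(repo_dir)
        elif repo_dir not in path_list:
            # Put the repo at the front so it takes precedence
            path_list.insert(0, repo_dir)
    return missing


not_found = add_repo_paths('/usr/local/tvb_scientific',
                           ['tvb-library', 'tvb-framework',
                            'tvb-data', 'tvb-documentation'])
for repo_dir in not_found:
    print('Not found: ' + repo_dir)
```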

We can do some very quick checks of this from a python or ipython shell:

ipython

from tvb.simulator.lab import *
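
When setting up many nodes it can be handy to run the same kind of check non-interactively. The following is a minimal sketch of that idea, not anything provided by TVB: check_imports is a hypothetical helper, and the list of module names is just an example.

```python
import importlib


def check_imports(module_names):
    """Try to import each module; return a dict of name -> True/False."""
    status = {}
    for name in module_names:
        try:
            importlib.import_module(name)
            status[name] = True
        except Exception:
            # Broad on purpose: a broken install can fail with more than ImportError
            status[name] = False
    return status


for name, ok in sorted(check_imports(['numpy', 'h5py', 'tvb']).items()):
    print('%-8s %s' % (name, 'ok' if ok else 'MISSING'))
```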

An extremely minimal test simulation script would look like the following:

# Minimal TVB Test Simulation 
# ===========================
#
# JG 24/11/2015

print '\n\nRunning minimal tvb test simulation...'

# Imports
print '\n\nImporting libraries...'
from tvb.simulator.lab import *

# Set up model
print '\n\nSetting up model...'
model = models.Generic2dOscillator()
conn = connectivity.Connectivity(load_default=True)
cpl = coupling.Linear()
heunint = integrators.HeunDeterministic()
mons = (monitors.Raw(),monitors.TemporalAverage())
sim = simulator.Simulator(model=model,connectivity=conn,
                          coupling=cpl,integrator=heunint,
                          monitors=mons)
sim.configure()

# Run sim
print '\n\nRunning simulation...'
raw,tavg = sim.run(simulation_length=2**6)

print '\n\nTest simulation completed successfully.'
print 'Results can be found in "raw" and "tavg" variables'

Programmatically creating simulation scripts

Let's do this for the parameter space exploration (PSE) of the generic 2D oscillator model described in the GUI tutorials.

# to do...
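
As a placeholder until this section is written up properly, here is a rough sketch of one way it could work: fill a script template once per point on a parameter grid. Everything specific here is an assumption for illustration only: the choice of a and tau as the swept parameters, their values, the file naming, the write_sim_scripts helper, and the way the monitor output is saved (it assumes each monitor returns a (time, data) pair).

```python
import itertools
import os
import tempfile

# Template for one simulation script; the {…} fields are filled per grid point
SCRIPT_TEMPLATE = """# Auto-generated TVB simulation script (sketch)
import numpy
from tvb.simulator.lab import *

model = models.Generic2dOscillator(a=numpy.array([{a}]),
                                   tau=numpy.array([{tau}]))
conn = connectivity.Connectivity(load_default=True)
sim = simulator.Simulator(model=model, connectivity=conn,
                          coupling=coupling.Linear(),
                          integrator=integrators.HeunDeterministic(),
                          monitors=(monitors.TemporalAverage(),))
sim.configure()
(tavg,) = sim.run(simulation_length={sim_len})
# Assumption: the monitor output is a (time, data) pair
numpy.savez('{out_file}', time=tavg[0], data=tavg[1])
"""


def write_sim_scripts(param_grid, out_dir, sim_len=64):
    """Write one script per combination in param_grid; return the script paths."""
    names = sorted(param_grid)
    combos = itertools.product(*[param_grid[n] for n in names])
    script_paths = []
    for i, values in enumerate(combos):
        params = dict(zip(names, values))
        job_name = 'sim_%04d' % i
        script_path = os.path.join(out_dir, job_name + '.py')
        out_file = os.path.join(out_dir, job_name + '.npz')
        with open(script_path, 'w') as f:
            f.write(SCRIPT_TEMPLATE.format(sim_len=sim_len,
                                           out_file=out_file, **params))
        script_paths.append(script_path)
    return script_paths


# Hypothetical 2 x 2 grid, written to a temporary directory for demonstration;
# on the grid this would be a directory visible to all compute nodes
scripts = write_sim_scripts({'a': [-2.0, 0.0], 'tau': [1.0, 2.0]},
                            tempfile.mkdtemp())
print('Wrote %d scripts' % len(scripts))
```

Generating standalone scripts (rather than one script that takes arguments) means each job file is a complete record of what was run, at the cost of some duplication.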

Submitting jobs to the cluster

# to do...
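
Again as a placeholder, a rough sketch of what a job-file generator for SGE might look like. The #$ lines are standard SGE directives (job name, working directory, shell, stdout and stderr files); everything else is an assumption for illustration: the helper names write_job_file and qsub_command, the file naming, and the example script path. The environment setup inside the job file repeats the PATH and PYTHONPATH settings from earlier in these notes.

```python
import os
import tempfile

# Template for one SGE job file wrapping a single simulation script
JOB_TEMPLATE = """#!/bin/bash
#$ -N {job_name}
#$ -cwd
#$ -S /bin/bash
#$ -o {log_dir}/{job_name}.out
#$ -e {log_dir}/{job_name}.err

export PATH=/usr/local/miniconda/bin:$PATH
for d in tvb-library tvb-framework tvb-data tvb-documentation; do
  export PYTHONPATH=/usr/local/tvb_scientific/$d:$PYTHONPATH
done

python {script_path}
"""


def write_job_file(script_path, job_dir, log_dir):
    """Write an SGE job file for script_path; return the job file path."""
    job_name = os.path.splitext(os.path.basename(script_path))[0]
    job_path = os.path.join(job_dir, job_name + '.sh')
    with open(job_path, 'w') as f:
        f.write(JOB_TEMPLATE.format(job_name=job_name, log_dir=log_dir,
                                    script_path=script_path))
    return job_path


def qsub_command(job_path):
    """Return the submission command as an argument list (not executed here)."""
    return ['qsub', job_path]


# Hypothetical script path, written to a temporary directory for demonstration
work_dir = tempfile.mkdtemp()
job_file = write_job_file('/path/to/sim_0000.py', work_dir, work_dir)
print(' '.join(qsub_command(job_file)))
```

On a submit host, the list returned by qsub_command could be passed to subprocess.call to actually submit the job; I have left that out so the sketch can be run anywhere.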

Re-assembling job outputs
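
This section is still to be written. As a first step, before re-assembling anything, it is useful to know which jobs actually produced output. The sketch below does only that, and makes no assumption about the file format beyond "a non-empty file exists"; the helper name partition_outputs and the file names are hypothetical.

```python
import os


def partition_outputs(expected_files):
    """Split expected output files into (finished, missing) lists.

    A file counts as finished if it exists and is non-empty.
    """
    finished = [f for f in expected_files
                if os.path.isfile(f) and os.path.getsize(f) > 0]
    missing = [f for f in expected_files if f not in finished]
    return finished, missing


# Hypothetical output names, one per job
expected = [os.path.join('results', 'sim_%04d.npz' % i) for i in range(4)]
finished, missing = partition_outputs(expected)
print('%d finished, %d missing' % (len(finished), len(missing)))
```

The missing list is the natural input for a resubmission pass.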
