BigJob, a SAGA-based Pilot-Job, is a general-purpose Pilot-Job framework. Pilot-Jobs support the use of container jobs with sophisticated workflow management to coordinate the launch and interaction of actual computational tasks within the container. This decouples workload submission from resource assignment, allowing a flexible execution model that enables the distributed scale-out of applications on multiple and possibly heterogeneous resources. Individual jobs can then be executed inside the container without each one having to wait in the resource's queue.
In order to proceed with this tutorial, you must install BigJob. Installation instructions are listed below.
This section will explain how to set up your environment and install BigJob.
Assuming you don’t want to mess with your system Python installation, you need a place where you can install BigJob locally. A small tool called virtualenv allows you to create a local Python software repository that behaves exactly like the global Python repository, with the only difference that you have write access to it. This is referred to as a ‘virtual environment.’
To create your local Python environment run the following command (you can install virtualenv on most systems via apt-get or yum, etc.):
virtualenv $HOME/.bigjob/python
If you don’t have virtualenv installed and you don’t have root access to your machine, you can use the following script instead:
curl --insecure -s https://raw.github.com/pypa/virtualenv/1.9.X/virtualenv.py | python - $HOME/.bigjob/python
You need to activate your Python environment in order to make it work. Run the command below. It temporarily modifies your PATH so that the python and pip commands resolve to the copies in $HOME/.bigjob/python/bin, which means packages are installed into $HOME/.bigjob/python/lib/python2.7/site-packages/ instead of the system site-packages directory:
source $HOME/.bigjob/python/bin/activate
Activating the virtualenv is very important. If you don’t activate your virtual Python environment, the rest of this tutorial will not work. You can usually tell that your environment is activated properly if your bash command-line prompt starts with (python).
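If the prompt is not a reliable indicator on your system, the interpreter itself can be asked. The following is a minimal sketch, not part of BigJob: the function names in_virtualenv and env_is_active are hypothetical, and the check assumes that classic virtualenv releases set sys.real_prefix while newer tools set sys.base_prefix.

```python
import os
import sys


def in_virtualenv():
    """Return True if the running interpreter belongs to a virtual environment."""
    # Classic virtualenv releases record the system prefix in sys.real_prefix.
    if hasattr(sys, "real_prefix"):
        return True
    # Newer tools leave sys.base_prefix pointing at the system installation.
    return getattr(sys, "base_prefix", sys.prefix) != sys.prefix


def env_is_active(env_dir):
    """Return True if the running interpreter's prefix is exactly env_dir."""
    wanted = os.path.realpath(os.path.expanduser(env_dir))
    return os.path.realpath(sys.prefix) == wanted
```

With the environment from this tutorial activated, env_is_active("~/.bigjob/python") is expected to return True when called from the python command on your PATH.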
The last step in this process is to add your newly created virtualenv to your .bashrc so that any batch jobs that you submit have the same Python environment as you have on your submitting resource. Add the following line at the end of your $HOME/.bashrc file:
source $HOME/.bigjob/python/bin/activate
After your virtual environment is active, you are ready to install BigJob. BigJob is available via PyPI and can be installed using pip as follows:
pip install bigjob
You can change the default installation directory by calling:
pip install --install-option="--prefix=<target-directory>" bigjob
To make sure that your installation works, run the following command to check whether the BigJob module can be imported by the Python interpreter:
python -c "import pilot; print pilot.version"
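The same check can be scripted so that a failed import produces a clear result instead of a traceback. This is a small sketch under stated assumptions: check_module is a hypothetical helper, and the only BigJob-specific names it relies on are the pilot module and its version attribute, both taken from the command above.

```python
import importlib


def check_module(module_name, attribute="version"):
    """Return (importable, value) for a module and one of its attributes.

    importable is False if the import fails; value is None if the module
    cannot be imported or does not define the attribute.
    """
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return (False, None)
    return (True, getattr(module, attribute, None))
```

For example, check_module("pilot") should return (False, None) when BigJob is not installed in the active environment, and (True, <version>) when it is.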
There are two requirements for proper BigJob execution:
BigJob needs a working directory in which to store all of its output, run information, and any errors that may occur. This directory can be named anything you choose, but for any examples in this manual, we will call the directory ‘agent’ (default). You should create this directory in the same location you run your scripts from, i.e. usually $SCRATCH or $WORK. You can create this directory by typing:
mkdir agent
If you are planning to submit from one resource to another, you must have password-less SSH login enabled from the submitting resource to the target resource. This is achieved by appending your public key from the submitting resource to the authorized_keys file on the target machine.
Examples of when you would need password-less login:
Prerequisites
Key Generation and Installation
First, you have to generate a key. You do this as follows:
Example:
ssh-keygen -t rsa -C johndoe@email.edu
Generating public/private rsa key pair.
Enter file in which to save the key (/home/johndoe/.ssh/id_rsa):
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /home/johndoe/.ssh/id_rsa.
Your public key has been saved in /home/johndoe/.ssh/id_rsa.pub.
The key fingerprint is:
34:87:67:ea:c2:49:ee:c2:81:d2:10:84:b1:3e:05:59 johndoe@email.edu
You can find your key in the location you chose when generating it. Because we used the default .ssh directory, it will be located there:
cd /home/username/.ssh
ls
Verify that you have created the files id_rsa and id_rsa.pub.
Use a text editor to open the id_rsa.pub file. Copy the entire contents of this file.
The contents of this file need to be appended to the target machine’s .ssh/authorized_keys file. If the authorized_keys file is not accessible, then just create a .ssh/authorized_keys2 file and paste the key.
Now the target machine is ready to accept your ssh key.
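Appending the key by hand is easy to get wrong (duplicate entries, a missing trailing newline, overly open permissions). The sketch below shows one way to do it on the target machine. It is an illustration only: append_public_key is a hypothetical helper, and the 0700/0600 permissions are an assumption based on the usual OpenSSH requirement that these files not be writable by other users.

```python
import os


def append_public_key(public_key, authorized_keys_path):
    """Append public_key to the file unless it is already listed.

    Returns True if the key was added, False if it was already present.
    """
    key = public_key.strip()
    ssh_dir = os.path.dirname(authorized_keys_path)
    if ssh_dir and not os.path.isdir(ssh_dir):
        os.makedirs(ssh_dir)
        os.chmod(ssh_dir, 0o700)  # assumption: directory private to the user

    content = ""
    if os.path.exists(authorized_keys_path):
        with open(authorized_keys_path) as handle:
            content = handle.read()
    if key in [line.strip() for line in content.splitlines()]:
        return False

    with open(authorized_keys_path, "a") as handle:
        # Keep the new key on its own line even if the file lacks a final newline.
        if content and not content.endswith("\n"):
            handle.write("\n")
        handle.write(key + "\n")
    os.chmod(authorized_keys_path, 0o600)  # assumption: file private to the user
    return True
```

Called with the contents of id_rsa.pub and the path to .ssh/authorized_keys in your home directory on the target machine, the function can be run repeatedly without creating duplicate entries.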
The ssh-add command tells the machine which keys to use. For a test, type:
ssh-agent sh -c 'ssh-add < /dev/null && bash'
This will start the ssh-agent, add your default identity (prompting you for your passphrase), and spawn a bash shell.
From this new shell, you should be able to run ssh target_machine and be logged in without typing a password or passphrase.
Test whether you have a password-less login to the target machine by executing the simple command:
ssh <hostname> /bin/date
This command should execute without password input.
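When this check has to run unattended (for example, before submitting from a script), a password prompt would simply hang. A hedged sketch of a non-interactive variant follows. The helper names are hypothetical; BatchMode and ConnectTimeout are standard OpenSSH client options, and BatchMode=yes makes ssh fail instead of prompting. Note that a passphrase-protected key that is not loaded into an ssh-agent will also fail this check.

```python
import os
import subprocess


def build_ssh_check_command(hostname, connect_timeout=10):
    """Build the ssh command line that runs /bin/date without any prompting."""
    return [
        "ssh",
        "-o", "BatchMode=yes",  # never ask for a password or passphrase
        "-o", "ConnectTimeout=%d" % connect_timeout,
        hostname,
        "/bin/date",
    ]


def passwordless_login_works(hostname, connect_timeout=10):
    """Return True if the remote command succeeds without user interaction."""
    command = build_ssh_check_command(hostname, connect_timeout)
    with open(os.devnull, "w") as devnull:
        try:
            return subprocess.call(command, stdout=devnull, stderr=devnull) == 0
        except OSError:
            # The ssh client itself is not installed or not on the PATH.
            return False
```

passwordless_login_works("<hostname>") mirrors the manual test above: it returns True only if /bin/date ran on the target machine without any prompt.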
BigJob uses a Redis server for coordination and task management. Redis is the most stable and fastest backend (requires Python >2.5) and the recommended way of using BigJob. BigJob will not work without a coordination backend.
Redis can easily be run in user space. For additional information about Redis, please visit the Redis website (http://www.redis.io). To proceed with the tutorial, please take the following steps:
wget http://download.redis.io/redis-stable.tar.gz
tar xvzf redis-stable.tar.gz
cd redis-stable
make
Once you have downloaded and installed it, start a Redis server on the machine of your choice as follows:
$ cd redis-stable
$ ./src/redis-server
[489] 13 Sep 10:11:28 # Warning: no config file specified, using the default config. In order to specify a config file use 'redis-server /path/to/redis.conf'
[489] 13 Sep 10:11:28 * Server started, Redis version 2.2.12
[489] 13 Sep 10:11:28 * The server is now ready to accept connections on port 6379
[489] 13 Sep 10:11:28 - 0 clients connected (0 slaves), 922160 bytes in use
You can install Redis on a persistent server and use this server as your dedicated coordination server.
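Before pointing BigJob at a coordination server, it can be useful to confirm that the server is reachable from the submitting machine. The sketch below uses only the Python standard library and is not part of BigJob: redis_is_up is a hypothetical helper that sends the Redis PING command and expects the +PONG reply, assuming the default port 6379 shown in the log output above and a server that does not require authentication.

```python
import socket


def redis_is_up(host, port=6379, timeout=2.0):
    """Return True if a server at host:port answers PING with +PONG."""
    try:
        sock = socket.create_connection((host, port), timeout)
    except socket.error:
        # Connection refused, host unreachable, or timed out.
        return False
    try:
        sock.sendall(b"PING\r\n")
        reply = sock.recv(16)
    except socket.error:
        return False
    finally:
        sock.close()
    return reply.startswith(b"+PONG")
```

For a server started as shown above on the local machine, redis_is_up("localhost") should return True; a False result from a remote machine usually points to a firewall or to a server that is bound to a different interface.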