How to install & run RAxML on a typical cluster by Pavlos Pavlidis
Here are some instructions for installing and executing RAxML v7.2.8
on a typical cluster system. We illustrate them using the LRZ
Linux cluster at the Leibniz Rechenzentrum in Munich, which serves
only as one example of such a system.
Step 1: Get & Compile RAxML
- Download the latest RAxML version from here
to your local computer (laptop/workstation). You will be asked to save
the file standard-RAxML-master.zip on your hard drive. Let's assume
that you have saved it in your home directory.
- Wait until the file has been downloaded. This should take a couple of seconds.
- Open a terminal (it's called Konsole in KDE systems such as Kubuntu).
- Go to the directory where you have saved standard-RAxML-master.zip
(e.g., your home directory). If you have saved it in the home directory,
just type cd and then press Enter in the terminal.
- Copy the file standard-RAxML-master.zip to your cluster account (e.g., the home
directory of your LRZ account). Let's assume that your user
account name (login name) is aa111aa. Just type
in your terminal: scp standard-RAxML-master.zip aa111aa@lx64ia2.lrz-muenchen.de:
(note the : at the end of the command).
Once you have typed this and pressed Enter, you will be asked for your
password for the cluster system.
- Now log in to your account at LRZ (or any other cluster, for that matter): open a terminal and,
assuming again that your user account name is aa111aa, type:
ssh aa111aa@lx64ia2.lrz-muenchen.de
This will take you to your home
directory on the LRZ cluster, i.e., you now have a remote
terminal open on the cluster system and can issue commands to this remote
computer.
- First, unpack/uncompress standard-RAxML-master.zip. To do this, type:
unzip standard-RAxML-master.zip
- After this command, a folder named standard-RAxML-master will have been created.
- Change directory to the standard-RAxML-master directory by typing:
cd standard-RAxML-master
- In this directory you will find the files
that make up the source code, that is, a high-level,
abstract description of the computations that RAxML performs.
This source code first needs to be compiled (translated) into
low-level code that the processor can execute directly. To
start with, make sure that there are no .o files (object files) in
your RAxML directory by typing ls *.o
If there are any .o files, delete them by typing rm *.o
- This step is optional. Using your favorite editor (e.g., emacs, vi, or
nano), open the file called Makefile.SSE3.gcc and replace "gcc" by
"icc" in the first line. RAxML will then be compiled with the Intel icc
compiler (translator) instead of the GNU gcc compiler, which on such
systems usually produces slower (sometimes substantially slower) code. Save
the file and exit the editor.
- Now you can actually compile (translate) the source code into a machine-readable, executable program by typing: make -f Makefile.SSE3.gcc
This process will generate an executable called raxmlHPC-SSE3
- Copy the executable raxmlHPC-SSE3 into the bin directory located
in your home directory (please make sure that this bin directory is in
your PATH). On Linux systems, bin directories conventionally hold executable programs.
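The build steps above, run on the cluster inside the unpacked source directory, can be sketched as a few shell functions. This is only a sketch: the function names use_icc, path_contains and build_and_install are hypothetical helpers and not part of RAxML, and the makefile name is passed in as an argument rather than assumed.

```shell
# Hypothetical helpers for Step 1 (not part of RAxML).

# Optional step: replace "gcc" by "icc" on the first line of the makefile $1.
use_icc() {
  sed '1s/gcc/icc/' "$1" > "$1.tmp" && mv "$1.tmp" "$1"
}

# Succeed if directory $1 is listed in PATH.
path_contains() {
  case ":$PATH:" in
    *":$1:"*) return 0 ;;
    *) return 1 ;;
  esac
}

# Remove stale object files, compile with makefile $1,
# and copy the resulting executable to $HOME/bin.
build_and_install() {
  rm -f -- *.o                      # silent if there are no object files
  make -f "$1" || return 1
  mkdir -p "$HOME/bin"
  cp raxmlHPC-SSE3 "$HOME/bin/" || return 1
  path_contains "$HOME/bin" || echo "warning: $HOME/bin is not in PATH" >&2
}
```

To use it, call use_icc (only if you want the Intel compiler) and then build_and_install, each with the makefile name from the steps above. Using rm -f avoids the separate ls check, since it does not complain when no .o files exist.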
Step 2: Create queue scripts to submit RAxML jobs to the queuing system and run RAxML
- Let's assume that you will perform the analysis in a directory called
RAxMLanal, to which your data will be copied. First, create
this folder/directory by typing mkdir RAxMLanal
- Then change to the directory by typing cd RAxMLanal
- Now, assume that your alignment is called myalign.phy and is located
inside RAxMLanal (you can copy it from your desktop to the
cluster system by using the scp command again, e.g.: scp myalign.phy aa111aa@lx64ia2.lrz-muenchen.de:~aa111aa/RAxMLanal/ ).
- Thereafter, you will need to create the following script (as a plain text file, using
a text editor such as nano or vi) to compute an ML tree with RAxML, or
rather to tell the job queuing system of the cluster that you want to
compute something. Some commands are
installation-/cluster-specific, but most such scripts look alike; in case of
doubt, ask your local geeks. This is what such a script looks like for the LRZ
system:
------------------------------------------------
#!/bin/bash
#$-M YOUR EMAIL
#$-S /bin/bash
#$-N YOUR JOB NAME
#$-o $HOME/RAxMLanal/RAxMLanal.out
#$-l mf=10000M
#$-l march=x86_64
cd $HOME/RAxMLanal
. /etc/profile
$HOME/bin/raxmlHPC-SSE3 -N 10 -m GTRCAT -s myalign.phy -n myalign.phy
-----------------------------------------------
Please save the commands/text between the two ------ lines in a script called submit.sh
NOTE 1: You will have to adapt the lines containing YOUR in the
above script to your personal settings/account name.
NOTE 2: You will have to adapt the command-line parameters after
raxmlHPC-SSE3 to your needs. The command above will just compute 10 ML
trees under the CAT approximation of rate heterogeneity, each starting from
a randomized stepwise addition order parsimony tree.
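To make the adaptation mentioned in NOTE 2 easier, the RAxML call can be wrapped in a small function that keeps the adjustable settings in one place. This is a sketch under assumptions: run_raxml and the DRY_RUN variable are hypothetical names introduced here, and only the options already used in the script above (-N, -m, -s, -n) appear.

```shell
# Hypothetical helper for submit.sh (not part of RAxML).
# Arguments: alignment file, number of ML searches (default 10),
# model of rate heterogeneity (default GTRCAT).
run_raxml() {
  alignment=$1
  runs=${2:-10}
  model=${3:-GTRCAT}
  raxml="$HOME/bin/raxmlHPC-SSE3"

  if [ -n "${DRY_RUN:-}" ]; then
    # Print the command instead of running it, to check it before submitting.
    echo "$raxml" -N "$runs" -m "$model" -s "$alignment" -n "$alignment"
  else
    "$raxml" -N "$runs" -m "$model" -s "$alignment" -n "$alignment"
  fi
}
```

With DRY_RUN set to any non-empty value, run_raxml myalign.phy only prints the command line, which is a cheap way to check the parameters before the job spends time in the queue. As in the script above, the alignment name is reused as the run name given to -n.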
- Finally, submit the job specified in the job script submit.sh to the cluster system by typing qsub submit.sh
- Note that, depending on the cluster load (if other people are doing useless
computations ;-) ), your job may not be executed immediately; you can
monitor its status with, e.g., the qstat command
- You will be notified by email once your job has completed
- For
questions/suggestions regarding the usage of RAxML on the LRZ cluster
please contact Pavlos dot Pavlidis at h hyphen its dot org
- For general suggestions send an email to raxml at h hyphen its dot org
- Many Thanks to Isabella Stoeger who contacted us about this problem and also provided the above example job script.
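The submission and monitoring steps above can also be scripted. The sketch below assumes a Grid Engine-style queuing system, which the #$ lines in the job script suggest. The function names are hypothetical, and the options qsub -terse (print only the job id) and qstat -j (query a single job) are assumptions you should check against your cluster's documentation.

```shell
# Hypothetical helpers, assuming Grid Engine-style qsub and qstat.

# Submit the job script $1 and print only the job id
# (assumes that qsub supports the -terse option).
submit_job() {
  qsub -terse "$1"
}

# Poll every $2 seconds (default 60) until the job $1 is no longer known
# to the scheduler (assumes that qstat -j fails for unknown jobs).
wait_for_job() {
  while qstat -j "$1" >/dev/null 2>&1; do
    sleep "${2:-60}"
  done
}
```

A typical use would be job_id=$(submit_job submit.sh) followed by wait_for_job "$job_id". Polling once a minute is usually enough and keeps the load on the scheduler low.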