The Exelixis Lab

Enabling Research in Evolutionary Biology

How to install & run RAxML on a typical cluster by Pavlos Pavlidis

Here are some instructions for installing and executing RAxML v7.2.8 on a typical cluster system. We will outline this using the LRZ Linux cluster at the Leibniz Rechenzentrum in Munich as an example. The steps should be similar on other typical cluster systems.

Step 1: Get & Compile RAxML

  • Download the latest RAxML version from here to your local computer (laptop/workstation). You will be asked to save the file on your hard drive. Let's assume that  you have saved it in your home directory.
  • Wait until the file has been downloaded. This should take a couple of seconds.
  • Open a terminal (it's called Konsole in KDE systems such as Kubuntu). 
  • Go to the directory where you have saved the file (e.g., your home directory). If you have saved it in your home directory, just type cd and then press Enter in the terminal.
  • Copy the file to your cluster account (e.g., the home directory of your LRZ account). Let's assume that your user account name (login name) is aa111aa. Use the scp command for this, giving the file to copy and then your account on the cluster as the target (note the : at the end of the command). Once you have typed this and pressed Enter, you will be asked for your password to be allowed to log in to the cluster system.
  • Now, to log in to your account at LRZ (or any other cluster for that matter), open a terminal and use the ssh command (assuming again that your user account name is aa111aa). This will take you to your home directory on the LRZ cluster, i.e., you now have opened a remote terminal on the cluster system and can issue commands to this remote computer.
  • First you will need to unpack/uncompress the file you copied. To do this, use the unzip command followed by the name of the file.
  • After this, a folder named standard-RAxML-master will have been created.
  • Change directory to the standard-RAxML-master directory by typing: cd standard-RAxML-master
  • In the standard-RAxML-master directory you will find the files that make up the source code, that is, a high-level, abstract description of the computations that RAxML performs. This source code first needs to be compiled (translated) into low-level code that can be executed directly by the processor. To start with, make sure that there are no .o files (object files) in your RAxML directory by typing ls *.o . If there are any .o files, just delete them by typing rm *.o
  • This step is optional. Using your favorite editor (e.g., emacs, vi, or nano), open the file called Makefile.SSE3.gcc and replace "gcc" in the first line by "icc". This will cause RAxML to be compiled using the Intel icc compiler (translator) instead of the GNU gcc compiler, which on such systems usually produces slower (sometimes substantially slower) code. Save the file and exit the editor.
  • Now you can actually compile (translate) the open source code into a machine-readable and executable program by typing: make -f Makefile.SSE3.gcc This process will generate an executable called raxmlHPC-SSE3
  • Copy the executable raxmlHPC-SSE3 into the bin directory located in your home directory. (Please make sure that this bin directory is in the PATH.) By convention, bin directories hold executable programs, and the shell searches the directories listed in PATH when you type a command.
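The following is a minimal shell sketch of the helper steps in Step 1, not a definitive recipe. The host name cluster.example.org and the archive name standard-RAxML-master.zip are placeholders assumed for illustration (the archive name is only inferred from the folder name mentioned above), and the function names are hypothetical. The sketch only prints the scp command instead of running it.

```shell
#!/bin/bash
# Sketch of Step 1 helpers. Host, account, and archive names used below
# are placeholders (assumptions), not values from the LRZ documentation.

# Print the scp command that copies an archive to the remote home directory.
# The trailing ':' tells scp that the target is a remote host.
scp_command() {
    local archive="$1" user="$2" host="$3"
    printf 'scp %s %s@%s:\n' "$archive" "$user" "$host"
}

# Succeed if directory $1 is listed in PATH.
dir_in_path() {
    case ":$PATH:" in
        *":$1:"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Remove stale object files from directory $1; does not fail if there are none.
clean_objects() {
    rm -f -- "$1"/*.o
}

scp_command standard-RAxML-master.zip aa111aa cluster.example.org

if dir_in_path "$HOME/bin"; then
    echo "$HOME/bin is in PATH"
else
    echo "$HOME/bin is NOT in PATH; add it, e.g. in your shell startup file"
fi
```

On the cluster, clean_objects . could be run inside the source directory before the make step, and the PATH check shows whether the copied executable can be started by name.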

Step 2: Create queue scripts to submit RAxML jobs to the queuing system and run RAxML

  • Let's assume that you will perform the analysis in a directory called RAxMLanal where your data may be located/copied to. Initially, you will need to create this folder/directory by typing mkdir RAxMLanal
  • Then change to this directory by typing cd RAxMLanal
  • Now, assume that your alignment is called myalign.phy and is located inside the RAxMLanal directory (you can copy it from your desktop to the cluster system by using the scp command again, this time with myalign.phy as the file to copy).
  • Thereafter, you will need to create the following script (as a plain text file, using a text editor such as nano or vi) to compute an ML tree with RAxML, or rather to tell the job queuing system of the cluster that you want to compute something. Some commands are installation-/cluster-specific, but most scripts look alike. In case of doubt, ask your local geeks. This is what such a script for the LRZ system would look like:
    #$-S /bin/bash
    #$-o $HOME/RAxMLanal/RAxMLanal.out
    #$-l mf=10000M
    #$-l march=x86_64
    cd $HOME/RAxMLanal
    . /etc/profile
    $HOME/bin/raxmlHPC-SSE3 -N 10 -m GTRCAT -s myalign.phy -n myalign.phy pause

    Please save the commands/text within the lines --------------------------- in a script called
    NOTE 1: You will evidently have to adapt the lines containing YOUR in the above script according to your personal settings/account name
    NOTE 2: You will have to adapt the specific command line parameters after raxmlHPC-SSE3 to your needs. The command above will just compute 10 ML trees using the CAT approximation of rate heterogeneity, each starting from a randomized stepwise addition order parsimony tree.
  • Now, finally, you can submit the job specified in the job script to the cluster system by typing qsub followed by the name of the job script.
  • Note that, depending on the cluster load (if other people are doing useless computations ;-) ), your job may not be executed immediately. You can monitor its status by using, e.g., the qstat command.
  • You will be notified by email once your job has completed
  • For questions/suggestions regarding the usage of RAxML on the LRZ cluster please contact Pavlos dot Pavlidis at h hyphen its dot org 
  • For general suggestions send an email to raxml at h hyphen its dot org
  • Many Thanks to Isabella Stoeger who contacted us about this problem and also provided the above example job script.
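
The job script from Step 2 can also be generated by a small shell function, sketched below under stated assumptions. The function name print_job_script and the file name raxml_job.sh are hypothetical, and the #$ lines are simply copied from the LRZ example above, so other clusters may need different ones.

```shell
#!/bin/bash
# Sketch only: prints a job script modeled on the LRZ example above.
# $1 is the analysis directory under $HOME, $2 the command line to run.
print_job_script() {
    local workdir="$1" command_line="$2"
    # \$ keeps a literal dollar sign in the generated script.
    cat <<EOF
#\$-S /bin/bash
#\$-o \$HOME/${workdir}/${workdir}.out
#\$-l mf=10000M
#\$-l march=x86_64
cd \$HOME/${workdir}
. /etc/profile
${command_line}
EOF
}

# Show the script for the example analysis. The single quotes keep $HOME
# unexpanded so that it is resolved on the cluster when the job runs.
print_job_script RAxMLanal \
    '$HOME/bin/raxmlHPC-SSE3 -N 10 -m GTRCAT -s myalign.phy -n myalign.phy'

# To use it, redirect the output into a file (hypothetical name) and submit:
#   print_job_script RAxMLanal '...' > raxml_job.sh
#   qsub raxml_job.sh
#   qstat
```

Generating the script this way keeps the directory name consistent between the output file line and the cd line, which are easy to get out of sync when editing by hand.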