Dr. Alexandros Stamatakis

Postdoctoral Researcher

SOFTWARE

RAxML at the SC08 CLUSTER CHALLENGE: 

RAxML is one of the test applications at the IEEE/ACM 2008 Supercomputing conference cluster challenge (see here for more information on the SC cluster challenge). Following various requests by students of participating universities, we have put together some information here:

Michael Ott has kindly made available some of the test datasets we used to assess performance in our SC07 paper. They are available here.

The publications frame of my homepage provides some useful background reading. You might have a look at the PDF of my PhD thesis, which will hopefully give a general introduction to the topic for non-specialists. Another interesting paper to read in this context might be:

Alexandros Stamatakis, Michael Ott, and Thomas Ludwig: "RAxML-OMP: An Efficient Program for Phylogenetic Inference on SMPs". In Proceedings of the 8th International Conference on Parallel Computing Technologies (PaCT2005), Volume 3606 of Lecture Notes in Computer Science, 288-302, Springer Verlag, September 2005.

Finally, there is also a more recent paper on exploiting the loop-level parallelism of the phylogenetic Maximum Likelihood function with Pthreads, OpenMP, and MPI in RAxML on various multi-core, cluster, and shared-memory supercomputers; the preprint is available here. Note that the MPI-based parallelization of the likelihood function described in this paper and in the SC07 paper is not yet available in the official release below, but we are working on it and it should become available in 1-2 months. It is also a good idea to look at the PDF of the RAxML manual below.

UPDATED easyRax VERSION:

Guy Leonard at the Centre for Eukaryotic Evolutionary Microbiology of the University of Exeter has put together an updated version of easyRax (updated on July 14, 2008, thanks Guy :-) ), a wrapper script for RAxML which aims to facilitate the usage of the program for non command-line gurus. The script is available here or via Guy’s homepage.

UPDATED RAxML VERSION:

RAxML 7.0.4 (source code) and a comprehensive Manual (v7.0.4)

Please also have a look at the updated RAxML web-server information below. The server at SDSC is already running RAxML 7.0.3, while the Vital-IT server still uses an intermediate version. Both servers will be updated to version 7.0.4 soon.

New Features (version 7.0.4):

· Possibility to run the rapid BS algorithm with constraint trees (-r and -g options)

· Added taxon-name error checking

· Increased allowed taxon name length to 256 characters

· Added option to compute pair-wise ML-distances between taxa

 

Pre-compiled executables below are still for version 7.0.3! To be updated soon.

Windows executable: the previous problems with the Windows version should now hopefully be resolved. Graham Jones has provided a nice PDF on How to run RAxML under XP and Vista.

Mac executable (iMAC)

Mac executable (iMAC Pthreads-version)

Mac executable (PowerMac G5)

Mac executable (PowerMac G5 Pthreads-version)

 

SCRIPTS:

Here is a little Perl script that will automatically determine the best-scoring AA substitution model on a fixed starting tree. Note that raxmlHPC must be in your $PATH for this to work.

· For unpartitioned datasets, the outfile will contain the best-scoring AA model to use with RAxML. Execute the script like this: perl ProteinModelSelection.pl alignmentFile.phylip > outfile

· For partitioned datasets, the outfile will contain the best-scoring AA model for every partition. Execute the script like this: perl ProteinModelSelection.pl alignmentFile.phylip partitionData.txt > outfile
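The selection step itself is simple and can be sketched in a few lines of Python. This is not the Perl script: it assumes that "best-scoring" means the highest log-likelihood on the fixed tree, and it leaves the actual RAxML invocation to a caller-supplied function. The names best_model, best_model_per_partition and the short model list are hypothetical and chosen only for illustration.

```python
from typing import Callable, Dict, Iterable, Tuple

# Illustrative subset of AA substitution models; the real script may test others.
CANDIDATE_MODELS = ["DAYHOFF", "JTT", "WAG"]


def best_model(models: Iterable[str],
               score: Callable[[str], float]) -> Tuple[str, float]:
    """Return the (model, log-likelihood) pair with the highest log-likelihood.

    `score` is a hypothetical callback that evaluates one model on the fixed
    starting tree, e.g. by running raxmlHPC and parsing its output.
    """
    scores = {model: score(model) for model in models}
    if not scores:
        raise ValueError("no candidate models given")
    winner = max(scores, key=scores.get)
    return winner, scores[winner]


def best_model_per_partition(
        partitions: Iterable[str],
        models: Iterable[str],
        score: Callable[[str, str], float]) -> Dict[str, Tuple[str, float]]:
    """Pick the best model independently for every partition name."""
    models = list(models)
    return {
        part: best_model(models, lambda m, p=part: score(p, m))
        for part in partitions
    }
```

Log-likelihoods are negative, so the model whose value is closest to zero wins. In the partitioned case each partition is scored on its own, which matches the outfile containing one model per partition.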

Here is a little Perl script that will convert FASTA files to RAxML's relaxed PHYLIP format, i.e., it does not truncate sequence names. Call it with, e.g., perl fasta2relaxedPhylip.pl alignment.fasta and it will generate a file alignment.fasta.phylip that can be read by RAxML.
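For readers who want to see what such a conversion involves, here is a minimal Python sketch. It is not the Perl script above. It assumes that the relaxed PHYLIP layout is a header line with the number of taxa and the alignment length, followed by one line per taxon holding the full name, a space, and the sequence. It also assumes that the taxon name is the first whitespace-delimited token of the FASTA header. The function names parse_fasta and to_relaxed_phylip are hypothetical.

```python
from typing import List, Tuple


def parse_fasta(text: str) -> List[Tuple[str, str]]:
    """Parse FASTA text into (name, sequence) pairs, joining wrapped lines."""
    records: List[Tuple[str, str]] = []
    name = None
    chunks: List[str] = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                records.append((name, "".join(chunks)))
            tokens = line[1:].split()
            if not tokens:
                raise ValueError("FASTA header without a sequence name")
            # Assumption: keep only the first token so the name has no spaces.
            name = tokens[0]
            chunks = []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        records.append((name, "".join(chunks)))
    return records


def to_relaxed_phylip(records: List[Tuple[str, str]]) -> str:
    """Format aligned records as sequential PHYLIP without truncating names."""
    if not records:
        raise ValueError("no sequences found")
    length = len(records[0][1])
    for name, seq in records:
        if len(seq) != length:
            raise ValueError(f"sequence {name!r} has length {len(seq)}, "
                             f"expected {length}")
    lines = [f"{len(records)} {length}"]
    lines.extend(f"{name} {seq}" for name, seq in records)
    return "\n".join(lines) + "\n"
```

Checking that all sequences have the same length catches unaligned input before the file is written.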

 

IMPROVED SERVICE: RAxML Web-Servers

Two RAxML web-servers that use a novel rapid bootstrapping algorithm are now freely available at the Vital-IT Unit of the Swiss Institute of Bioinformatics and the CIPRES cluster at the San Diego Supercomputer Center. Note that the email notification service now works and that you can submit alignments of (nearly) arbitrary size, i.e., the run-time restriction to 24 hours has been lifted.

In addition, the Web-Server at Vital-IT now features a text window where you can paste in your alignment. This is intended for use in combination with the myHits tool at Vital-IT. myHits is a free database devoted to protein domains. It is also a collection of tools for the investigation of the relationships between protein sequences and motifs described on them. These motifs are defined by a heterogeneous collection of predictors, which currently includes regular expressions, generalized profiles and hidden Markov models.

NEW SOFTWARE: AxParafit & AxPcoords

Together with Markus Göker, Alexander Auch and Jan Meier-Kolthoff at the University of Tübingen I have developed a significantly accelerated and parallelized version of Pierre Legendre’s Parafit program for tests of host-parasite coevolution. The sequential version is up to 67 times faster than Parafit and in combination with the parallel version we were able to conduct the largest co-evolutionary study to date within 8 minutes as opposed to about 4 weeks with Parafit. We provide the program as open source code and pre-compiled binaries.

RAxML COMMUNITY CONTRIBUTIONS

· Guy Leonard at the Centre for Eukaryotic Evolutionary Microbiology of the University of Exeter has put together easyRax, a wrapper script for RAxML which aims to facilitate the usage of the program for non command-line gurus. The script is available here or via Guy's homepage.

· RAxML and AxParafit are currently being integrated into the Debian-Med distribution (Quote: “The goal of Debian-Med is a complete system for all tasks in medical care which is built completely on free software”).

· Guide to install RAxML on MACs by James Munro (munroj01_at_student_ucr_edu). James Munro at the Department of Entomology of the University of California at Riverside has written this very helpful guide for installing RAxML on MACs. It has been written by a Biologist for non-computer experts.

· batchRAxML.pl by Olaf Bininda-Emonds (olaf_bininda_at_uni-jena_de). This nice script by my good colleague from Munich times, Olaf Bininda-Emonds, provides a wrapper around RAxML to easily analyze a set of data files according to a common set of search criteria. It also organizes the RAxML output into a set of subdirectories.

· PYRAXML2 by Frank Kauff (fkauff_at_rhrk_uni-kl_de). Frank Kauff at the University of Kaiserslautern (formerly at Duke University) has written this cool script that reads NEXUS-style data files and prepares the necessary input files and command-line options for RAxML-VI-HPC. You can download the BETA-version here: PYRAXML2. It requires Python and Biopython to be installed on your computer.

OBSOLETE VERSIONS & SOFTWARE

· RAxML-VI-HPC (version 2.2.3) and a comprehensive Manual (v2.2.3) 

· Windows Executable for RAxML-VI-HPC (version 2.2.3). I am extremely grateful to Graham Jones, a free-lance Computer Scientist in the U.K., who ported and compiled this Windows version of RAxML.

· MAC Executable for RAxML-VI-HPC (version 2.2.3). Dave Carmean (carmean_at_sfu_ca) at Simon Fraser University has kindly put together this RAxML executable for MACs. He has also set up a web-page Installing and running RAxML on a Mac in less than a minute...

· RAxML-VI-HPC (version 2.0.2) and a comprehensive Manual (v2.0)

· RAxML-VI-HPC (version 1.0) and a comprehensive Manual (v1.0)

· RAxML-VI: Sequential program with significantly accelerated hill-climbing search algorithm for huge alignment data

· PERL script for non-parametric bootstrapping with RAxML-VI. Note that, depending on your installation, you might have to replace "./raxml" by "raxml" in this script.

· RAxML-VI DOS executables for Windows. These executables have kindly been provided by Jarno Tuimala (jtuimala_at_csc.fi, CSC Finland), whom I would like to thank for his help and valuable comments.

· RAxML-III:  Sequential program, includes more models of nucleotide substitution than RAxML-II.

· RAxML-II: Sequential, Parallel, and Distributed implementation of RAxML with less model functionality

· The old Alignment Benchmark set: includes some large real-world alignments and best-known trees for those alignments

· Phylogenetic Visualization tool using treemaps & taxonomic information. The screenshot below was taken from the visualization of a phylogenetic tree containing 2415 mammalian sequences.