MrBayes on XSEDE (now ACCESS)

MrBayes is a program for the Bayesian estimation of phylogeny. Bayesian inference of phylogeny is based upon a quantity called the posterior probability distribution of trees, which is the probability of a tree conditioned on the observations. The conditioning is accomplished using Bayes' theorem. The posterior probability distribution of trees is impossible to calculate analytically; instead, MrBayes uses a simulation technique called Markov chain Monte Carlo (or MCMC) to approximate the posterior probabilities of trees.

MrBayes 3.2.7 provides new features. It supports commands for BEST (Bayesian Estimation of Species Trees). It also supports checkpointing, which makes it possible to restart a run that has terminated unexpectedly, or that has reached the maximum allowed run time without converging. Although the MrBayes 3.2.7 code supports the use of GPUs via BEAGLE, we do not currently support that option. Nevertheless, the new code at CIPRES offers significant speedups.

The MrBayes 3.2.7 tool provides an interface for configuring and submitting jobs to Expanse, a large NSF resource. The interface also accepts jobs that are configured via a MrBayes block in the Nexus input file. We recommend configuring jobs in the Nexus file, because this is usually much simpler to manage.
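
As a rough illustration of configuring a job through the Nexus file, the Python sketch below assembles a minimal MrBayes block as text. The helper name mrbayes_block and the parameter values are hypothetical. The commands set, mcmcp, and mcmc and the options nruns, nchains, ngen, samplefreq, and checkfreq belong to MrBayes, but a real analysis would also need model settings (for example lset and prset) suited to the data.

```python
def mrbayes_block(ngen, nruns=2, nchains=4, samplefreq=500, checkfreq=2000):
    """Return the text of a minimal MrBayes block (hypothetical helper).

    The block only sets MCMC parameters and starts the run; model
    settings are deliberately left out of this sketch.
    """
    lines = [
        "begin mrbayes;",
        # Run non-interactively: no prompts at the end of the run.
        "  set autoclose=yes nowarn=yes;",
        # mcmcp sets the parameters; mcmc starts the analysis.
        f"  mcmcp ngen={ngen} nruns={nruns} nchains={nchains} "
        f"samplefreq={samplefreq} checkfreq={checkfreq};",
        "  mcmc;",
        "end;",
    ]
    return "\n".join(lines)


print(mrbayes_block(ngen=1_000_000))
```

The printed text would be appended after the data matrix in the Nexus input file. Keeping all run parameters in one block makes it easy to see, and later edit, exactly what was submitted.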

Setting nruns= and nchains= for MrBayes

The reason for doing two "runs" is to see whether they obtain similar consensus trees and parameters. Making at least two runs is always recommended. The interface permits nruns=2 and nruns=4. Using an odd number of runs is inefficient and slows your job down.

The MrBayes manual says: "By default, nchains is set to 4, meaning that MrBayes will use 3 heated chains and one "cold" chain. In our experience, heating is essential for problems with more than about 50 taxa, whereas smaller problems often can be analyzed successfully without heating. Adding more than three heated chains may be helpful in analyzing large and difficult data sets."

Based on this information, we recommend that users set nruns=2 and nchains=4. Further, the interface currently requires that nruns*nchains be at most 16 and a multiple of 2.
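
The limits above can be checked before submitting a job. The following Python sketch is only an illustration of those rules; the function name check_run_settings and its return format are hypothetical and are not part of MrBayes or of the CIPRES interface.

```python
def check_run_settings(nruns, nchains, max_product=16, allowed_nruns=(2, 4)):
    """Return a list of problems with nruns/nchains (empty list if none).

    The limits are the ones stated in the text: nruns must be 2 or 4,
    and nruns*nchains must be <= 16 and a multiple of 2.
    """
    problems = []
    if nruns not in allowed_nruns:
        problems.append(f"nruns={nruns} is not one of {allowed_nruns}")
    product = nruns * nchains
    if product > max_product:
        problems.append(f"nruns*nchains={product} exceeds {max_product}")
    if product % 2 != 0:
        problems.append(f"nruns*nchains={product} is not a multiple of 2")
    return problems


print(check_run_settings(2, 4))   # recommended settings
print(check_run_settings(4, 8))   # too many chains in total
```

An empty list means the settings satisfy all three rules; otherwise each entry names the rule that was broken.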

IF YOU ARE MAKING A BIG RUN (large number of generations): Suppose you want to run 100,000,000 generations and your data set is large. By default, samplefreq=500, which will likely create huge output files. Set samplefreq so that the job's output will not exceed the maximum size CIPRES allows (8 GB). If you aren't sure what value to use, we recommend that you run for a few thousand generations and monitor the size of the output files using the intermediate files link. If your files grow very quickly, stop the job and edit the input file, setting samplefreq= to a larger value.
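
The short pilot run described above can be turned into a rough estimate. The Python sketch below assumes that output size grows linearly with the number of samples written, which is an approximation; the function name suggest_samplefreq, the safety margin, and the reading of 8 GB as 8,000,000,000 bytes are all assumptions made for illustration.

```python
import math


def suggest_samplefreq(pilot_bytes, pilot_ngen, pilot_samplefreq, full_ngen,
                       max_bytes=8_000_000_000, safety=0.5):
    """Estimate a samplefreq that keeps output under a size budget.

    pilot_bytes is the total output size observed after pilot_ngen
    generations sampled every pilot_samplefreq generations.
    Assumes size is proportional to the number of samples (approximate).
    """
    pilot_samples = pilot_ngen / pilot_samplefreq
    bytes_per_sample = pilot_bytes / pilot_samples
    # Leave headroom: only plan to use a fraction of the allowed size.
    max_samples = (max_bytes * safety) / bytes_per_sample
    needed = math.ceil(full_ngen / max_samples)
    # Never suggest sampling more often than the pilot run did.
    return max(pilot_samplefreq, needed)


# Example: the pilot wrote 2 MB in 10,000 generations at samplefreq=500.
print(suggest_samplefreq(2_000_000, 10_000, 500, 100_000_000))
```

The result is a lower bound on samplefreq; in practice it makes sense to round it up to a convenient value that is also compatible with checkfreq.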

IF YOU WANT TO RESTART MRBAYES: A key issue is to make sure that your checkfreq and samplefreq values are compatible. The checkfreq= setting in your Nexus file sets the number of generations between writes to the .ckp file. This value should always be greater than or equal to samplefreq=, and it must be a whole-number multiple of samplefreq. The default for checkfreq is 2000; the default for samplefreq is 500. For example, if you set checkfreq to 10000 in the mrbayes block of your input file, say for a long run, you should set samplefreq to a value that divides it evenly, such as 1000 or 5000. samplefreq=3000 or samplefreq=7000 will not work, because 10000 is not a whole-number multiple of either.
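
The requirement that checkfreq be a whole-number multiple of samplefreq is easy to test before submitting. The Python sketch below is a minimal check under that reading of the rule; the function name frequencies_compatible is hypothetical.

```python
def frequencies_compatible(samplefreq, checkfreq):
    """Return True if checkfreq is a whole-number multiple of samplefreq."""
    if samplefreq <= 0 or checkfreq <= 0:
        raise ValueError("samplefreq and checkfreq must be positive")
    return checkfreq % samplefreq == 0


# Defaults: checkfreq=2000, samplefreq=500.
print(frequencies_compatible(samplefreq=500, checkfreq=2000))
# A long run with checkfreq=10000.
for s in (1000, 5000, 3000, 7000):
    print(s, frequencies_compatible(samplefreq=s, checkfreq=10000))
```

With checkfreq=10000, the check passes for samplefreq values of 1000 and 5000 and fails for 3000 and 7000.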

Manual for MrBayes 3.2.7: here

MrBayes mail list:

MRBAYES home page here.

INPUT = DNA or protein matrices in Nexus format

Simple Example of Run Input/Output

Input File Type          File Name
input file               infile.nex
example MB block         mbblock.nex

Output File Type         File Name
log file                 log.txt and mrbayeslog.out
sump_output              infile.nex.run1.p
sumt_output              infile.nex.run1.t
all_mcmc_trees           infile.nex.trprobs
partition information
consensus_tree           infile.nex.con
acceptance_ratios        infile.nex.mcmc

Known Issues:

If you use MrBayes here, please cite: Ronquist F, Teslenko M, van der Mark P, Ayres DL, Darling A, Höhna S, Larget B, Liu L, Suchard MA, Huelsenbeck JP. MrBayes 3.2: efficient Bayesian phylogenetic inference and model choice across a large model space. Syst Biol. 2012 May;61(3):539-42. doi: 10.1093/sysbio/sys029. Epub 2012 Feb 22. PMID: 22357727; PMCID: PMC3329765.

If there is a tool or a feature you need, please let us know.
