GARLI is run from two pre-created XML files in the CIPRES Portal.
The CIPRES Portal currently runs GARLI on protein or DNA/RNA sequences.
You can submit jobs using the GTR+I+Gamma or GTR+Gamma models.
The command file for GARLI GTR+Gamma differs from the GTR+I+Gamma file by a single line:
<commands>invariantsites = estimate ;</commands> for GTR+I+Gamma is replaced by
<commands>invariantsites = none ;</commands> for GTR+Gamma.
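As an illustration of the one-line difference, the following Python sketch derives the GTR+Gamma text from GTR+I+Gamma text by rewriting the invariantsites line. The function name `to_gtr_gamma` and the two-line sample are hypothetical; this is not how the CIPRES Portal itself builds the files.

```python
import re

def to_gtr_gamma(xml_text):
    """Return a copy of the command text with invariantsites set to none."""
    # Match the whole <commands> element so nothing else is touched.
    pattern = re.compile(
        r"(<commands>\s*invariantsites\s*=\s*)\w+(\s*;\s*</commands>)"
    )
    new_text, count = pattern.subn(r"\g<1>none\g<2>", xml_text)
    if count != 1:
        raise ValueError(f"expected one invariantsites line, found {count}")
    return new_text

# Hypothetical two-line excerpt of a GTR+I+Gamma command file.
example = (
    "<commands>ratehetmodel = gamma ;</commands>\n"
    "<commands>invariantsites = estimate ;</commands>\n"
)
converted = to_gtr_gamma(example)
print(converted)
```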
The XML command file for GARLI GTR+I+Gamma is shown below:
<?xml version="1.0" encoding="UTF-8"?>
<command-object>
<command-target>
<reference>
<registry-info>cipres0,IDL:CipresIDL_api1/TreeImprove:1.0,GARLI,Python wrapper (by Mark Holder) around Derrick Zwickl's GARLI (version 0.951).</registry-info>
</reference>
<ui-id>GARLI</ui-id>
</command-target>
<commands>randseed = -1 ;</commands>
<commands>availablememory = 2560 ;</commands>
<commands>ignoreStartEdgeLen=1 ;</commands>
<commands>refinestart = 1 ;</commands>
<commands>enforcetermconditions = 1 ;</commands>
<commands>genthreshfortopoterm = 10000 ;</commands>
<commands>significanttopochange = 0.01 ;</commands>
<commands>scorethreshforterm = 0.05 ;</commands>
<commands>ratematrix = 6rate ;</commands>
<commands>statefrequencies = estimate ;</commands>
<commands>ratehetmodel = gamma ;</commands>
<commands>invariantsites = estimate ;</commands>
<commands>nindivs = 4 ;</commands>
<commands>holdover = 1 ;</commands>
<commands>selectionintensity = 0.5 ;</commands>
<commands>holdoverpenalty = 0 ;</commands>
<commands>topoweight = 1.0 ;</commands>
<commands>randnniweight = 0.1 ;</commands>
<commands>randsprweight = 0.3 ;</commands>
<commands>limsprweight = 0.6 ;</commands>
<commands>limsprrange = 6 ;</commands>
<commands>uniqueswapbias = 0.01 ;</commands>
<commands>distanceswapbias = 1.0 ;</commands>
<commands>modweight = 0.05 ;</commands>
<commands>gammashapemodel = 1000 ;</commands>
<commands>brlenweight = 0.2 ;</commands>
<commands>meanbrlenmuts = 5 ;</commands>
<commands>gammashapebrlen = 1000 ;</commands>
<commands>intervallength = 100 ;</commands>
<commands>intervalstostore = 5 ;</commands>
<commands>startoptprec = 0.5 ;</commands>
<commands>minoptprec = 0.01 ;</commands>
<commands>numberofprecreductions = 10 ;</commands>
<commands>treerejectionthreshold = 50 ;</commands>
</command-object>
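To inspect the settings in a file of this shape, each `<commands>` element can be split into a flag name and a value. The sketch below uses Python's standard `xml.etree.ElementTree` on a shortened, hypothetical sample; `read_settings` is an illustrative helper, not part of GARLI or CIPRES.

```python
import xml.etree.ElementTree as ET

# Shortened sample in the same shape as the command file above.
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<command-object>
  <command-target><ui-id>GARLI</ui-id></command-target>
  <commands>randseed = -1 ;</commands>
  <commands>ignoreStartEdgeLen=1 ;</commands>
  <commands>ratematrix = 6rate ;</commands>
</command-object>
"""

def read_settings(xml_bytes):
    """Map each flag name to its value (as a string)."""
    root = ET.fromstring(xml_bytes)
    settings = {}
    for node in root.iter("commands"):
        # Drop surrounding whitespace and the trailing semicolon.
        text = (node.text or "").strip().rstrip(";")
        key, _, value = text.partition("=")
        settings[key.strip()] = value.strip()
    return settings

settings = read_settings(SAMPLE)
print(settings)
```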
Each line in the XML files sets a flag for the program. Each of the flags is described below, and more details are provided in the Manual [pdf].
<commands>randseed = -1 ;</commands>
randseed – The random number seed used by the random number generator. Specify -1 to have a seed chosen for you. Specifying the same seed number in multiple runs will result in exactly identical runs, if all other parameters are also identical.
<commands>availablememory = 2560 ;</commands>
availablememory – Typically this is the amount of available physical memory on the system, in megabytes. This lets GARLI determine how much system memory it may be able to use to store computations for reuse.
<commands>ignoreStartEdgeLen=1 ;</commands>
ignoreStartEdgeLen (1 or 0) – Specifies whether the starting tree branch lengths are kept (1) or ignored (0).
<commands>refinestart = 1 ;</commands>
refinestart (0 or 1, 1) – Specifies whether some initial rough optimization is performed on the starting branch lengths and alpha parameter. This is always recommended.
<commands>enforcetermconditions = 1 ;</commands>
enforcetermconditions (0 or 1, 1) – Specifies whether the automatic termination conditions will be used. The conditions set by genthreshfortopoterm and scorethreshforterm must both be met; see those parameters below for their definitions. If this is false, the run will continue until it reaches the time (stoptime) or generation (stopgen) limit. It is highly recommended that this option be used!
<commands>genthreshfortopoterm = 10000 ;</commands>
genthreshfortopoterm (0 to infinity, 10,000) – This specifies the first part of the termination condition. When no new significantly better scoring topology (see significanttopochange below) has been encountered in greater than this number of generations, this condition is met. Increasing this parameter may improve the lnL scores obtained (especially on large datasets), but will also increase runtimes.
<commands>significanttopochange = 0.01 ;</commands>
significanttopochange (0 to infinity, 0.01) – The lnL increase required for a new topology to be considered significant as far as the termination condition is concerned. This was fixed at 0.01 in version 0.93, but is now controllable. It probably doesn’t need to be played with, but you might try increasing it slightly if your runs reach a stable score and then take a very long time to terminate due to very minor changes in topology.
<commands>scorethreshforterm = 0.05 ;</commands>
scorethreshforterm (0 to infinity, 0.05) – The second part of the termination condition. When the total improvement in score over the last intervallength x intervalstostore generations (see below) is less than this value, this condition is met. This does not usually need to be changed.
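The two-part termination rule described above can be summarized in a few lines of Python. This is only a sketch of the logic as stated in the text; the function and argument names are hypothetical, and GARLI's internal bookkeeping is not shown here.

```python
def should_terminate(gens_since_topo_improvement, recent_score_gain,
                     genthreshfortopoterm=10000, scorethreshforterm=0.05):
    """Return True only when both termination conditions hold."""
    # Part 1: no significantly better topology for more than
    # genthreshfortopoterm generations.
    topo_met = gens_since_topo_improvement > genthreshfortopoterm
    # Part 2: total lnL improvement over the last
    # intervallength x intervalstostore generations is below the threshold.
    score_met = recent_score_gain < scorethreshforterm
    return topo_met and score_met

print(should_terminate(12000, 0.01))  # both conditions met
print(should_terminate(12000, 0.20))  # score still improving
```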
<commands>ratematrix = 6rate ;</commands>
ratematrix (1rate, 2rate, 6rate, fixed) – The number of relative substitution rates estimated. Equivalent to the “nst” setting in PAUP* and MrBayes. 1rate assumes that substitutions between all pairs of nucleotides occur at the same rate, 2rate allows different rates for transitions and transversions, and 6rate allows a different rate between each nucleotide pair. These rates are estimated unless the fixed option is chosen.
<commands>statefrequencies = estimate ;</commands>
statefrequencies (equal, empirical, estimate, fixed) – Specifies how the equilibrium state frequencies (of A, C, G and T) are treated. The empirical setting fixes the frequencies at their observed proportions, and the other options should be self-explanatory.
<commands>ratehetmodel = gamma ;</commands>
ratehetmodel (none, gamma, gammafixed) – The model of rate heterogeneity assumed. gammafixed requires that the alpha shape parameter is provided, and gamma estimates it.
<commands>invariantsites = estimate ;</commands>
invariantsites (none, estimate, fixed) – Specifies whether a parameter representing the proportion of sites that are unable to change will be included. (Note that this option replaces the previous dontinferproportioninvariant option of version 0.94.)
<commands>nindivs = 4 ;</commands>
nindivs (2 to 100, 4) – The number of individuals in the population. This may be increased, but generally seems to slow the rate of score increase.
<commands>holdover = 1 ;</commands>
holdover (1 to nindivs-1, 1) – The number of times the best individual is copied to the next generation with no chance of mutation. It is best not to mess with this.
<commands>selectionintensity = 0.5 ;</commands>
selectionintensity (0.01 to 5.0, 0.5) - Controls the strength of selection, with larger numbers denoting stronger selection. The relative probability of reproduction of two individuals depends on the difference in their log likelihoods (ΔlnL) and is formulated very similarly to the procedure of calculating Akaike weights.
<commands>holdoverpenalty = 0 ;</commands>
holdoverpenalty (0 to 100, 0) – This can be used to bias the probability of reproduction of the best individual downward. Because the best individual is automatically copied into the next generation, it has a bit of an unfair advantage and can cause all population variation to be lost due to drift, especially with small population sizes. The value specified here is subtracted from the best individual’s lnL score before calculating the probabilities of reproduction. It seems plausible that this might help maintain variation, but I have not seen it cause a measurable effect.
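The text does not give the exact selection formula, so the following Python sketch is an assumption: it uses an Akaike-weight-like form in which each individual's relative weight is exp(-selectionintensity x ΔlnL), with ΔlnL measured from the best (penalty-adjusted) score. The function name `reproduction_probs` is hypothetical, and GARLI's actual calculation may differ in detail.

```python
import math

def reproduction_probs(lnls, selectionintensity=0.5, holdoverpenalty=0.0):
    """Assumed Akaike-weight-like reproduction probabilities."""
    best = max(range(len(lnls)), key=lambda i: lnls[i])
    adjusted = list(lnls)
    # The penalty is subtracted from the best individual's lnL only.
    adjusted[best] -= holdoverpenalty
    top = max(adjusted)
    # ASSUMPTION: weight = exp(-intensity * difference from the top score).
    weights = [math.exp(-selectionintensity * (top - x)) for x in adjusted]
    total = sum(weights)
    return [w / total for w in weights]

scores = [-1000.0, -1002.0, -1004.0]
plain = reproduction_probs(scores)
penalized = reproduction_probs(scores, holdoverpenalty=1.0)
print(plain, penalized)
```

Under this assumed form, a larger selectionintensity concentrates reproduction on the best individual, and a non-zero holdoverpenalty lowers the best individual's share.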
<commands>topoweight = 1.0 ;</commands>
topoweight (0 to infinity, 1.0) – The prior weight assigned to the class of topology mutations (NNI, SPR and limSPR).
<commands>randnniweight = 0.1 ;</commands>
randnniweight (0 to infinity, 0.1) – The prior weight assigned to NNI mutations.
<commands>randsprweight = 0.3 ;</commands>
randsprweight (0 to infinity, 0.3) – The prior weight assigned to random SPR mutations. For very large datasets it is often best to set this to 0.0, as random SPR mutations essentially never result in score increases.
<commands>limsprweight = 0.6 ;</commands>
limsprweight (0 to infinity, 0.6) – The prior weight assigned to SPR mutations with the reconnection branch limited to being a maximum of limsprrange branches away from where the branch was detached.
<commands>limsprrange = 6 ;</commands>
limsprrange (0 to infinity, 6) – The maximum number of branches away from its original location that a branch may be reattached during a limited SPR move. Setting this too high (> 10) can seriously degrade performance.
<commands>uniqueswapbias = 0.01 ;</commands>
uniqueswapbias (0.01 to 1.0, 0.1) – As of version 0.95, GARLI keeps track of which branch swaps it has attempted on the current best tree. Because swaps are applied randomly, it is possible that some swaps are tried twice before others are tried at all. This option allows the program to bias the swaps applied toward those that have not yet been attempted. Each swap is assigned a relative weight depending on the number of times that it has been attempted on the current best tree. This weight is equal to (uniqueswapbias) raised to the (# times swap attempted) power. In other words, a value of 0.5 means that swaps that have already been tried once will be half as likely as those not yet attempted, swaps attempted twice will be ¼ as likely, etc. A value of 1.0 means no biasing. If this value is not equal to 1.0 and the outputmostlyuselessfiles option is on, a file called <ofprefix>.swap.log is output. This file shows the total number of rearrangements tried and the number of unique ones over the course of a run. Note that this bias is only applied to NNI and limSPR rearrangements. Use of this option may allow the use of somewhat larger values of limsprrange.
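The weighting rule stated above, (uniqueswapbias) raised to the number of previous attempts, can be checked with a short Python sketch. The function name is hypothetical and only the formula from the text is implemented.

```python
def unique_swap_weight(uniqueswapbias, times_attempted):
    """Relative weight of a swap already tried times_attempted times."""
    return uniqueswapbias ** times_attempted

# With a bias of 0.5, each earlier attempt halves the relative weight.
for attempts in range(4):
    print(attempts, unique_swap_weight(0.5, attempts))
```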
<commands>distanceswapbias = 1.0 ;</commands>
distanceswapbias (0.1 to 10, 1.0) – This option is similar to uniqueswapbias, except that it biases toward certain swaps based on the topological distance between the initial and rearranged trees. The distance is measured as in limsprrange, and is half the Robinson-Foulds distance between the trees. As with uniqueswapbias, distanceswapbias assigns a relative weight to each potential swap. In this case the weight is (distanceswapbias) raised to the (reconnection distance - 1) power. Thus, given a value of 0.5, the weight of an NNI is 1.0, the weight of an SPR with distance 2 is 0.5, with distance 3 is 0.25, etc. Note that values less than 1.0 bias toward more localized swaps, while values greater than 1.0 bias toward more extreme swaps. Also note that this bias is only applied to limSPR rearrangements. Be careful in setting this, as extreme values can have a very large effect.
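The distance-based weight, (distanceswapbias) raised to the (reconnection distance - 1) power, is sketched below in Python. The function name is hypothetical; an NNI is treated as reconnection distance 1, as in the worked values in the text.

```python
def distance_swap_weight(distanceswapbias, reconnection_distance):
    """Relative weight of a swap at the given reconnection distance."""
    return distanceswapbias ** (reconnection_distance - 1)

# Values below 1.0 favor local swaps; values above 1.0 favor distant ones.
for bias in (0.5, 1.0, 2.0):
    print(bias, [distance_swap_weight(bias, d) for d in (1, 2, 3)])
```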
<commands>modweight = 0.05 ;</commands>
modweight (0 to infinity, 0.05) – The prior weight assigned to the class of model mutations. Note that setting this at 0.0 fixes the model during the run.
<commands>gammashapemodel = 1000 ;</commands>
gammashapemodel (50 to 2000, 1000) – The shape parameter of the gamma distribution (with a mean of 1.0) from which the model mutation multipliers are drawn for model parameter mutations. Larger numbers cause smaller changes in model parameters. (Note that this has nothing to do with gamma rate heterogeneity.)
<commands>brlenweight = 0.2 ;</commands>
brlenweight (0 to infinity, 0.2) – The prior weight assigned to branch-length mutations.
<commands>meanbrlenmuts = 5 ;</commands>
meanbrlenmuts (1 to # taxa, 5) – The mean of the binomial distribution from which the number of branch lengths mutated is drawn during a branch length mutation.
<commands>gammashapebrlen = 1000 ;</commands>
gammashapebrlen (50 to 2000, 1000) – The shape parameter of the gamma distribution (with a mean of 1.0) from which the branch-length multipliers are drawn for branch-length mutations. Larger numbers cause smaller changes in branch lengths. (Note that this has nothing to do with gamma rate heterogeneity.)
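To see why larger shape values give smaller changes (for both gammashapemodel and gammashapebrlen), note that a gamma distribution with shape k and mean 1.0 has variance 1/k. The Python sketch below draws multipliers with the standard library; `draw_multiplier` is a hypothetical helper, and how GARLI applies the multipliers is not shown.

```python
import random
import statistics

def draw_multiplier(shape, rng):
    """Draw a multiplier from a gamma distribution with mean 1.0."""
    # random.gammavariate(alpha, beta) has mean alpha * beta,
    # so beta = 1 / shape fixes the mean at 1.0.
    return rng.gammavariate(shape, 1.0 / shape)

rng = random.Random(42)
tight = [draw_multiplier(1000, rng) for _ in range(2000)]
loose = [draw_multiplier(50, rng) for _ in range(2000)]

mean_1000 = statistics.mean(tight)
mean_50 = statistics.mean(loose)
spread_1000 = statistics.pstdev(tight)  # theoretical value about 0.03
spread_50 = statistics.pstdev(loose)    # theoretical value about 0.14
print(mean_1000, spread_1000, mean_50, spread_50)
```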
<commands>intervallength = 100 ;</commands>
intervallength (10 to 1000, 100) – The number of generations in each interval during which the number and benefit of each mutation type are stored.
<commands>intervalstostore = 5 ;</commands>
intervalstostore (1 to 10, 5) – The number of intervals to be stored. Thus, records of mutations are kept for the last (intervallength x intervalstostore) generations. Every intervallength generations the probabilities of the mutation types are updated by the scheme described above.
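With the values in the command file, records cover the last 100 x 5 = 500 generations. The Python sketch below shows one way such a rolling record could be kept; the per-interval gains are made-up numbers, and the use of a fixed-length deque is an illustrative assumption rather than GARLI's implementation.

```python
from collections import deque

intervallength = 100
intervalstostore = 5
window_generations = intervallength * intervalstostore

# Keep only the most recent intervalstostore intervals.
recent_gains = deque(maxlen=intervalstostore)
for gain in [0.9, 0.4, 0.2, 0.05, 0.01, 0.005, 0.002]:  # hypothetical data
    recent_gains.append(gain)

# Total improvement over the stored window of generations.
total_recent_gain = sum(recent_gains)
print(window_generations, list(recent_gains), total_recent_gain)
```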
<commands>startoptprec = 0.5 ;</commands>
startoptprec (0.005 to 5.0, 0.5) – The beginning optimization precision.
<commands>minoptprec = 0.01 ;</commands>
minoptprec (0.001 to startoptprec, 0.01) – The minimum allowed value of the optimization precision.
<commands>numberofprecreductions = 10 ;</commands>
numberofprecreductions (0 to 100, 40) – Specify the number of steps that it will take for the optimization precision to decrease from startoptprec to minoptprec. In version 0.95, the reduction from startoptprec to minoptprec is now linear, rather than geometric.
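A linear reduction means the precision drops by the same amount at every step. The Python sketch below lists the precision values for the settings in the command file; `precision_schedule` is a hypothetical name, the case of zero reductions is not covered, and when GARLI triggers each reduction is not described here.

```python
def precision_schedule(startoptprec=0.5, minoptprec=0.01,
                       numberofprecreductions=10):
    """Precision values for an equal-step (linear) reduction."""
    if numberofprecreductions < 1:
        raise ValueError("this sketch needs at least one reduction")
    step = (startoptprec - minoptprec) / numberofprecreductions
    # Start value plus one entry per reduction step.
    return [startoptprec - i * step
            for i in range(numberofprecreductions + 1)]

schedule = precision_schedule()
print([round(p, 3) for p in schedule])
```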
<commands>treerejectionthreshold = 50 ;</commands>
treerejectionthreshold (0 to 500, 50) – This setting controls which trees have more extensive branch-length optimization applied to them. All trees created by a branch swap receive optimization on a few branches that directly took part in the rearrangement. If the difference in score between the partially optimized tree and the best known tree is greater than treerejectionthreshold, no further optimization is applied to the branches of that tree. Reducing this value can significantly reduce runtimes, often with little or no effect on results. However, it is possible that a better tree could be missed if this is set too low. I recommend a fairly conservative (large) value.
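The rejection rule can be read as a simple comparison of lnL scores. The Python sketch below is an illustration with hypothetical names and scores, not GARLI's code.

```python
def gets_further_optimization(best_lnl, partial_lnl,
                              treerejectionthreshold=50.0):
    """True if the partially optimized tree is close enough to the best."""
    # Trees scoring more than the threshold below the best known tree
    # receive no further branch-length optimization.
    return (best_lnl - partial_lnl) <= treerejectionthreshold

print(gets_further_optimization(-10000.0, -10030.0))  # within 50 lnL units
print(gets_further_optimization(-10000.0, -10080.0))  # too far behind
```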

