CIPRES

PREQUAL on XSEDE

PREQUAL (PRE-alignment QUALity filter) provides a command line tool for examining a group of homologous sequences and filtering (removing or masking) characters or stretches within individual sequences that are unlikely to share a common ancestor (homology) with any character in any other sequence in the group. In practice this filtering targets stretches that can be considered sequence-specific insertions, including sequencing errors, inversions, frameshifts, and anything else that leaves a stretch of a sequence with no simple homology to a stretch in any other sequence. PREQUAL works on unaligned sequences; in other words, it filters sequences before multiple sequence alignment. It uses a probabilistic modelling approach that does not assume any fixed sequence alignment. PREQUAL works with amino acid sequences, but can also handle DNA sequences of protein-coding genes. The intent is comparable to Guidance2, but PREQUAL uses a different strategy. The manual provides an overview of the methodology, including its strengths and weaknesses, and explains how to apply it to your data.

CIPRES allows you to choose the memory for the run so that larger datasets can be analyzed. However, increasing memory will increase the cost of the run in CPU hours.
This is because more cores must be idled in the node to provide the memory you request.
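
The relationship between requested memory and cost can be illustrated with a rough sketch. The node size used here (128 cores, 256 GB) and the charging rule (one core reserved per share of node memory) are assumptions made for illustration only; they are not taken from CIPRES or XSEDE documentation, and the function name is hypothetical.

```python
import math


def charged_core_hours(mem_gb, wall_hours, node_cores=128, node_mem_gb=256):
    """Estimate core-hours charged when the memory request drives the core count.

    node_cores and node_mem_gb are hypothetical values, not real CIPRES settings.
    """
    mem_per_core = node_mem_gb / node_cores
    # Reserve enough cores to cover the memory, capped at one whole node.
    cores_reserved = min(node_cores, max(1, math.ceil(mem_gb / mem_per_core)))
    return cores_reserved * wall_hours


# Same 2-hour run, two different memory requests.
print(charged_core_hours(8, 2))   # 4 cores reserved for 2 hours
print(charged_core_hours(64, 2))  # 32 cores reserved for 2 hours
```

Under these assumed numbers, raising the request from 8 GB to 64 GB multiplies the charge by eight even though the run time is unchanged.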

Input files: PREQUAL accepts as input a FASTA file of unaligned sequences. It is not intended for use on alignments. For aligned sequences, consider Divvier.
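
Because the input must be unaligned, a quick check before uploading can help catch an alignment submitted by mistake. The following is a minimal sketch, not part of PREQUAL or CIPRES: the helper names are hypothetical, and the test (presence of the gap character "-") is a simple heuristic that assumes alignments mark gaps this way.

```python
def read_fasta(lines):
    """Parse FASTA text lines into a dict of {header: sequence}."""
    records = {}
    name = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].strip()
            records[name] = ""
        elif name is not None:
            records[name] += line
    return records


def looks_aligned(records):
    """Heuristic: treat any '-' gap character as a sign of an alignment."""
    return any("-" in seq for seq in records.values())


# Small in-memory example; with a real file, pass an open file handle instead.
example = [">seqA", "MKTAYIAK", ">seqB", "MKT--IAK"]
if looks_aligned(read_fasta(example)):
    print("Input appears to be aligned; PREQUAL expects unaligned sequences.")
```

To check a file on disk, the same functions can be called as `looks_aligned(read_fasta(open("prequal_infile.fas")))`.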

Output files:
The table below shows the kinds of results returned by CIPRES Science Gateway:

Input File Names	Sample File from a Test
input file (unaligned fasta)	prequal_infile.fas

Sample Output File Type	File Name
Output, filtered sequences in fasta	output.prequal.fas
detail output	output.prequal.detail.txt
Warning output file	output.prequal.warning.txt
log file	stdout.prequal.txt

If you use PREQUAL here, please cite:

Simon Whelan, Iker Irisarri, Fabien Burki, PREQUAL: detecting non-homologous characters in sets of unaligned homologous sequences, Bioinformatics, Volume 34, Issue 22, 15 November 2018, Pages 3929–3930, doi: 10.1093/bioinformatics/bty448

If there is a tool or a feature you need, please let us know.
