CIPRES team describes parallel computing model for Multiple Instruction Multiple Data (MIMD) architecture; outperforms best RECIDCM3 score to date.
Du Z, Lin F, Roshan UW. Reconstruction of large phylogenetic trees: A parallel approach. [Comput Biol Chem. 2005 Aug;29(4):273-80]: Presents a parallel computing model for reconstructing phylogenetic trees for very large datasets using the widely used Multiple Instruction Multiple Data (MIMD) architecture. The model adapts the recursive-DCM3 decomposition method of Roshan et al to divide datasets into smaller subproblems, and "greatly reduces the computational time of the sequential version of the program." In a case study, the parallel approach took 22 hours on four processors to outperform the best score to date, according to the authors.
Abstract: Reconstruction of phylogenetic trees for very large datasets is a known example of a computationally hard problem. In this paper, we present a parallel computing model for the widely used Multiple Instruction Multiple Data (MIMD) architecture. Following the idea of divide-and-conquer, our model adapts the recursive-DCM3 decomposition method [Roshan, U., Moret, B.M.E., Williams, T.L., Warnow, T, (2004). Performance of suptertree methods on various dataset decompositions. In: Binida-Emonds, O.R.P. (Eds.), Phylogenetic Supertrees: Combining Information to Reveal the Tree of Life, vol. 3 of Computational Biology, Kluwer Academics, pp. 301-328; Roshan, U., Moret, B.M.E., Williams, T.L., Warnow, T., (2004). Rec-I-DCM3: A Fast Algorithmic Technique for reconstructing large phylogenetic trees, Proceedings of the IEEE Computational Systems Bioinformatics Conference (ICSB)] to divide datasets into smaller sub-problems. It distributes computation load over multiple processors so that each processor constructs sub-trees on each sub-problem within a batch in parallel. It finally collects the resulting trees and merges them into a supertree. The proposed model is flexible as far as methods for dividing and merging datasets are concerned. We show that our method greatly reduces the computational time of the sequential version of the program. As a case study, our parallel approach only takes 22.1 h on four processors to outperform the best score to date (Found at 123.7 h by the Rec-I-DCM3 program [Roshan, U., Moret, B.M.E., Williams, T.L., Warnow, T, 2004a. Performance of supertree methods on various dataset decompositions. In: Binida-Emonds, O.R.P. (Eds.), Phylogenetic Supertrees: Combining Information to Reveal the Tree of Life, vol. 3 of Computational Biology, Kluwer Academics, pp. 301-328; Roshan, U., Morel, B.M.E., Williams, T.L., Warnow, T., 2004b. Rec-I-DCM3: A Fast Algorithmic Technique for reconstructing large phylogenetic trees, Proceedings of the IEEE Computational Systems Bioinformatics Conference (ICSB)] on one dataset. Developed with the standard message-passing library, MPI, the program can be recompiled and run on any MIMD systems. (c) 2005 Elsevier Ltd. All rights reserved.
Author Keywords: phylogenetic tree; maximum parsimony; parallel; divide-and-conquer; MIMD
KeyWords Plus: MAXIMUM-LIKELIHOOD APPROACH; SEQUENCES; DATABASE; VERSION
Addresses: Du ZH (reprint author), Nanyang Technol Univ, BioInformat Res Ctr, Nanyang Ave, Singapore, 639798 Singapore
Nanyang Technol Univ, BioInformat Res Ctr, Singapore, 639798 Singapore
New Jersey Inst Technol, Dept Comp Sci, Coll Comp Sci, Newark, NJ 07102 USA
E-mail Addresses: duzhihua@pmail.ntu.edu.sg
Publisher: ELSEVIER SCI LTD, THE BOULEVARD, LANGFORD LANE, KIDLINGTON, OXFORD OX5 1GB, OXON, ENGLAND
Subject Category: COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS; BIOLOGY
IDS Number: 956XI
ISSN: 1476-9271

