The Exelixis Lab

Enabling Research in Evolutionary Biology

SweeD, a faster version of SweepFinder

We developed SweeD, a parallel and checkpointable tool that implements a composite likelihood ratio test for detecting selective sweeps. SweeD is based on the SweepFinder algorithm (Nielsen et al. 2005). SweeD can also calculate the theoretical SFS of a given demographic model (stepwise changes, or an exponential growth phase plus stepwise changes) using the method of Živković and Stephan (2011).

When using SweeD, please cite the following paper: P. Pavlidis, D. Zivkovic, A. Stamatakis, N. Alachiotis: "SweeD: Likelihood-based detection of selective sweeps in thousands of genomes". In Molecular Biology and Evolution, 2013.

SweeD is numerically more stable than SweepFinder (in terms of floating-point arithmetic operations, and in particular for folded data), and is faster than SweepFinder when the number of sequences is large. SweeD has been tested on simulated datasets with up to 10,000 sequences and 1,000,000 SNPs. The sequential version of SweeD is up to 21 times faster than SweepFinder, depending on the number of SNPs and the number of sequences. The speedup over SweepFinder grows with the number of sequences; for few sequences, SweeD is as fast as SweepFinder. SweeD has also been used to analyze chromosome 1 from the 1000 Genomes Project. That dataset comprises more than 2,000 sequences and about 2,896,000 SNPs, and the analysis required 8 hours and 15 minutes.

You can download the source code of version 3.2.1 here. The most up-to-date version is always available on the Assembla repository of our former postdoc Pavlos Pavlidis.

The manual is available here

The experimental data and scripts used in the manuscript are available for download here

Bug history:
  • v3.1 Fixed bug in the VCF file format parser that was associated with handling missing
  • v3.1.1 Changed a default parameter value
  • v3.2.1 Minor bug fixes; added the option -strictMonomorphic, which removes sites that can potentially be monomorphic. This happens when (x == 0 || x == n) && n < sequences
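
The condition behind -strictMonomorphic can be illustrated with a minimal Python sketch. This is not SweeD's own code. It assumes that x is the count of one allele among the n sequences with valid data at a site, and that sequences is the total sample size; the function names and the (position, x, n) tuple layout are hypothetical and chosen only for illustration.

```python
def potentially_monomorphic(x, n, sequences):
    """Return True when the site matches (x == 0 || x == n) && n < sequences.

    Assumed meaning (not confirmed by the SweeD sources):
      x          count of one allele among the valid calls at the site
      n          number of sequences with valid (non-missing) data
      sequences  total number of sequences in the sample
    """
    return (x == 0 or x == n) and n < sequences


def strict_monomorphic_filter(sites, sequences):
    """Keep only the sites that do not match the condition.

    `sites` is a list of hypothetical (position, x, n) tuples.
    """
    return [site for site in sites
            if not potentially_monomorphic(site[1], site[2], sequences)]


if __name__ == "__main__":
    total = 10
    sites = [
        (100, 0, 8),    # no copies of the allele, 2 sequences missing: removed
        (200, 3, 8),    # polymorphic among the valid calls: kept
        (300, 10, 10),  # no missing data, so n < sequences is false: kept
        (400, 8, 8),    # all valid calls carry the allele, 2 missing: removed
    ]
    kept = strict_monomorphic_filter(sites, total)
    print([position for position, _, _ in kept])
```

Under these assumptions, a site is dropped only when every observed call agrees and some calls are missing, so the missing sequences could hide the variation or the site could be truly monomorphic.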