Tag Archives: R

eLife paper and video on how HIV treatments affect selective sweeps

15 Feb

Very happy to announce that we have a new paper out and an accompanying video! The paper is about how effective treatments lead to (few) hard selective sweeps and bad treatments lead to soft selective sweeps.

The paper can be found here on the eLife website, but I suggest starting with the video that Alison Feder made.


Paper details

Title: More effective drugs lead to harder selective sweeps in the evolution of drug resistance in HIV-1.

Authors: Alison F Feder, Soo-Yon Rhee, Susan P Holmes, Robert W Shafer, Dmitri A Petrov, Pleuni S Pennings

DOI: http://dx.doi.org/10.7554/eLife.10670

Abstract: In the early days of HIV treatment, drug resistance occurred rapidly and predictably in all patients, but under modern treatments, resistance arises slowly, if at all. The probability of resistance should be controlled by the rate of generation of resistance mutations. If many adaptive mutations arise simultaneously, then adaptation proceeds by soft selective sweeps in which multiple adaptive mutations spread concomitantly, but if adaptive mutations occur rarely in the population, then a single adaptive mutation should spread alone in a hard selective sweep. Here, we use 6717 HIV-1 consensus sequences from patients treated with first-line therapies between 1989 and 2013 to confirm that the transition from fast to slow evolution of drug resistance was indeed accompanied with the expected transition from soft to hard selective sweeps. This suggests more generally that evolution proceeds via hard sweeps if resistance is unlikely and via soft sweeps if it is likely.


R code for calculating Jost D for MtDNA sequences

23 Oct

2018: Updated link to example files and code: DROPBOX.

Earlier I posted a piece of R code to calculate Jost D and Gst and the associated p-values (using a permutation test) for MtDNA sequences. I repost it here in two versions. The first one calculates pairwise values (between all pairs of populations in your data) and the second one calculates one global value. You may wonder how and why we calculate Jost D for sequence data. We did it by reducing the sequences to alleles, so that two individuals either carry the same allele or different alleles. We ignore how different the alleles are from one another. The advantage of this approach is that we can directly compare MtDNA statistics with microsatellite statistics and thus learn something about the differences between male and female dispersal. All of this is described in our JEB paper from 2011.
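To make the "sequences as alleles" idea concrete, here is a minimal sketch in Python (not the R scripts linked below). It treats every distinct sequence as one allele and computes Gst and Jost D from the allele frequencies. The function names are hypothetical, and the formulas are the plain textbook definitions without any sample-size correction, which is an assumption; the R scripts may use different estimators.

```python
from collections import Counter

def allele_freqs(pop):
    """Treat each distinct sequence as one allele and return its frequency."""
    n = len(pop)
    return {allele: count / n for allele, count in Counter(pop).items()}

def diversity_stats(pops):
    """Return (Gst, Jost D) for a list of populations (lists of sequences).

    Assumption: uncorrected definitions,
      Gst = (Ht - Hs) / Ht
      D   = (Ht - Hs) / (1 - Hs) * k / (k - 1)
    with k populations, Hs the mean within-population diversity and
    Ht the diversity of the averaged allele frequencies.
    """
    k = len(pops)
    freqs = [allele_freqs(p) for p in pops]
    alleles = set().union(*freqs)
    hs = sum(1 - sum(f ** 2 for f in fr.values()) for fr in freqs) / k
    ht = 1 - sum((sum(fr.get(a, 0) for fr in freqs) / k) ** 2 for a in alleles)
    if ht == 0:  # every individual carries the same allele
        return 0.0, 0.0
    gst = (ht - hs) / ht
    d = (ht - hs) / (1 - hs) * k / (k - 1)
    return gst, d

# Toy data: the two populations share no alleles at all.
pop_a = ["ACGT", "ACGT", "ACGA", "ACGA"]
pop_b = ["TCGT", "TCGT", "TCGA", "TCGA"]
gst, d = diversity_stats([pop_a, pop_b])
print(round(gst, 3), round(d, 3))  # prints 0.333 1.0
```

Note that only identity matters here: "ACGT" and "ACGA" count as different alleles in exactly the same way as "ACGT" and "TCGA" do.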

R code for global Jost D values

R code for pairwise Jost D values

Here is an example file that works with both of the R scripts.

Both Jost D and Gst depend on the diversity of the markers. For sequence data, this means that the length of the sequences will affect the outcomes. If you use longer sequences, Gst will go down whereas Jost D will go up. This is equivalent to the effects seen in microsatellites. For more variable microsatellites, Gst becomes very small and Jost D becomes high. In the extreme case where every individual carries a different allele, Jost D will be 1, but a simple permutation test shows that the associated p-value is 1 as well.
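The extreme case can be checked with a small sketch, again in Python and not taken from the R scripts. It assumes the uncorrected definition of Jost D and a simple permutation scheme in which individuals are shuffled across populations, with the p-value taken as the fraction of shuffles whose D is at least as large as the observed one. The function names, the number of permutations and the seed are all hypothetical choices. Exact fractions are used so that the comparison with the observed value is not affected by rounding.

```python
import random
from collections import Counter
from fractions import Fraction

def jost_d(pops):
    """Uncorrected Jost D, computed with exact fractions."""
    k = len(pops)
    freqs = [{a: Fraction(c, len(p)) for a, c in Counter(p).items()}
             for p in pops]
    alleles = set().union(*freqs)
    hs = sum(1 - sum(f * f for f in fr.values()) for fr in freqs) / k
    ht = 1 - sum((sum(fr.get(a, 0) for fr in freqs) / k) ** 2 for a in alleles)
    return (ht - hs) / (1 - hs) * Fraction(k, k - 1)

def permutation_p_value(pops, n_perm=200, seed=1):
    """Shuffle individuals across populations and count how often the
    shuffled D is at least as large as the observed D."""
    rng = random.Random(seed)
    observed = jost_d(pops)
    pooled = [ind for p in pops for ind in p]
    sizes = [len(p) for p in pops]
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        shuffled, start = [], 0
        for size in sizes:
            shuffled.append(pooled[start:start + size])
            start += size
        if jost_d(shuffled) >= observed:
            hits += 1
    return observed, hits / n_perm

# Every individual carries its own allele: D is 1, and so is the p-value,
# because every shuffle gives exactly the same D.
unique = [[f"a{i}" for i in range(10)], [f"b{i}" for i in range(10)]]
d_obs, p = permutation_p_value(unique)
print(float(d_obs), p)  # prints 1.0 1.0
```

For comparison, two populations that are each fixed for a different allele also give D = 1, but there almost every shuffle produces a smaller D, so the p-value is very small.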

This figure is similar to figure 4 of our 2011 JEB paper.

It shows how Jost D (indicated by D) and Gst (indicated by G) change with the length of the sequences analyzed. For each sequence length, I repeated the calculations and the permutation test 20 times. The P’s show the fraction of these repeats that led to a significant result. With longer sequences, you are more likely to detect significant population structure. Had the sequences been even longer, then P would have gone down again, because with very long sequences every individual is different.
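Why sequence length matters can be seen in a toy example. The sketch below (Python, with made-up sequences and a hypothetical helper name) cuts each sequence down to its first few sites and counts how many distinct alleles remain. It only illustrates the allele-counting step, not the calculations behind the figure.

```python
def alleles_at_length(seqs, length):
    """Reduce sequences to alleles using only the first `length` sites."""
    return [s[:length] for s in seqs]

# Made-up sequences, purely for illustration.
seqs = ["ACGTAC", "ACGTAT", "ACGAAC", "ACGAAT", "TCGTAC"]

for length in (1, 3, 4, 6):
    n_alleles = len(set(alleles_at_length(seqs, length)))
    print(length, n_alleles)
# prints:
# 1 2
# 3 2
# 4 3
# 6 5
```

At full length every individual in this toy sample carries its own allele, which is the situation in which the permutation test can no longer detect structure.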

There is now an R package, written by David Winter from the University of Otago, that allows you to calculate Jost D and other useful statistics for microsatellite data. Find the paper here and the package here.
