Trevor Bedford: Genetic Data showing how COVID unfolded in US

2,043 Views | 4 Replies | Last: 5 yr ago by plain_o_llama
HotardAg07
This is a series of 14 tweets from @trvrb. I will copy/paste them for ease of reading on the forum:

We now have enough #SARSCoV2 genomic data from different states to make some broad conclusions about how the #COVID19 epidemic has unfolded in the US.

We see a spectrum where some states had single (early) introductions that fueled the majority of the epidemic, while in others the epidemic appears to be driven by a larger number of separate introductions.

This analysis shows relationships from sequencing viral genomes. The main thing to pay attention to is how cases (dots colored by state) cluster. Patches of cases from the same state indicate local transmission. All figures from @nextstrain: https://nextstrain.org/ncov/north-america

These analyses depend on comparison of sequenced viruses and so equitable sampling over geography and time is necessary to make reliable conclusions. This sampling has improved recently, but there are still gaps in sequencing and caution in interpretation is warranted.

If we look at New York, we see that most infections appear to derive from an introduction from Europe in mid-February (node on the tree labeled "NY"). This introduction rapidly grew to a substantial epidemic focused in NYC, but was frequently introduced to other locations.


While if we look at Washington we see an introduction of a lineage from China that drove much of the early outbreak, but then multiple introductions from the NY clade fueling further transmission chains in March.


California is interesting in that like WA, much of the outbreak is fueled by an early introduction of a lineage from Asia and then later cases derived from repeated introductions from elsewhere in the US.


Contrast this with Wisconsin, in which the outbreak appears driven by a large number of separate introductions from elsewhere in the US and Europe. It lacks the early successful introduction seen in NY, WA and CA.


Louisiana has a remarkably focused outbreak in which most sequenced cases appear to derive from a couple closely related introduction events. This outbreak nests within the genetic diversity seen in NY suggesting a possible transmission route.


Part of the Texas outbreak groups closely with this clade from LA, while other samples from Texas reveal a large number of separate introductions to the state.


Overall, we see substantial mixing of transmission chains across states. This mixing largely occurred during Feb and early March. I'd expect moving forward for there to be more geographically focused transmission chains given reduced travel in the US.

I see continued genetic sequencing as potentially illuminating sources of infection and whether further cases within a state are largely the result of local transmission or of repeated introductions from outside the state.

These inferences are the result of rapid sequencing and sharing by a large number of groups. A huge thank you to ..... [many people].
culdeus
Is there data on whether any of these kill more than the others?
HotardAg07
He has a thread on that as well, from May 5th:

I wanted to address the hypothesis put forward in Korber et al (https://biorxiv.org/content/10.1101/2020.04.29.069054v1) that the mutation in spike protein D614G causes an increase in transmissibility of SARS-CoV-2 virus. I find this hypothesis to be plausible, but far from proven.

I've been watching D614G closely as mutations in spike protein deserve added attention due to spike's role in binding to the human ACE2 receptor.

This D614G mutation occurred in the transmission chain that initially seeded the European outbreak in ~Jan 2020. Almost all viruses possessing this mutation descend from this initial introduction into Europe.


European viruses are enriched for D614G because of a founder effect in which the initial introduction included this mutation. If we look at the geographic distribution of D vs G in sequenced viruses we see Europe with G, Asia largely D and a mix in the US and Australia.


The primary finding of Korber et al is that D614G appears to be increasing in frequency over time in sequenced SARS-CoV-2 genomes. I strongly caution against interpretation of selective effects in the global frequency of D614G.

Its global frequency is heavily confounded with epidemiological circumstance, ie perhaps G is prevalent because it got lucky in the European introduction. However, regional frequencies should be more robust to this confounding (although not perfectly so).

This figure recapitulates the Korber et al findings using @nextstrain. Here, I've shown states in the US and Australia with more than 70 sequences available and our estimate of frequency of D (green) vs G (yellow) from March 1 to April 15.


You can see that in every case the frequency of G increases in the course of these 45 days.
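For anyone wondering what "frequency of D vs G over time" means in practice, here is a minimal sketch. It is not claimed to be the method behind the @nextstrain figure; it simply bins sample records into fixed time windows and computes the share of G in each. The function name, the 15-day window, and the sample records are all made up for illustration.

```python
from collections import defaultdict
from datetime import date


def g_frequency_by_window(samples, start, window_days=15):
    """Share of G among D/G samples in each time window.

    samples: iterable of (collection_date, variant), variant is "D" or "G".
    Returns {window_index: fraction_G}, where window 0 begins at `start`.
    """
    counts = defaultdict(lambda: {"D": 0, "G": 0})
    for collected, variant in samples:
        # Skip records before the start date or with an unknown variant.
        if collected < start or variant not in ("D", "G"):
            continue
        window = (collected - start).days // window_days
        counts[window][variant] += 1

    freqs = {}
    for window in sorted(counts):
        c = counts[window]
        freqs[window] = c["G"] / (c["D"] + c["G"])
    return freqs


# Hypothetical records for one state (NOT real data).
samples = [
    (date(2020, 3, 2), "D"), (date(2020, 3, 5), "D"),
    (date(2020, 3, 9), "G"), (date(2020, 3, 12), "D"),
    (date(2020, 3, 18), "D"), (date(2020, 3, 22), "G"),
    (date(2020, 4, 3), "G"), (date(2020, 4, 10), "G"),
    (date(2020, 4, 12), "D"), (date(2020, 4, 14), "G"),
]

result = g_frequency_by_window(samples, start=date(2020, 3, 1))
for window, freq in result.items():
    print(f"window {window}: G frequency = {freq:.2f}")
```

A rising G share in counts like these is exactly the pattern that, as the thread goes on to say, could come either from repeated introductions or from higher transmissibility; the calculation alone cannot tell the two apart.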


It's still very possible that this pattern could emerge from repeated introductions from Europe / NYC in the US spreading the G variant and multiple introductions from Europe spreading G in Australia.

The alternative explanation put forth by Korber et al is that this pattern is due to the G variant being more transmissible. I think this is possible, but it's difficult to distinguish between these hypotheses with this frequency data alone.

Additional evidence for the hypothesis of a functional effect of D614G comes from Korber et al's observation that the G variant has lower cycle threshold (Ct) value in clinical specimens from Sheffield. This indicates a possible higher viral load in these individuals.
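As a rough guide to why a lower Ct points to a higher viral load: Ct is the PCR cycle at which the signal crosses a detection threshold, and under ideal amplification the target roughly doubles each cycle. A sample that crosses the threshold n cycles earlier therefore started with about 2^n times more template. The sketch below assumes perfect (100%) amplification efficiency, which real assays only approximate; the function name and the example Ct values are hypothetical and are not figures from Korber et al.

```python
def fold_difference(ct_a, ct_b, efficiency=1.0):
    """Approximate ratio of starting template in sample A relative to sample B.

    efficiency=1.0 is the idealized assumption that the target doubles
    every cycle; real assays are usually somewhat less efficient.
    """
    return (1.0 + efficiency) ** (ct_b - ct_a)


# Hypothetical Ct values (NOT from either study): A crosses 2 cycles earlier.
ratio = fold_difference(24.0, 26.0)
print(f"Sample A has roughly {ratio:.0f}x the starting template of sample B")
```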


Thanks to work by @wcassias and @pavitrarc, we see this difference in Ct between D and G replicated in @UWVirology specimens. Preliminary analysis here: https://github.com/blab/ncov-D614G


There are confounders to worry about here as well (primarily time from symptom onset to specimen collection), but I believe replication in two study populations is suggestive of an effect of D vs G on Ct value and possibly viral load.

Both Korber et al and our analysis show no measurable effect on patient outcome. Hence, the hypothesis at this point is entirely in terms of transmissibility rather than severity.

Overall, I would refer everyone to @edyong209's piece on handling uncertainty during the pandemic (https://theatlantic.com/health/archive/2020/04/pandemic-confusing-uncertainty/610819/). I don't agree with takes that there is "no evidence" that G is more transmissible. There is some evidence, but it's far from conclusive.

We have to live with this uncertainty over the functional impact of D614G while more data is gathered. We need:
1. Cell culture studies to demonstrate effect in vitro
2. Further clinical comparisons between patients with D vs G
3. More careful epidemiological analysis
Windy City Ag
Quote:

These analyses depend on comparison of sequenced viruses and so equitable sampling over geography and time is necessary to make reliable conclusions. This sampling has improved recently, but there are still gaps in sequencing and caution in interpretation is warranted.

This is the nuance I have been looking for regarding the "When did it get here" debate. I admit to not fully understanding the process of back-dating, but there was a stance by many posters that we had already learned when it got here based on the genetic data.

This guy seems to be saying that you can't make such confident conclusions without broad-based, multi-geographic sampling, so any conclusion should be taken with a grain of salt.

Detmersdislocatedshoulder
It was here earlier than they say. Not because I had a friend or my uncle had a cough in December, but because it was in Wuhan in October. They didn't shut down air travel until January. How many people left Wuhan and traveled to the US or other countries? Probably millions, so if it is as contagious as they say, and it seems to be, it was already spreading. Not citing a study, but using a little logic.
plain_o_llama
Thanks for pulling all of this information together and posting it here.