20 January 2021

Accelerated evolution of SARS-CoV-2 in a person with immunodeficiency (with hands-on exercise)

 

 

 

Corona Update 20 January 2021




People with weakened immune systems are at higher risk of getting severely sick from SARS-CoV-2, the virus that causes COVID-19 [1]. They may also remain infectious for longer than others with COVID-19. Recently, a group of 32 researchers published the results of viral genome sequencing in an immunocompromised patient with COVID-19 [2].

To my knowledge, this is the first study in which enough whole-genome viral sequences were obtained from a single patient to trace the evolution of SARS-CoV-2 in one individual over a longer period of time. During that period the patient received anti-viral therapy, including two remdesivir treatments and an antibody cocktail against the SARS-CoV-2 spike protein (Regeneron) [3].

In the study, 9 whole-genome viral sequences, 19 RT-PCR tests and a number of quantitative SARS-CoV-2 viral load assays were obtained over a period of 154 days (5 months). After many clinical complications such as hypoxemia, and after the remdesivir treatments, the patient died on day 154 from shock and respiratory failure.

In this blog I will focus on the whole genome sequences. The authors did a phylogenetic analysis of the viral sequences which was consistent with persistent infection and accelerated evolution of the virus. 

The authors conclude: "Although most immunocompromised persons effectively clear SARS-CoV-2 infection, this case highlights the potential for persistent infection and accelerated viral evolution associated with an immunocompromised state".

This situation may be comparable to the continued use of antibiotics in humans or animals to combat bacterial infections. When the selection pressure is not too strong, not every microbe is killed. A few lucky mutants escape and multiply. When the 'right' mutations occur, antibiotic resistance soon develops [8].

 

The mutations...

For those who insist on knowing the exact mutations, here they are:

Fig 1a. Partial Spike sequences (continued below).

Fig 1b. Partial Spike sequences (continued).
Amino Acids in colored letters. Dots: identical AA.
Numbers above are positions in Spike protein
For complete Spike sequence see below: How to ...
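
The dot notation used in the figure (a dot where the amino acid is identical to the top row) can be reproduced with a few lines of Python. This is only a minimal sketch, assuming that both sequences are already aligned and equally long, with deleted positions written as '-'. The function name `dot_row` and the two short example strings are hypothetical and are not real Spike data.

```python
def dot_row(reference: str, variant: str) -> str:
    """Return the variant row with '.' wherever it matches the reference."""
    if len(reference) != len(variant):
        raise ValueError("sequences must be aligned to the same length")
    return "".join("." if r == v else v for r, v in zip(reference, variant))


# Hypothetical toy sequences, NOT real Spike data
reference = "ACDEFGHIKL"
variant = "ACDQFG--KL"  # one substitution (E -> Q) and a deletion of 2 AA

print(reference)
print(dot_row(reference, variant))
```

Only the positions that differ from the reference remain visible, which makes substitutions and deletions easy to spot in a long sequence.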

 

Generally, mutations originate in random places in the genome. But the authors found that amino acid changes were predominantly in the Spike gene (S) and the receptor-binding domain (RBD), which make up 13% and 2% of the viral genome, respectively, but harboured 57% and 38% of the observed changes. That means the number of mutations in the Spike protein was much higher than expected by chance alone. Selection must have taken place: negative (removing mutated variants) and positive (amplifying mutated variants) [4].
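
How much higher than expected? A rough back-of-the-envelope calculation, using only the four percentages quoted above. The assumption (mine, not the authors') is that under pure chance the changes would be spread evenly over the genome, so a region covering 13% of the genome would receive about 13% of the changes. The function name `fold_enrichment` is just an illustrative name.

```python
# Percentages quoted in the text above (from publication [2])
regions = {
    "Spike gene (S)": {"genome_share": 13, "change_share": 57},
    "RBD": {"genome_share": 2, "change_share": 38},
}


def fold_enrichment(genome_share: float, change_share: float) -> float:
    """Observed share of changes divided by the share expected by chance."""
    return change_share / genome_share


for name, r in regions.items():
    fold = fold_enrichment(r["genome_share"], r["change_share"])
    print(f"{name}: {fold:.1f} times more changes than expected by chance")
```

Under this simple assumption the Spike gene carries roughly 4 times, and the RBD roughly 19 times, as many changes as chance alone would predict.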

Several synonymous as well as non-synonymous mutations in the virus were detected, as well as contiguous deletions of 3, 4 and 7 amino acids (AA). One variant has a 7 AA and a 3 AA deletion (see Fig 1a), so its sequence is 10 AA or 30 nt shorter. Apparently, this is not very harmful to the virus. Remarkably, all mutations in the Spike protein are non-synonymous (see figures), which means that amino acids are replaced. In synonymous mutations the amino acid does not change, so they are of little consequence.
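
The difference between synonymous and non-synonymous mutations, and the conversion from amino acids to nucleotides, can be illustrated with a small sketch. The codon table below is only a tiny excerpt of the standard genetic code, and the example codons are chosen for illustration; they are an assumption and not the actual codons found in the patient's virus.

```python
# Tiny excerpt of the standard genetic code, enough for this demo
CODON_TABLE = {
    "GAA": "E", "GAG": "E",  # both code for glutamic acid (E)
    "GCA": "A",              # alanine (A)
}


def classify(codon_before: str, codon_after: str) -> str:
    """Classify a codon change as synonymous or non-synonymous."""
    aa_before = CODON_TABLE[codon_before]
    aa_after = CODON_TABLE[codon_after]
    return "synonymous" if aa_before == aa_after else "non-synonymous"


def deletion_length_nt(deleted_aa: int) -> int:
    """Each amino acid is encoded by one codon of 3 nucleotides."""
    return deleted_aa * 3


print(classify("GAA", "GAG"))     # E stays E
print(classify("GAA", "GCA"))     # E becomes A
print(deletion_length_nt(7 + 3))  # the variant with a 7 AA and a 3 AA deletion
```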

The top row in the figure is the first virus genome from the patient that was sequenced (day 18). It is identical to the reference sequence (RefSeq) of the Spike protein, except for one substitution, N869M, in position 869: M has been substituted by N. So the patient started with a standard virus. But then things changed rapidly. A total of 60 non-synonymous mutations and 15 synonymous mutations were found. That is four times as many non-synonymous as synonymous mutations. For comparison: the new Brazilian variant has 24 mutations!

Here is a list of 8 mutations found on the last sampling day (day 152):

del 141-144 [9]

Q183H

T478K

E484A [6]

F486I

S494P

N501Y [5],[7]

I870V
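
For readers who want to work with this list, here is a minimal sketch that turns the labels into structured records. It assumes the usual notation: a substitution is written as original amino acid, position, new amino acid (for example N501Y), and a deletion as 'del' followed by a range. The names `Mutation` and `parse_mutation` are my own and do not come from the publication.

```python
import re
from typing import NamedTuple, Optional


class Mutation(NamedTuple):
    kind: str            # "substitution" or "deletion"
    start: int           # first affected position in Spike
    end: int             # last affected position (equals start for substitutions)
    ref: Optional[str]   # original amino acid, None for deletions
    alt: Optional[str]   # new amino acid, None for deletions


SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z])$")
DELETION = re.compile(r"^del\s*(\d+)-(\d+)$")


def parse_mutation(label: str) -> Mutation:
    """Parse labels such as 'N501Y' or 'del 141-144'."""
    label = label.strip()
    m = SUBSTITUTION.match(label)
    if m:
        pos = int(m.group(2))
        return Mutation("substitution", pos, pos, m.group(1), m.group(3))
    m = DELETION.match(label)
    if m:
        return Mutation("deletion", int(m.group(1)), int(m.group(2)), None, None)
    raise ValueError(f"unrecognised mutation label: {label!r}")


# The 8 mutations listed above for day 152
day_152 = ["del 141-144", "Q183H", "T478K", "E484A",
           "F486I", "S494P", "N501Y", "I870V"]
mutations = [parse_mutation(label) for label in day_152]

for mut in mutations:
    print(mut)
```

Once parsed, the records can be sorted, counted, or compared with the mutation lists of other variants.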

The first mutations appear on day 75 and all virus variants thereafter accumulate mutations. We see a familiar mutation: N501Y, which occurs in the British B.1.1.7 variant [5] and the South African variant [7].

The table in Fig 1a-b shows only one sequence per day. However, more sequences must have existed at the same time.

Fig 1c. On Day 152 seven lost amino acids reappear.
 

For example, in Fig 1c we see a large deletion of 7 amino acids (positions 12-18) in the Spike protein on Day 146. Six days later, on Day 152, the deleted amino acids reappear. Any mutated amino acid can mutate back to the wildtype, but deleted amino acids cannot reappear. So they must have been inherited from other variants without the deletion, variants that are not shown in the table. The table is therefore incomplete: different sub-populations must exist side by side and evolve independently. See the phylogenetic tree:

Phylogenetic tree of virus populations within one patient [1].

This small tree shows evolutionary diversification. As time progresses more lineages appear. At time T0 there are two lineages and at T3 there are 3 different lineages. Unfortunately, there are not enough data to establish which variants went extinct and how many variants survived until the last day. But, then, this is a patient, not an experiment.

 

Conclusion

The pharmacological cocktail in this patient with a weakened immune system seems to have caused accelerated evolution of SARS-CoV-2. The mutations are disproportionately located in the Spike protein and there are more non-synonymous than synonymous mutations. In other words: micro-evolution with positive selection for reproductive success of the virus and adaptation to the internal environment of the patient. Maybe the discontinuous therapy gave the virus time to recover and thereby contributed to the accelerated evolution.


How to download a sequence from NCBI

 
To download the Spike protein of SARS-CoV-2 (surface glycoprotein):
  • click this link (SARS-CoV-2 Spike protein Reference is selected)
  • select YP_009724390 (this is the reference sequence; 1273 AA)
  • press DOWNLOAD button 
  • step 1 'Protein' is pre-selected
  • Step 2: Next
  • Download Selected Records 
  • Step 3: 'Use default'. Next
  • Download 
  • the downloaded file is: sequences.fasta
  • this file is a plain text file and can be read in a simple text editor
  • the file is formatted in lines of 60 characters.
  • optional: remove first line and save as YP_009724390.txt
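
Instead of editing the file by hand, the downloaded file can also be read with a short script. This is a minimal sketch in plain Python (no extra libraries), assuming the file contains a single record and is saved as sequences.fasta in the current folder. The function names `parse_fasta` and `rewrap` are my own.

```python
from pathlib import Path


def parse_fasta(text: str) -> str:
    """Return the sequence of a single-record FASTA text as one string."""
    lines = text.splitlines()
    # Skip the header line (starts with '>') and join the sequence lines
    return "".join(ln.strip() for ln in lines if ln.strip() and not ln.startswith(">"))


def rewrap(sequence: str, width: int = 100) -> str:
    """Re-format a sequence into lines of `width` characters."""
    chunks = [sequence[i:i + width] for i in range(0, len(sequence), width)]
    return "\n".join(chunks)


if __name__ == "__main__":
    fasta_file = Path("sequences.fasta")  # file name from the steps above
    if fasta_file.exists():
        spike = parse_fasta(fasta_file.read_text())
        print(len(spike), "amino acids")  # the reference Spike should have 1273 AA
        print(rewrap(spike, 100))
    else:
        print("sequences.fasta not found; download it first")
```

Checking the length is a quick sanity test that the right record was downloaded, and `rewrap` produces the lines of 100 characters that are convenient for marking positions.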

 
Overview of all mutations (red) in Spike protein.
Yellow highlight: sequences shown in the publication.
The Reference sequence (1273 AA) is used.
Formatted in lines of 100 characters.


I used the downloaded Reference sequence YP_009724390 of the Spike protein to show the mutations in context and to compare the sequences of the patient with the official reference sequence. Deletions are also shown in red. The sequences shown in the publication are highlighted in yellow. This is an example of how one can use downloaded sequences.
 
Alternative method [10]: 
 
Click on YP_009724390 and scroll down to find the complete amino acid (AA) sequence of the Reference Spike protein (1273 AA), shown in rows of 60 AA, in blocks of 10:
https://www.ncbi.nlm.nih.gov/protein/YP_009724390
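
The sequence on that page is shown in lowercase, with position numbers and spaces between the blocks. If you copy the rows from your browser, a small sketch like the following turns them into one plain sequence. It assumes you paste only the sequence rows (not the rest of the page); the function name and the two example rows are hypothetical toy data, not the real Spike sequence.

```python
import re


def genpept_rows_to_sequence(text: str) -> str:
    """Remove position numbers and spaces, return upper-case one-letter codes."""
    letters_only = re.sub(r"[^A-Za-z]", "", text)
    return letters_only.upper()


# Hypothetical toy rows in the same layout (60 per row, blocks of 10)
copied_rows = """
        1 acdefghikl mnpqrstvwy acdefghikl mnpqrstvwy acdefghikl mnpqrstvwy
       61 acdef
"""

sequence = genpept_rows_to_sequence(copied_rows)
print(len(sequence))
print(sequence)
```

For the real Spike protein the length of the result should again be 1273.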



References/Notes

 
- hypoxemia: an abnormally low level of oxygen in the blood
- A 7 amino acid (AA) deletion corresponds to a 21 nucleotide (nt) deletion.
- A synonymous mutation is a mutation in DNA/RNA that doesn't change the amino acid. A non-synonymous mutation does change the amino acid.

- Remdesivir is a broad-spectrum anti-viral molecule (Wikipedia)

- NCBI = The National Center for Biotechnology Information

- RT-PCR = Reverse-Transcriptase–Polymerase-Chain-Reaction.
 
  1. People with weakened immune systems are at higher risk of getting severely sick from SARS-CoV-2, the virus that causes COVID-19
  2. 32 authors: 'Persistence and Evolution of SARS-CoV-2 in an Immunocompromised Host' in The New England Journal of Medicine, November 11, 2020
  3. Experimental treatment for COVID-19 (Regeneron Pharmaceuticals)
  4. Positive selection has been reported by others:  Positive selection within the genomes of SARS-CoV-2 and other Coronaviruses independent of impact on protein function
  5. see a previous blog: Finding the highly transmissible British SARS-CoV-2 B.1.1.7 variant in the USA
  6. "The most consequential mutations, at a location called E484, caused a steep drop in the potency of some individuals’ antibodies. Coronavirus variants identified in South Africa and Brazil carry a mutation at the same spot." Nature.
  7. Emergence and rapid spread of a new severe acute respiratory syndrome-related coronavirus 2 (SARS-CoV-2) lineage with multiple spike mutations in South Africa. Posted December 22, 2020.
  8. The same effect is predicted to occur when the second covid-19 vaccination is delayed: Could delaying a second vaccine dose lead to more dangerous coronavirus strains?  Jan 14, 2021 (thank you Harry)
  9. correction: del142-144 must be del141-144 (4AA) on the last day. On previous days it was del142-144 (3AA).
  10. Update 24 Jan 2021: I added an alternative method to get the complete amino acid sequence without downloading it to your computer.

 

Thanks to Marleen for pointing me to the NRC article that referred to publication [2].

7 comments:

  1. Gert, material for a renewed discussion?

    https://nymag.com/intelligencer/article/coronavirus-lab-escape-theory.html

    https://www.wired.com/story/if-covid-19-did-start-with-a-lab-leak-would-we-ever-know/

  2. Gert,
    stop the presses!
    Must read: DOI: 10.1126/science.abd7331
    It was in the air - see for example also AlphaFold
    And then this, so many years after Murray Eden....!

  3. Harry, my oh my, that article The Lab-Leak Hypothesis reads like a book, or at least a long chapter with many sections!
    Look, it is a very exciting story, but 99% of it is speculation and circumstantial evidence; the only fact would be the RRAR sequence. It is indeed present in SARS-CoV-2, and I have to find out whether it is absent from SARS-CoV-1. An insertion of 4 amino acids does not seem impossible to me; there are many SARS-CoV-2 variants that are (slightly) longer than the standard. I need to look into that.

    I had already seen the article Learning the language of viral evolution and escape come by: I am confused! What is this? What am I supposed to do with it? I cannot do anything with it. I will leave it for now!

  4. Gert,

    yes, that does not surprise me at all. This is a different kettle of fish, but don't worry, you will surely be hearing more about this.
    We'll talk again!

  5. just read that paper first, if need be only the beginning:

    "The ability for viruses to mutate and evade the human immune system and cause infection, called viral escape, remains an obstacle to antiviral and vaccine development". ... "With this approach, language models of influenza hemagglutinin, HIV-1 envelope glycoprotein (HIV Env), and severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) Spike viral proteins can accurately predict structural escape patterns using sequence data alone". ...

    DOI: 10.1126/science.abd7331

  6. Harry, I have seen too many brilliant theories come by; my website is full of them.
    "machine learning algorithms ...": that is fine, but they have to be trained, as the authors also discuss, and for that you have to supply correct examples of viral escape. So you need to know those in advance. If vaccine developers are impressed by their method and results and apply them in new vaccines, then I will study their method. OK?

