Corona Update 20 January 2021
People with weakened immune systems are at higher risk of getting severely
sick from SARS-CoV-2, the virus that causes COVID-19 [1]. They may also remain
infectious for a longer period of time than others with COVID-19. Recently, a group of 32 researchers published the results of viral genome sequencing of an immunocompromised patient with COVID-19 [2].
To my knowledge, this study is the first in which enough
whole-genome viral sequences were obtained from the same patient to trace the evolution of SARS-CoV-2 in one individual over a long period during which anti-viral therapy was
given, including two remdesivir treatments and an antibody cocktail against the
SARS-CoV-2 spike protein (Regeneron) [3], among others.
In the study, 9 whole-genome viral sequences, 19 RT-PCR tests and a number of
quantitative viral load assays of SARS-CoV-2 were performed over a period of 154
days (5 months). After many clinical complications, such as hypoxemia, and repeated remdesivir treatments, the patient died on day 154 from shock and respiratory
failure.
In this blog I will focus on the whole-genome sequences. The authors' phylogenetic analysis of the viral sequences was consistent with
persistent infection and accelerated evolution of the virus.
The authors conclude: "Although most immunocompromised persons effectively
clear SARS-CoV-2 infection, this case highlights the potential for persistent
infection and accelerated viral evolution
associated with an immunocompromised state".
This situation may be comparable to the continued use of antibiotics in
humans or
animals to combat bacterial infections. When the selection pressure is
not too strong, not every microbe is killed. A few lucky mutants
escape and multiply. When the 'right' mutations occur, antibiotic resistance soon develops [8].
The mutations...
For those who insist on knowing the exact mutations, here they are:
Fig 1a. Partial Spike sequences. Continued below.
Fig 1b. Partial Spike sequences (continued). Amino acids in colored letters. Dots: identical AA. Numbers above are positions in the Spike protein. For the complete Spike sequence see below: How to ...
Generally, mutations originate in random places in the genome. But the authors found that amino
acid changes were predominantly in the Spike
gene (S) and its receptor-binding domain (RBD), which make
up 13% and 2% of the viral genome, respectively, but harboured 57% and 38% of
the observed changes. That means mutations in the Spike protein were far more frequent than expected by chance alone. Selection must have taken place: negative (removing mutated variants) and positive (amplifying mutated variants) [4].
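The size of that excess can be made explicit with a little arithmetic. The sketch below is my own, not the authors' analysis: it divides the share of amino acid changes found in a region by the share of the genome that region occupies, using only the four percentages quoted above. It assumes that "expected by chance" means changes spread evenly over the genome, and the function and variable names are mine.

```python
# Fold enrichment = share of observed changes / share of the genome.
# Assumption: by chance alone, changes would be spread evenly over the genome.
# The percentages are the ones quoted in the text; all names are my own.

def fold_enrichment(share_of_changes: float, share_of_genome: float) -> float:
    """How many times more changes a region carries than its size predicts."""
    return share_of_changes / share_of_genome

# region name -> (% of observed changes, % of viral genome)
regions = {
    "Spike gene (S)": (57, 13),
    "Receptor-binding domain (RBD)": (38, 2),
}

enrichment = {
    name: fold_enrichment(changes, genome)
    for name, (changes, genome) in regions.items()
}

for name, value in enrichment.items():
    print(f"{name}: {value:.1f} times more changes than its size predicts")
```

With these numbers the Spike gene carries about 4.4 times, and the RBD 19 times, as many changes as its length alone would predict.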
Several synonymous as well as non-synonymous mutations in the virus were detected, as well
as contiguous deletions of 3, 4 and 7 amino acids (AA). One variant has a 7 AA and a 3 AA deletion (see Fig 1a), so its sequence is 10 AA, or 30 nt, shorter. Apparently, this is not very harmful to the virus. Remarkably, all mutations in
the Spike protein are non-synonymous (see figures), which means that amino acids are replaced. In synonymous mutations the amino acid does not change, so they are of little consequence.
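For readers who want to see the difference between the two kinds of mutation in practice, here is a minimal sketch that translates a codon before and after a change with the standard genetic code and compares the amino acids. The two example codon changes are illustrative only; they are not taken from the patient's sequences, and the function name is my own.

```python
from itertools import product

# Standard genetic code in the conventional compact order (bases T, C, A, G).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def classify(codon_before: str, codon_after: str) -> str:
    """Return 'synonymous' if both codons encode the same amino acid."""
    # Accept RNA spelling (U) as well as DNA spelling (T).
    before = CODON_TABLE[codon_before.upper().replace("U", "T")]
    after = CODON_TABLE[codon_after.upper().replace("U", "T")]
    return "synonymous" if before == after else "non-synonymous"

# Illustrative codons, not copied from the publication:
print(classify("AAT", "AAC"))  # both codons encode N (asparagine)
print(classify("AAT", "TAT"))  # N (asparagine) becomes Y (tyrosine)
```

A change in the third base of a codon often leaves the amino acid untouched, which is why synonymous mutations are usually of little consequence for the protein.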
The top row in the figure is the first virus genome in the patient that has been sequenced (day 18). It is identical to the Reference sequence (RefSeq) of the Spike protein,
except for one substitution at position 869, M869N: M has been replaced by N. So, the patient started with a standard virus. But then things changed rapidly. A total of 60 non-synonymous mutations and 15 synonymous mutations were found. That is 4 times as many non-synonymous as synonymous mutations. For comparison: the new Brazilian variant has 24 mutations!
Here is a list of 8 mutations found on the last sampling day (day 152):
del 141-144 [9]
Q183H
T478K
E484A [6]
F486I
S494P
N501Y [5],[7]
I870V
The first mutations appear on day 75, and all virus variants thereafter accumulate mutations. We see a familiar mutation: N501Y, which occurs in the British B.1.1.7 [5] and the South African variant [7].
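The labels in the list follow a simple pattern: original amino acid, position, new amino acid (for example N501Y), or "del" plus a range of positions. As a small sketch of how one could read such labels with a program, the code below turns the day 152 list into structured records. The record layout and the names are my own choice, not a standard library or the authors' format.

```python
import re
from typing import NamedTuple, Optional

class Mutation(NamedTuple):
    kind: str               # "substitution" or "deletion"
    start: int              # first affected position in the Spike protein
    end: int                # last affected position (same as start for substitutions)
    ref_aa: Optional[str]   # original amino acid, None for deletions
    new_aa: Optional[str]   # replacing amino acid, None for deletions

SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z])$")
DELETION = re.compile(r"^del\s*(\d+)-(\d+)$")

def parse_mutation(label: str) -> Mutation:
    """Parse labels such as 'N501Y' or 'del 141-144'."""
    label = label.strip()
    match = SUBSTITUTION.match(label)
    if match:
        position = int(match.group(2))
        return Mutation("substitution", position, position, match.group(1), match.group(3))
    match = DELETION.match(label)
    if match:
        return Mutation("deletion", int(match.group(1)), int(match.group(2)), None, None)
    raise ValueError(f"unrecognised mutation label: {label!r}")

# The eight mutations listed for day 152 (reference numbers left out).
DAY_152 = ["del 141-144", "Q183H", "T478K", "E484A", "F486I", "S494P", "N501Y", "I870V"]

parsed = [parse_mutation(label) for label in DAY_152]
deleted_residues = sum(m.end - m.start + 1 for m in parsed if m.kind == "deletion")

for m in parsed:
    print(m)
print("deleted amino acids:", deleted_residues)
```

Once the labels are in this form it is easy to sort them by position or to count how many amino acids a deletion removes.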
The table in Fig 1a and 1b shows only one sequence per day. However, more sequence variants must exist at the same time.
Fig 1c. On Day 152 seven lost amino acids reappear.
For example, in Fig 1c we see a large deletion of 7 amino acids (positions 12-18) in the Spike protein on Day 146. Six days later, on Day 152, the deleted amino acids reappear. Any mutated amino acid can mutate back to the wildtype, but deleted amino acids cannot reappear. So, they must be inherited from other variants without the deletion. But these variants are not shown in the table, so the table is incomplete. Different sub-populations must therefore exist side by side and evolve independently. See the phylogenetic tree:
Phylogenetic tree of virus populations within one patient [1].
This small tree shows evolutionary diversification. As time progresses more lineages appear. At time T0 there are two lineages and at T3 there are 3 different lineages. Unfortunately, there are not enough data to establish which variants went extinct and how many variants survived until the last day. But, then, this is a patient, not an experiment.
Conclusion
The pharmacological cocktail in this patient with a weakened immune system seems to have caused accelerated evolution of SARS-CoV-2. The mutations are concentrated in the Spike protein far beyond what chance would predict, and there are more non-synonymous than synonymous mutations. In other words: micro-evolution with positive selection for reproductive success of the virus and adaptation to the internal environment of the patient. Maybe the discontinuous therapy has given the virus time to recover and thereby contributed to the accelerated evolution.
How to download a sequence from NCBI
To download the Spike protein of SARS-CoV-2 (Surface glycoprotein):
- click this link (SARS-CoV-2 Spike protein Reference is selected)
- select the YP_009724390 (this is the reference sequence; 1273 AA)
- press DOWNLOAD button
- step 1 'Protein' is pre-selected
- Step 2: Next
- Download Selected Records
- Step 3: 'Use default'. Next
- Download
- the downloaded file is: sequences.fasta
- this file is a plain text file and can be read in a simple text editor
- the file is formatted in lines of 60 characters.
- optional: remove first line and save as YP_009724390.txt
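The downloaded file can also be read with a few lines of code instead of a text editor. This is a minimal sketch under my own assumptions: the file is called sequences.fasta as in the steps above, it sits in the current folder, and only the first record is wanted. The function names are mine. Unlike the optional manual step, it saves the sequence as one long line without the 60-character line breaks.

```python
from pathlib import Path

def parse_fasta(lines):
    """Return (header, sequence) for the first record in FASTA-formatted lines."""
    header = None
    chunks = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                break              # a second record starts here; stop reading
            header = line[1:]      # drop the leading '>'
        else:
            chunks.append(line)    # sequence lines are joined without breaks
    return header, "".join(chunks)

def read_fasta(path):
    """Read the first record from a FASTA file on disk."""
    with open(path, encoding="utf-8") as handle:
        return parse_fasta(handle)

fasta_file = Path("sequences.fasta")   # file name from the download steps
if fasta_file.exists():
    header, spike = read_fasta(fasta_file)
    print(header)
    print(len(spike), "amino acids")   # the reference is described as 1273 AA
    # Same idea as the optional step: save the sequence without its first line.
    Path("YP_009724390.txt").write_text(spike + "\n", encoding="utf-8")
```

If the printed length is not 1273, the file probably contains a different record than the reference sequence.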
Overview of all mutations (red) in Spike protein. Yellow highlight: sequences shown in the publication. Used is the Reference sequence (1273 AA). Formatted in lines of 100 characters.
I used the downloaded Reference sequence YP_009724390 of the Spike protein to show the mutations in context and to compare the sequences of the patient with the official reference sequence. Deletions are also shown in red. Highlighted in yellow are the sequences shown in the publication. This is an example of how one can use downloaded sequences.
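Such a comparison can also be sketched in code. The example below mimics the convention of the figures, where a dot means "identical to the reference", and lists the differences in the usual notation. It assumes the two sequences are already aligned to the same length, with "-" marking a deleted amino acid. The short fragment and its numbering are illustrative only and should be replaced by the downloaded reference; the function names are my own.

```python
def dot_row(reference: str, variant: str) -> str:
    """Show the variant with dots wherever it is identical to the reference."""
    if len(reference) != len(variant):
        raise ValueError("sequences must be aligned to the same length")
    return "".join("." if r == v else v for r, v in zip(reference, variant))

def list_changes(reference: str, variant: str, first_position: int = 1) -> list:
    """List differences as labels such as 'N501Y' or 'del 500'."""
    if len(reference) != len(variant):
        raise ValueError("sequences must be aligned to the same length")
    changes = []
    for offset, (r, v) in enumerate(zip(reference, variant)):
        if r == v:
            continue
        position = first_position + offset
        changes.append(f"del {position}" if v == "-" else f"{r}{position}{v}")
    return changes

# Illustrative 8-residue fragment, numbered from position 498 (an assumption
# for this demo; use the downloaded reference sequence for real work).
reference = "QPTNGVGY"
variant = "QPTYGVGY"
with_deletion = "QP-YGVGY"

print(reference)
print(dot_row(reference, variant))
print(list_changes(reference, variant, first_position=498))
print(list_changes(reference, with_deletion, first_position=498))
```

Each deleted position is reported separately here; merging neighbouring positions into a range such as "del 141-144" would be a small extension.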
Alternative method [10]:
Click on YP_009724390 and scroll down to find the amino acid (AA) sequence of the Reference Spike protein, in rows of 60 in blocks of 10 per row.
The complete sequence of the Spike protein (1273 AA): https://www.ncbi.nlm.nih.gov/protein/YP_009724390
References/Notes
- hypoxemia: an abnormally low level of oxygen in the blood
- A 7 amino acid (AA) deletion corresponds to a 21 nucleotide (nt) deletion.
- A synonymous mutation is a mutation in DNA/RNA that doesn't change the amino acid. A non-synonymous mutation does change the amino acid.
- Remdesivir is a broad-spectrum anti-viral molecule (wikipedia)
- NCBI = The National Center for Biotechnology Information
- RT-PCR = Reverse-Transcriptase–Polymerase-Chain-Reaction.