25 January 2021

Accidental discovery of a second immuno-compromised patient with accelerated viral evolution



 Corona Update 25 January 2021

I discovered a second immuno-compromised patient with accelerated evolution by accident. I wanted to know whether the spontaneous mutations that occurred in the first immuno-compromised patient could also be found in the general human population, so I searched for Spike protein mutations in the NCBI SARS-CoV-2 sequence database. I selected all sequences of 1269 AA (4 amino acids shorter than the standard 1273 AA Spike protein). I then added sequences of the standard length of 1273 AA until I hit the maximum of 500 sequences allowed in one search. The result: 27 sequences of length 1269, of which 16 showed the deletion 141-144 (see previous blog). Unexpectedly, two of them showed me the way to a second immuno-compromised patient (Fig. 1).
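The same length-based selection can be repeated offline once sequences have been downloaded. The following is only a minimal sketch: the function names and the toy sequences are made up for illustration and are not part of any NCBI tool; only the lengths 1273 and 1269 come from the text.

```python
from collections import Counter
from typing import Dict, List

REFERENCE_LENGTH = 1273  # standard Spike protein length in AA (from the text)


def length_histogram(seqs: Dict[str, str]) -> Counter:
    """Count how many sequences there are of each length."""
    return Counter(len(s) for s in seqs.values())


def shorter_by(seqs: Dict[str, str], n_missing: int) -> List[str]:
    """Return the names of sequences that are n_missing AA shorter than the reference."""
    target = REFERENCE_LENGTH - n_missing
    return sorted(name for name, s in seqs.items() if len(s) == target)


# Made-up toy data: the names and the all-'A' sequences are placeholders.
toy = {
    "toy_full_1": "A" * 1273,
    "toy_full_2": "A" * 1273,
    "toy_short": "A" * 1269,
}
hist = length_histogram(toy)
short = shorter_by(toy, 4)
print(hist[1273], hist[1269], short)
```

Length alone does not tell where the missing amino acids are: of the 27 sequences of length 1269, only 16 had the 141-144 deletion, so each candidate still has to be inspected.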

Fig 1. Two new sequences with the 141-144 deletion: QNQ32127; QNQ32151

Fig 2. The publication that describes the virus sequence (source).

Usually the sequences in the database are a 'Direct Submission': they are not published. But the source of these new sequences (Fig. 1) revealed that they were part of a 'Case Study: Prolonged infectious SARS-CoV-2 shedding from an asymptomatic immunocompromised cancer patient' [1].

That's how I discovered my second immuno-compromised patient.

Fig. 3. Long-term SARS-CoV-2 shedding [1] with within-patient variation

The patient is an immuno-compromised individual who persistently tested positive for SARS-CoV-2. Remarkably, the individual was asymptomatic! The virus mutated and created genetic diversity. This cannot be explained by contamination or secondary infection, because the viral genomes from this patient cluster as a mono-phyletic clade*).


This strongly suggests evolution

The authors state: "Throughout the course of infection, there was marked within-host genomic evolution of SARS-CoV-2. Deep sequencing revealed a continuously changing virus population structure with turnover in the relative frequency of the observed genotypes over the course of infection. (...) Potential factors contributing to the observed within-host evolution is prolonged infection and the compromised immune status of the host, possibly resulting in a different set of selective pressures compared with an immune-competent host. These differential selective pressures may have allowed a larger genetic diversity with continuous turnover of dominant viral species throughout the course of infection." [1],[2].

The convalescent plasma therapy was not successful, but it can be expected to have exerted selective pressure on the virus.

Apart from demonstrating evolution, there is an important public health lesson: "an estimated 3 million people in the United States have some form of immuno-compromising condition, including individuals with HIV infection".


The mutations

Fig. 4. Two deletions. red arrow: 21 nt. yellow: 12 nt. (click to enlarge)
first row (black) shows nt, second row shows AA (colored)

Two in-frame*) deletions were observed in the Spike glycoprotein coding region:

1) A 21 nt in-frame deletion (residues 21,975–21,995) was found in the N-terminal domain (NTD) of S1, leading to a 7-amino-acid deletion (amino acids [AA] 139–145) 

2) A 12 nt deletion (residues 21,982–21,993) was detected in the day 70 isolate, leading to a 4-AA deletion (AA 141–144) in the NTD.
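The arithmetic behind these two deletions can be checked in a few lines. This is a minimal sketch; the function name `deletion_summary` is my own, and the coordinates are the inclusive positions quoted above.

```python
def deletion_summary(start_nt: int, end_nt: int) -> dict:
    """Summarise a deletion given inclusive start and end positions in nt."""
    length_nt = end_nt - start_nt + 1
    in_frame = length_nt % 3 == 0  # a multiple of 3 keeps the reading frame
    return {
        "length_nt": length_nt,
        "in_frame": in_frame,
        # net number of amino acids lost; undefined for a frameshift
        "aa_lost": length_nt // 3 if in_frame else None,
    }


first = deletion_summary(21975, 21995)
second = deletion_summary(21982, 21993)
print(first)
print(second)
```

Both lengths are multiples of three, which is why the rest of the Spike protein is still read in the correct frame.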

As can be seen from Fig. 4, the two deletion strains disappear later.

Then there is the famous N501Y substitution, which is found in the British and the South African variants (wikipedia). Isn't it remarkable that this mutation arose independently and spontaneously in a single individual and, at the same time, in the world population? It could be a coincidence, but it could also be that the mutation confers a competitive advantage.

*) Abbreviations and terms

nt = nucleotides or bases. Three bases code for 1 AA.

AA = amino acids. 

mono-phyletic clade = an evolutionary group of organisms with one common ancestor.

in-frame deletions = deletions in DNA/RNA that leave the codons (triplets) intact: for example a 3 or 6 base deletion that removes only whole codons. A 1 or 2 base deletion is out-of-frame: it shifts the reading frame, so every codon after the deletion is read incorrectly.

viral shedding = the release of virus particles (in the air) (wikipedia)

The authors did not state it explicitly, but these results could, just like those of the patient in my previous blog, be compared with antibiotic resistance arising after an unsuccessful antibiotic treatment.


  1. Case Study: Prolonged infectious SARS-CoV-2 shedding from an asymptomatic immunocompromised cancer patient. 23 Dec 2020
  2. There was no selection effect detected in in vitro experiments of mutated virus strains.

20 January 2021

Accelerated evolution of SARS-CoV-2 in a person with immunodeficiency (with hands-on exercise)




Corona Update 20 January 2021

People with weakened immune systems are at higher risk of getting severely sick from SARS-CoV-2, the virus that causes COVID-19 [1]. They may also remain infectious for a longer period of time than others with COVID-19. Recently, a group of 32 researchers published the results of viral genome sequencing of an immuno-compromised patient with COVID-19 [2].

To my knowledge, this is the first study in which enough whole-genome viral sequences were obtained from the same patient to establish the evolution of SARS-CoV-2 within one individual over a longer period of time. During that period anti-viral therapy was given, including two remdesivir treatments and an antibody cocktail against the SARS-CoV-2 spike protein (Regeneron) [3], among other treatments.

In the study, 9 whole-genome viral sequencing runs, 19 RT-PCR tests and a number of quantitative viral load assays of SARS-CoV-2 were done over a period of 154 days (5 months). After many clinical complications, such as hypoxemia, and several treatments including remdesivir, the patient died on day 154 from shock and respiratory failure.

In this blog I will focus on the whole genome sequences. The authors did a phylogenetic analysis of the viral sequences which was consistent with persistent infection and accelerated evolution of the virus. 

The authors conclude: "Although most immunocompromised persons effectively clear SARS-CoV-2 infection, this case highlights the potential for persistent infection and accelerated viral evolution associated with an immunocompromised state".

This situation may be comparable to the continued use of antibiotics in humans or animals to combat bacterial infections. When the selection pressure is not too strong, not every microbe is killed. A few lucky mutants escape and multiply. When the 'right' mutations occur, antibiotic resistance soon develops [8].


The mutations...

For those who insist on knowing the exact mutations, here they are:

Fig 1a. Partial Spike sequences. continued below.

Fig 1b. Partial Spike sequences (continued). Click to enlarge.
Amino Acids in colored letters. Dots: identical AA.
Numbers above are positions in Spike protein
For complete Spike sequence see below: How to ...


Generally, mutations originate in random places in the genome. But the authors found that amino acid changes were predominantly in the Spike gene (S) and the receptor-binding domain (RBD), which make up 13% and 2% of the viral genome, respectively, but harboured 57% and 38% of the observed changes. That means amino acid changes in the Spike protein were far more frequent than expected by chance alone. Selection must have taken place: negative (removing mutated variants) and positive (amplifying mutated variants) [4].
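How much "more than expected" is this? A rough way to express it is the fold enrichment: the share of observed changes divided by the share of the genome. The sketch below uses only the four percentages quoted above. The function name is my own, and this simple ratio is not the statistical test used by the authors.

```python
def fold_enrichment(share_of_changes: float, share_of_genome: float) -> float:
    """Ratio of observed share of changes to the share expected from region size."""
    if share_of_genome <= 0:
        raise ValueError("share_of_genome must be positive")
    return share_of_changes / share_of_genome


# Percentages from the text, written as fractions.
spike = fold_enrichment(0.57, 0.13)  # Spike gene: 13% of genome, 57% of changes
rbd = fold_enrichment(0.38, 0.02)    # RBD: 2% of genome, 38% of changes
print(f"Spike: {spike:.1f}x, RBD: {rbd:.1f}x")
```

If changes were spread evenly over the genome, both ratios would be close to 1.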

Several synonymous as well as non-synonymous mutations in the virus were detected, as well as contiguous deletions of 3, 4 and 7 amino acids (AA). One variant has both a 7 AA and a 3 AA deletion (see Fig 1a), so its sequence is 10 AA or 30 nt shorter. Apparently, this is not very harmful to the virus. Remarkably, all mutations in the Spike protein are non-synonymous (see figures), which means that amino acids are replaced. In synonymous mutations the amino acid does not change, so they are of little consequence.

The top row in the figure is the first virus genome from the patient that was sequenced (day 18). It is identical to the Reference sequence (RefSeq) of the Spike protein, except for one substitution N869M in position 869: M has been substituted by N. So, the patient started with a standard virus. But then things changed rapidly. A total of 60 non-synonymous mutations and 15 synonymous mutations were found. That is 4 times as many non-synonymous as synonymous mutations. For comparison: the new Brazilian variant has 24 mutations!

Here is a list of 8 mutations found on the last sampling day (day 152):

del 141-144 [9]



E484A [6]



N501Y [5],[7]


The first mutations appear on day 75 and all virus variants thereafter accumulate mutations. We see a familiar mutation: N501Y, which occurs in the British B.1.1.7 [5] and the South African variant [7].

The table in Fig 1a and 1b shows only one sequence per day. However, more sequences must exist at the same time.

Fig 1c. On Day 152 seven lost amino acids reappear.

For example, in Fig. 1c we see a large deletion of 7 amino acids (position 12-18) in the Spike protein on Day 146. Six days later, on Day 152, the deleted amino acids reappear. Any mutated amino acid can mutate back to the wildtype, but a run of deleted amino acids cannot simply reappear. So, they must have been inherited from other variants without the deletion, variants that are not shown in the table. The table is incomplete. So, different sub-populations must exist side by side and evolve independently. See the phylogenetic tree:

Phylogenetic tree of virus populations within one patient [1].

This small tree shows evolutionary diversification. As time progresses more lineages appear. At time T0 there are two lineages and at T3 there are 3 different lineages. Unfortunately, there are not enough data to establish which variants went extinct and how many variants survived until the last day. But, then, this is a patient, not an experiment.



The pharmacological cocktail in this patient with a weakened immune system seems to have caused accelerated evolution of SARS-CoV-2. The mutations are disproportionately located in the Spike protein and there are more non-synonymous than synonymous mutations. In other words: micro-evolution with positive selection for reproductive success of the virus and adaptation to the internal environment of the patient. Maybe the discontinuous therapy gave the virus time to recover and thereby contributed to the accelerated evolution.

How to download a sequence from NCBI

To download the Spike protein of SARS-CoV-2 (Surface glycoprotein):
  • click this link (SARS-CoV-2 Spike protein Reference is selected)
  • select YP_009724390 (this is the reference sequence; 1273 AA)
  • press the DOWNLOAD button
  • Step 1: 'Protein' is pre-selected. Next
  • Step 2: 'Download Selected Records'. Next
  • Step 3: 'Use default'. Download
  • the downloaded file is: sequences.fasta
  • this file is a plain text file and can be read in a simple text editor
  • the file is formatted in lines of 60 characters.
  • optional: remove first line and save as YP_009724390.txt
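Instead of removing the first line by hand, the downloaded file can be read with a few lines of Python. This is a minimal sketch assuming a FASTA file with a single record; the function name `read_fasta_sequence` and the toy file are made up for illustration. For the real download you would call `read_fasta_sequence("sequences.fasta")` and expect a length of 1273.

```python
import tempfile
from pathlib import Path


def read_fasta_sequence(path) -> str:
    """Return the sequence of a single-record FASTA file as one string."""
    lines = Path(path).read_text().splitlines()
    # Skip the header line (it starts with '>') and blank lines,
    # then join the 60-character lines into one long string.
    return "".join(
        line.strip() for line in lines if line.strip() and not line.startswith(">")
    )


# Demonstration with a made-up toy file, so the sketch runs without a download.
with tempfile.TemporaryDirectory() as tmp:
    toy_path = Path(tmp) / "toy.fasta"
    toy_path.write_text(">toy_record made-up header\nMFVFLVLLPL\nVSSQCVNLTT\n")
    toy_seq = read_fasta_sequence(toy_path)

print(len(toy_seq), toy_seq)
```

Checking the length right after reading is a cheap way to confirm that the right record was downloaded.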

Overview of all mutations (red) in Spike protein.
Yellow highlight: sequences shown in the publication.
Used is the Reference sequence (1273 AA).
Formatted in lines of 100 characters. Click to enlarge.

I used the downloaded Reference sequence YP_009724390 of the Spike protein to show the mutations in context and to compare the sequences of the patient with the official reference sequence. Deletions are also shown in red. The sequences shown in the publication are highlighted in yellow. This is an example of how one can use downloaded sequences.
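Such a comparison can also be automated. The sketch below is not the method I used for the figure; it assumes the two sequences have already been aligned to the same length, with '-' marking a deleted amino acid in the sample. The function name and the toy sequences are made up.

```python
from typing import List


def list_differences(reference: str, sample: str) -> List[str]:
    """List substitutions (e.g. 'G9Y') and deleted positions (e.g. 'del4')."""
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    diffs = []
    for pos, (ref_aa, aa) in enumerate(zip(reference, sample), start=1):
        if aa == ref_aa:
            continue  # identical position, nothing to report
        if aa == "-":
            diffs.append(f"del{pos}")
        else:
            diffs.append(f"{ref_aa}{pos}{aa}")
    return diffs


# Made-up toy sequences, not real Spike fragments.
toy_reference = "MKTNYLLVGA"
toy_sample = "MKT--LLVYA"
differences = list_differences(toy_reference, toy_sample)
print(differences)
```

Real patient sequences with deletions are shorter than the reference, so they need an alignment step before a position-by-position comparison like this one makes sense.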
Alternative method [10]: click on YP_009724390 and scroll down to find the amino acid (AA) sequence of the Reference Spike protein, shown in rows of 60 AA, each divided into blocks of 10:
the complete sequence of the Spike protein: 1273 AA
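If you copy the sequence from that page, the position numbers and spaces come along with it. A minimal sketch for cleaning up such a copied block; the function name and the short toy block are made up, and the layout (a number followed by blocks of 10 lowercase letters) is assumed from the description above.

```python
import re


def clean_copied_sequence(text: str) -> str:
    """Remove position numbers, spaces and line breaks; return uppercase letters."""
    return re.sub(r"[^A-Za-z]", "", text).upper()


# Made-up toy block in the numbered, 10-letter-block layout.
toy_block = """
        1 mfvflvllpl vssqcvnltt
       21 rtqlppaytn
"""
toy_clean = clean_copied_sequence(toy_block)
print(len(toy_clean), toy_clean)
```

For the complete Reference Spike protein the cleaned string should have 1273 letters.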


- hypoxemia: an abnormally low level of oxygen in the blood
- A 7 amino acid (AA) deletion corresponds to a 21 nucleotide (nt) deletion.
- Synonymous mutation is a mutation in DNA/RNA that doesn't change the amino acid. A non-synonymous mutation does change the amino acid.

- Remdesivir is a broad-spectrum anti-viral molecule (wikipedia)

- NCBI = The National Center for Biotechnology Information

- RT-PCR = Reverse-Transcriptase–Polymerase-Chain-Reaction.
  1. People with weakened immune systems are at higher risk of getting severely sick from SARS-CoV-2, the virus that causes covid-19
  2. 32 authors: 'Persistence and Evolution of SARS-CoV-2 in an Immunocompromised Host' in The New England Journal of Medicine, November 11, 2020
  3. Experimental treatment for COVID-19 (Regeneron Pharmaceuticals)
  4. Positive selection has been reported by others:  Positive selection within the genomes of SARS-CoV-2 and other Coronaviruses independent of impact on protein function
  5. see a previous blog: Finding the highly transmissible British SARS-CoV-2 B.1.1.7 variant in the USA
  6. "The most consequential mutations, at a location called E484, caused a steep drop in the potency of some individuals’ antibodies. Coronavirus variants identified in South Africa and Brazil carry a mutation at the same spot." Nature.
  7. Emergence and rapid spread of a new severe acute respiratory syndrome-related coronavirus 2 (SARS-CoV-2) lineage with multiple spike mutations in South Africa. Posted December 22, 2020.
  8. The same effect is predicted to occur when the second covid-19 vaccination is delayed: Could delaying a second vaccine dose lead to more dangerous coronavirus strains?  Jan 14, 2021 (thank you Harry)
  9. correction: del142-144 must be del141-144 (4AA) on the last day. On previous days it was del142-144 (3AA).
  10. Update 24 Jan 2021: I added an alternative method to get the complete amino acid sequence without downloading it to your computer.


Thanks to Marleen for pointing me to the NRC article that referred to publication [2].

16 January 2021

Has the virus ever been isolated, identified, and purified?




Corona Update 16 January 2021

"Has the virus ever been isolated, identified, and purified?"

Last night, in a YouTube livestream hosted by Vincent Everts and Charles Groenhuizen, a pharmacist, a pharmacologist, a virologist and a general practitioner were interviewed about the possible approval of Ivermectin as a treatment for COVID-19. Viewers could ask questions. I fell off my chair when I saw the following question (t=3790) come by:

"Has the virus ever been isolated, identified, and purified?"

The SARS-CoV-2 pandemic began at the end of December 2019 in Wuhan. On 15 January 2021, more than a year later, someone asks:

"Has the virus ever been isolated, identified, and purified?"

I assume that the questioner is acting in good faith, and that this is an honest question from someone who simply does not know, but who has perhaps been influenced by conspiracy theorists on social media. The virologist Ab Osterhaus was asked to answer the question. He did so briefly and clearly. The complete SARS-CoV-2 genome was published by Chinese researchers on 10 January 2020. That was very important for the rest of the world. On the basis of that publication, PCR tests to detect SARS-CoV-2 in people could be developed in the West, and the pharmaceutical industry could begin designing a vaccine against the virus. In the past, scientists did not have the complete genomes of viruses at their disposal. It is an enormous step forward that this is possible today.

Osterhaus did not mention that the RNA of more than 50,000 individual SARS-CoV-2 viruses is known. So the complete genome, down to the last letter, is known and published. This can be seen in public databases such as GISAID and NCBI. [1]

The virus is also not 100% new, as the name already suggests: it is version 2. Version 1, SARS-CoV-1, caused a smaller epidemic in 2003. Both belong to the family of coronaviruses. You can find all of this in Wikipedia. And it is not hard to find.

In a next blog I will show how anyone can download the RNA sequence of SARS-CoV-2 as a text file. Then you no longer need to ask the question "Has the SARS virus ever been isolated, identified, and purified?" Never again!


  1. "In the past year, more than 360,000 SARS-CoV-2 genomes have been sequenced and stored on GISAID, a non-profit online database for sharing viral genomes ... covering more than 140 countries." Nature, 15 Jan 2021.


LIVE 16:00 Ab Osterhaus and Adam Cohen on the approval of Ivermectin for COVID-19 treatment

One year since first genomes of SARS-CoV-2 released to the world 10 January 2020 00:41UTC

Global analysis of more than 50,000 SARS-CoV-2 genomes reveals epistasis between eight viral genes