Evolution blog: Physicist Charles S. Cockell: "DNA and its entourage". The Genetic Code is non-random.

04 May 2026

Physicist Charles S. Cockell: "DNA and its entourage". The Genetic Code is non-random.

Evolution is the transformation of species into slightly modified species. The origin of species is solved in principle. However, the origin of life is a fundamentally different problem. Darwin avoided discussing the origin of life. It is the hardest problem in biology and it has not yet been solved. Whereas the details of evolution can be described as a puzzle, the origin of life must be characterized as an intractable problem.

A crucial step (although not the first step) in the origin of life was the invention of DNA and the Genetic Code. DNA has become a defining property of life. But how DNA acquired its meaning is still a mystery. How could a relatively chemically inert non-enzymatic molecule (DNA) become useful, even indispensable, for life? DNA itself could not have been involved in the origin of life (Think about this...). There must have been something before DNA.

In the literature the Genetic Code is usually presented as a table in which all 64 possible triplets of the 4 bases A, T, C, G are 'associated' with 20 Amino Acids. These associations could be made in many different ways. Nature settled on one system of associations, called the Genetic Code table. In the book The Equations of Life: How Physics Shapes Evolution, Chapter 7 'The Code of Life', physicist Charles S. Cockell notes that the Genetic Code table is non-random. This is an important observation. The assignment of base triplets to Amino Acids (AAs) is non-random. First, when looking at the table it is immediately clear that there is a pattern. Secondly, there is redundancy: many Amino Acids are coded by more than one base triplet, and this redundancy also follows a pattern. All of this is very well known [4].
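To make the 64-to-20 mapping and its redundancy concrete, here is a minimal Python sketch. It builds the standard Genetic Code table from the commonly used compact notation (bases in the order T, C, A, G, one-letter amino acid codes, '*' for a stop codon) and counts how many triplets code for each Amino Acid. The variable names are my own choice for illustration and do not come from Cockell's book.

```python
from collections import Counter
from itertools import product

# Standard genetic code in DNA letters, written in the conventional
# T, C, A, G ordering; '*' marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# All 4 x 4 x 4 = 64 triplets, in the same order as the string above.
codons = ["".join(triplet) for triplet in product(BASES, repeat=3)]
code_table = dict(zip(codons, AMINO_ACIDS))

# Redundancy: how many codons map to each amino acid (or to 'stop').
codons_per_aa = Counter(code_table.values())
stop_codons = sorted(c for c, aa in code_table.items() if aa == "*")
n_amino_acids = len(set(code_table.values()) - {"*"})

print(len(code_table), "codons;", n_amino_acids, "amino acids")
print("stop codons:", stop_codons)
for aa, n in sorted(codons_per_aa.items(), key=lambda kv: -kv[1]):
    print(aa, n)
```

The counts show the uneven redundancy: Leucine (L), Serine (S) and Arginine (R) each have six codons, while Methionine (M) and Tryptophan (W) have only one.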

However, Cockell also notes that there is something special about nature's choice of the twenty Amino Acids. There are many more natural Amino Acids available than those twenty. So, why those twenty? Random? Accident? Or are they the most suitable for their task? He refers to a publication [1] that argues for a non-random choice. The authors reasoned that there are 3 properties of Amino Acids that are important for constructing a protein: (1) the size of an Amino Acid, (2) the charge, (3) hydrophobicity (repelling water) [3]. Together those properties determine how the protein behaves and what it can do. In principle proteins could be constructed from one or a few Amino Acids. But most useful proteins consist of a diverse mixture of Amino Acids. A protein is defined by a unique sequence of AAs that folds into a complex 3D shape. Also, it is not useful to have many AAs with the same hydrophobicity, or the same size or the same charge. The best toolkit for life would have an even distribution of AA properties that does not overlap too much. To test for optimality the authors used a set of fifty AAs found in the Murchison meteorite. The reason? They assumed that AAs found in the meteorite would represent the set of AAs found on the early Earth. What they found was astonishing, writes Cockell:

"When they compared the twenty amino acids used by life with a million alternative bundles of amino acids randomly chosen from the fifty in the meteorite, the twenty used by life had better coverage and combinations of all three of the key factors than did any other set. ... they seemed to be selected by evolution to give a wide range and even distribution of properties that might be useful in proteins. ... Of a much expanded set of seventy-six AA, not a single group out of a million possible alternatives outperformed the natural set." (Chapter 7).
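The logic of such a comparison can be illustrated with a toy Monte Carlo sketch in Python. Please note the assumptions: the property values below are random numbers and not real measurements, the 'reference' set is simply the first twenty of a made-up pool of fifty, and the two scores (range and variance of the gaps between neighbouring values) are my own simple stand-ins for "wide range" and "even distribution". This is not the authors' actual data or method; it only shows how one could count the random sets that match a given set on all three properties.

```python
import random
from statistics import pvariance

def value_range(values):
    """Width of the interval covered by the values (larger = wider coverage)."""
    return max(values) - min(values)

def gap_variance(values):
    """Variance of the gaps between neighbouring sorted values (smaller = more even)."""
    ordered = sorted(values)
    gaps = [b - a for a, b in zip(ordered, ordered[1:])]
    return pvariance(gaps)

def at_least_as_good(candidate, reference):
    """True if candidate matches or beats reference on range AND evenness
    for each of the three properties (size, charge, hydrophobicity)."""
    for prop in range(3):
        cand = [aa[prop] for aa in candidate]
        ref = [aa[prop] for aa in reference]
        if value_range(cand) < value_range(ref):
            return False
        if gap_variance(cand) > gap_variance(ref):
            return False
    return True

def fraction_matching(pool, reference, trials, rng):
    """Fraction of random same-size subsets of pool that are at least as good."""
    hits = 0
    for _ in range(trials):
        candidate = rng.sample(pool, len(reference))
        if at_least_as_good(candidate, reference):
            hits += 1
    return hits / trials

# Hypothetical data: 50 'amino acids', each a (size, charge, hydrophobicity) tuple.
rng = random.Random(0)
pool = [(rng.random(), rng.random(), rng.random()) for _ in range(50)]
reference = pool[:20]  # arbitrary stand-in for 'the twenty used by life'

frac = fraction_matching(pool, reference, trials=2000, rng=rng)
print("fraction of random sets at least as good:", frac)
```

With an arbitrary reference set the fraction says nothing about biology. The reported finding is that for the real twenty, with real property values, this kind of fraction is essentially zero.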

Cockell concludes that the twenty AAs used by life are not random. That was new to me. But there is one important aspect Cockell doesn't mention: the AAs must also be suitable to be attached to a transfer RNA (tRNA) and to be processed by the ribosome. This is a crucial property. It is the biochemical implementation of the Genetic Code table. There could be differences in suitability. This must be investigated. Furthermore, all AAs are associated with 1 or more triplet codons (redundancy). The question is: how is the association made between a base triplet codon and the AA? And how did that originate in the first place? The structure of tRNAs does not show a direct chemical bond between codon and AA. Is it a random choice? That could certainly be the case because AAs and triplet codons (bases A, T, C, G) are different chemical compounds, yet they are somehow connected. Or is there a logic in the associations? Is there a pattern? Much research has been done to solve this question. No definitive answer yet. Cockell does mention this. But at least he pointed out a new aspect of the origin of life and the Genetic Code table to me.

______

In this blog I focused on the origin of the Genetic Code. The origin of the Genetic Code is in fact the origin of DNA-based life: bacteria, animals and plants. It is also the origin of protein and enzyme based life. The origins of DNA and proteins are strongly intertwined. DNA on its own has no use and proteins cannot exist without DNA. DNA cannot self-reproduce, it needs enzymes. But proteins cannot self-reproduce either. A specific protein consists of a unique sequence of Amino Acids. Unique proteins do not self-assemble spontaneously. The only way to reproduce such a unique sequence is on the basis of another unique sequence: the unique base sequence in DNA. In other words: DNA and proteins depend on each other. This is not a promising situation to start life. Hence, the hypothetical RNA world was developed (which is not without its own problems!). Keep in mind: the origin of the Genetic Code is not the same as the origin of eukaryotes. Bacteria are also DNA- and protein-based life forms. All life on earth uses the same Genetic Code, including viruses.

Physicist Charles Cockell used the expression "DNA and its entourage" [2]. And that is a misleading description. I hope readers recognize this as DNA-centric thinking. The cell and the cellular machinery are not an "entourage"! They are an equally important part of the cell! DNA is not the master of the cell! The cell is not the servant of DNA!

Notes

[1] Gayle K. Philip, Stephen J. Freeland (2011) Did evolution select a nonrandom "alphabet" of amino acids? Astrobiology.
[2] "An entourage is a group of attendants, assistants, or close associates who accompany and work for an important or famous person".
[3] There exists another list of properties of AAs: polar versus non-polar; acidic versus basic. [5 May 2026]
[4] For example, there is a famous article by Freeland and Hurst (1998) The Genetic Code is One in a Million, Journal of Molecular Evolution. See also their April 2004 article Evolution Encoded in Scientific American. [5 Jun 2026]

Previous DNA blogs

If the blueprint of the embryo is not in DNA, then where is it? Alfonso Martinez Arias. A very convincing argument for the cell-centric view of life 14 March 2026
Richard Dawkins admits: DNA is *not* a blueprint! But Dawkins still got another metaphor wrong! 16 February 2026
Think about this: 09 February 2026
Gene-centrism is bad biology. Here is why. 17 December 2025
What is DNA-centrism? Why is it wrong? 10 November 2025

21 comments:

Evgenii Rudnyi, Monday, May 4, 2026 at 12:32:00 PM GMT+2
Have you seen this paper about the biological code?

shCherbak, V. I., & Makukov, M. A. (2013). The “Wow! signal” of the terrestrial genetic code. Icarus, 224(1), 228-242.

gert korthof, Monday, May 4, 2026 at 4:29:00 PM GMT+2
Dear dr Evgenii Rudnyi, thanks for this publication. I can't remember having seen it before. It is a very unusual publication. Do you think that our genetic code on earth is manipulated by aliens? I can't read the complete publication, do you know if they reveal what the cryptic message is? I really want to know what the message is.
- if our Genetic Code is truly universal (valid for the entire cosmos), due to universal natural laws, then it seems there is no room for manipulation (?).
- if our Genetic Code was manipulated by an alien intelligent civilization, then is their own Genetic Code also manipulated by yet another intelligent civilization? etc, etc. > infinite regress.

Rolie Barth, Tuesday, May 5, 2026 at 12:48:00 PM GMT+2
Gert, how interesting to read your blog about physics of life.
I read some research by Charles Cockell, he is an astrobiologist and physicist. From ancient geological data about the oxygenation of the atmosphere he calculated the power of UV B and UV C radiation. Around 1,500 million years ago a thick ozone layer was formed in a relatively short time. This resulted in a strong decrease of the harmful UV B and UV C radiation reaching the earth's surface. This radiation level decreased to one thousandth of what it had been in the period before that, since the formation of the Earth (Cockell, 2000).
I think: this strong decrease opened the doors for multicellular life on land (see my book, p. 315-316).

And now to my surprise you published this blog.
I am wondering: does this all (especially your criticism of the gene-centered view) change your ideas about neo-Darwinism as explanation for the evolution of life?

Ref. Cockell, Charles S., ‘The ultraviolet history of the terrestrial planets – implications for biological evolution’, Planetary and Space Science 48, 2000, p. 203-214.

Anonymous, Tuesday, May 5, 2026 at 4:34:00 PM GMT+2
dear dr Korthof

this might interest you to anticipate R Barth's question a little bit:

The structure of the SGC is nonrandom and ensures high robustness of the code to mutational and translational errors. However, this error minimization is most likely a by-product of the *primordial code expansion driven* by the *diversification of the repertoire of protein amino acids*, rather than a direct result of selection.

Eugene V. Koonin and Artem S. Novozhilov, Origin and Evolution of the Universal Genetic Code, Annual Review of Genetics, Vol. 51:45-62 (Volume publication date November 2017) https://doi.org/10.1146/annurev-genet-120116-024713

gert korthof, Tuesday, May 5, 2026 at 4:59:00 PM GMT+2
Hi Rolie, thanks for your comment. I checked the pages in your book. I especially liked note 13, page 316, about UV protection by mycosporines and scytonemin in Cyanobacteria. They are photosynthesizers, so they need to be close to the water surface to get enough sunlight. The associated disadvantage is UV damage to cell components. This explains the presence of mycosporines and scytonemin! Nice example of adaptation by natural selection! I wonder: if the first animals and plants had also contained these UV protectives, then they could have withstood UV radiation and could have conquered the land long before a protective ozone layer was established...
Rolie: "to my surprise you published this blog": please note I have long had a section 'Engineering, physics and evolution'
https://wasdarwinwrong.com/korthof.htm#engineering
in which I collect evolution books by physicists and engineers!

Rolie: "...especially your criticism of the gene-centered view..."
My primary source of DNA-centrism and its failures was my years-long engagement with Senapathy's book. Furthermore, Nick Lane's OXYGEN has had a tremendous influence on my thinking: there is more to the evolution of life on earth than genes!

Cockell: So far I have only read chapter 7: 'The Code of Life'. What he did in that chapter has little to do with physics, and far more with inorganic and organic chemistry. He seems to overlook the role of natural selection, and seems to explain the nonrandomness of the Genetic Code table mainly by physical forces, which I disagree with.

gert korthof, Wednesday, May 6, 2026 at 9:55:00 AM GMT+2
Dear Dr Anonymous, thanks for the Koonin 2017 publication.
Please note in the above publication The “Wow! signal” of the terrestrial genetic code, there is also a reference to a very similar Koonin publication:
'Origin and evolution of the genetic code: the universal enigma', 2009.
Please note I have blogged about the Koonin threshold for the origin of life:
Hoe Koonin het ontstaan van het leven verklaart: moedige poging of wanhoopsdaad? (How Koonin explains the origin of life: brave attempt or act of desperation?) 2012.
Please note the lively discussion on this blog in the comments (among others a contributor you may know: harry pinxteren.)
and an updated version on my website:
The Koonin threshold for the Origin of Life on Earth, 2013.

Anonymous, Wednesday, May 6, 2026 at 10:30:00 PM GMT+2
Dear Dr Korthof

I skipped the 2009 (original 2008 https://doi.org/10.48550/arXiv.0807.4749 ) version because Koonin and Novozhilov dropped the * universal enigma* in their version of 2017, anticipating dr Barth’s question with their conclusion I quoted above.

I’ve only a vague idea what they could mean by *diversification of the repertoire* , but I think I have an interesting lead that contributes substantially to your new paradigm:

By identifying 14 distinct structural states of nucleosomes, “DNA spools”, Yang et al revealed a sophisticated “organizational code” that allows cells to fine-tune gene activity with incredible precision. Nature
DOI:10.1038/s41586-026-10418-6

The research counts as an AI Breakthrough: IDLI (Iteratively Defined Lengths of Inaccessibility) uses two-dimensional scanning, analyzing both the length of the DNA fiber and the internal structure of individual nucleosomes, to detect subtle distortions that previous technologies missed. Using a technology called SAMOSA to map DNA molecules, the AI was then trained to recognize patterns in the “accessibility” data. If a nucleosome was missing a building block or was loosely bound, the AI detected a specific “signature” of exposed DNA that shouldn’t be there in a “perfect” spool.

In short: AI Uncovers a Hidden “Grammar” in DNA Packaging that cells use to flexibly regulate gene expression

viz. IDLI showed some "interesting *diversification of the repertoire* of proteins, *rather than a direct result of selection*" , and thus substantiated the claims you made in the preceding blogs.

So, I'd say we can undeniably detect progress, at least since discussing Koonin, years ago.

As an instructive aside, please note:

IDLI uses the very same techniques that only a few days ago gave R Dawkins his own personal “Claude delusion”, when he had a couple of talks with his personalized chatbot ("Claudia"): “These intelligent beings are at least as competent as any evolved organism,” (sic) https://www.theguardian.com/technology/2026/may/05/richard-dawkins-ai-consciousness-anthropic-claude-openai-chatgpt
Note that AI bots are particularly notorious for their sycophancy.

gert korthof, Thursday, May 7, 2026 at 10:48:00 AM GMT+2
Dear dr Anonymous,
thanks for your reply. I focus on:
"this error minimization is most likely a by-product "
Is it really only a 'by-product' when it has beneficial effects?
Organisms with translation error minimization would have an advantage over those with less error minimization, wouldn't they? Then 'error minimization' is subject to natural selection, isn't it? It is quite possible, even reasonable, that more than one feature is subject to natural selection.

I need more time to digest your further remarks.

I appreciate your pointing out publications and data that support the anti DNA-centric view. For example, I read in Nature: "DNA sequences that control gene expression":
this is an ambiguous expression: do those sequences really 'control' gene expression or do they, for example, have predictable effects on gene expression? Are they just one factor in a complex system? Proteins are involved, proteins do the work. If X controls Y, it suggests X is the only cause of Y and there is a direct connection between X and Y.
So, if you or any reader encounters these expressions, I would be interested.

gert korthof, Friday, May 8, 2026 at 9:32:00 AM GMT+2
Evgenii Rudnyi, thanks for your comments. I continue here at the bottom of the comments so readers easily see that these are the latest comments.
The Genetic Code table is represented in letters for the 4 bases and the 20 AAs. How do you get from letters to numbers? How do you assign numbers to letters? If there are many ways to assign numbers to letters, how do you choose the 'correct' one?

gert korthof, Saturday, May 9, 2026 at 9:13:00 AM GMT+2
Fine tuning of the cosmological and physical constants: interesting news for readers interested in fine tuning:

Scientists make stunning discovery that could change our understanding of the Universe, Sciencedaily.com 8 April.

Constraints on fundamental physical constants from bio-friendly viscosity and diffusion, Science 23 Aug 2023.

Anonymous, Wednesday, May 13, 2026 at 5:21:00 PM GMT+2
Dear Dr Korthof

speaking of water - and fine tuning indeed

The findings challenge the traditional protein-only view of transcription, showing water is a conserved and critical component of the gene expression machinery.

https://today.ucsd.edu/story/water-molecules-found-to-actively-drive-gene-transcription-process

gert korthof, Thursday, May 14, 2026 at 8:58:00 AM GMT+2
Thanks dr Anonymous. Intriguing expression: 'protein-centered view of gene expression'!
What I found interesting: "High-resolution imaging revealed hundreds of precisely positioned water molecules that stabilize the enzyme and help ensure accurate selection of genetic building blocks."
I don't see this as revolutionary, I see water as the assumed environment in which genes and proteins work. I will return to this when reviewing Adrian Woolfson's latest book.
However, in the same article I read: "Strikingly, these waters are evolutionary conserved from bacteria to yeast" which is utter nonsense and incomprehensible gibberish (AI-generated?!)


Anonymous, Thursday, May 14, 2026 at 5:12:00 PM GMT+2
dear dr Korthof

Maybe the UCSD press release text was generated by AI,
but the very title of the article in Molecular Cell mentions nothing less than "critical roles of water molecules in catalysis" - see also the "highlights" and the summary: "These findings provide unprecedented mechanistic insights into RNA Pol II catalysis and reveal vital and evolutionarily conserved roles of water molecules in transcription".

Sounds pretty different from "as the assumed environment" - to me.

gert korthof, Saturday, May 16, 2026 at 9:11:00 AM GMT+2
Dear dr Anonymous, what I mean by 'the assumed environment of DNA' is everything that happens in a cell that is not directly or indirectly under control of DNA. Obviously the behaviour of water is not under DNA control!
I realize now that even the expression 'the assumed environment' is misleading. It is very similar to Cockell's 'DNA and its entourage'!
It implies that there is a main component (DNA) and the rest is everything around it; the king and his servants. But in reality DNA and the cell depend on each other. Sounds familiar?

Anonymous, Sunday, May 17, 2026 at 5:34:00 PM GMT+2
Dear dr Korthof

As far as I know, there are an estimated 40 million proteins (and counting) in each cell, plus a lot of other chemicals, including even water molecules, and none of them under control of DNA. How are we going to model that?!