20 June 2019

DNA is the most beautiful molecule in the universe, and Gareth Williams' 'Unravelling the Double Helix' made it even more interesting

Gareth Williams (2019)
Unravelling the Double Helix. The Lost Heroes of DNA

James Watson and Francis Crick, 1953


Suddenly I realized that the famous picture of Watson and Crick standing before their 3D model of DNA subtly suggests that they built their double helix model from scratch. Well, yes, they did build the physical model. But I did not realize that, long before Watson and Crick, researchers had established the building blocks of DNA: phosphate (P), the sugar deoxyribose and the 4 bases A, T, C, G.

So Watson and Crick discovered neither the building blocks nor the structure of the building blocks. Amazingly, they did not even do any lab work on DNA. And at the time when they started thinking seriously about the structure of DNA, it had already been established that DNA, and not protein, is the carrier of heredity. So, they did not come up with that idea either. They copied the DNA building blocks from the scientific literature of their time. That means others had elucidated the components of DNA.

I did not know (or forgot) that 40 years before Watson and Crick, a researcher called Phoebus Levene established the structure of the sugar and the 4 bases in DNA in the course of decades of research and put them together in a model of DNA:

Figure 7.2 in Gareth Williams (2019)
DNA according to Phoebus Levene.


The connections between P – Ribose – Base are correct. And Levene introduced the word 'nucleotide', which is still in use today. Yes, it was single-stranded DNA [11]. Poor Levene was written out of history [8], [9] because he came to believe that the bases A, C, T, G were present in equal proportions (tetranucleotide hypothesis). Therefore it was difficult to see how DNA could contain sufficient genetic information.

In the famous textbook Molecular Biology of The Gene (James D. Watson et al.) Levene is not even mentioned. Yes, Levene's model was not a double helix, but he got the building blocks and the links right. This puts the achievements of Watson and Crick in the right historical context. They became famous for putting the last pieces of the puzzle together and got the Nobel Prize for it. But without the work of Levene they could never have done it.

I discovered this thanks to medical historian Gareth Williams. In his book Unravelling the Double Helix, Gareth Williams shows that 85 years of DNA research by hundreds of scientists preceded Watson and Crick's discovery. That is the period from 1868 to 1953. His book is very entertaining and easy-going, with special attention to the characters and personalities involved, including conflicts between scientists, their prejudices, failures and successes.

Levene dogmatic?

I think the most interesting and controversial researcher was Phoebus Levene. Levene produced a wrong structure of the DNA model (tetranucleotide) and believed that nucleic acids have nothing to do with heredity [12]. He is supposed to have defended his (wrong) ideas at all costs, and has been accused of obstructing the progress of the science of DNA. On its own it should not be a disaster to publish a wrong hypothesis [13]. The famous Linus Pauling published a faulty DNA model. And the first physical model of DNA created (but not published) by Watson and Crick was hopelessly wrong. Mistakes of this kind are usually corrected by subsequent research. But Levene's hypothesis was not corrected quickly. Prof. Levene was an authority. He had great technical skills. Not many others had the skills to repeat his experiments and refute him. According to Williams he ignored criticism and inconvenient facts. As a consequence research into DNA came to a halt. DNA was a boring molecule. Therefore, scientists focused on proteins as the carrier of heredity [12]. That is tragic indeed. But it is highly unfair to judge Levene after 1953 with the correct DNA structure at hand. The question is: was he unreasonable given the data at the time? Does Williams think that Levene is one of the 'lost heroes of DNA'?

Levene in Wikipedia

Was Levene wrong in all respects? The DNA structure in figure 7.2 (above) has its merits. The structure is linear and not closed. It can be extended indefinitely in both directions, which is a promising feature for information storage. However, wikipedia shows a different structure [1]:

Tetranucleotide in wikipedia

This is the Tetranucleotide according to wikipedia. It is a visually powerful image with a clear message [3]. It clearly is a closed molecule containing each of the 4 bases A, T, C, G only once. It has no possibilities for extensions. This model is not in Williams' book, but he informed me that it originates from H. Takahashi (1932) Uber fermentative Phosphorierung der Nucleinsaure. So, wikipedia attributes the figure to Levene incorrectly [4].

The problem is that we are not sure what Levene really thought. There is evidence that Levene thought DNA has a closed structure or consists of a stack of closed structures. But there is also evidence that he thought DNA was linear [2]. As far as I can see, Levene never published the closed tetranucleotide structure himself and the models he did publish are linear [5]. Does that tell us something? This requires investigation of the primary sources, his own publications, not what others say about him. Finally, why would he spend so much time on a boring molecule?

With the benefit of hindsight

With the benefit of hindsight, the linear model, and even the cyclic model, had more potential than Levene realized. The linear model has the potential to be extended in both directions indefinitely because it is an open structure. Allowing for any order of the bases would fulfill the requirements for information storage. OK, it is not a double helix, but that does not affect information storage capacity [11].

In Williams' figure 7.2 (above) the bases T, A, C, G are present, but in figures 4.8 and 4.9 of Portugal and Cohen (1979) only pyrimidine (T or C) and purine (A or G) are present. I guess that the illustrations of Portugal and Cohen are close to the original. So, that suggests Levene had no fixed order of the 4 bases in mind. In my view Levene could not have had evidence that there is only one fixed order of the bases in his tetranucleotide or linear model. Then (if he did not impose a purine - pyrimidine alternation), 6 different tetranucleotides are possible:

6 different closed tetranucleotides (simplified) ©GK

Because of the cyclic nature, no more than 6 different tetranucleotides are possible: fix one base as the starting point and the other three can be ordered in 3! = 6 ways. Six different tetranucleotides is slightly less boring than one, but nowhere near enough. But maybe this kind of thinking could trigger further thoughts. For example, even with one cyclic tetranucleotide ACGT with fixed order of the bases, one could stack it while turning the next one 45° or 90° or 135° clockwise. A reading mechanism could read bases along one (or even all 4) vertical axis, resulting in unlimited information storage capacity. So, the tetranucleotide on its own need not restrict storage capacity. In fact, pure logic makes this possible.
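
The counting argument and the stacking idea can be checked with a minimal Python sketch. The function names are hypothetical, and two assumptions are made here that are not in the sources: a ring and its mirror image are counted as different (because a backbone has a direction), and rings are turned only in steps of 90° so that bases line up on the 4 vertical axes.

```python
from itertools import permutations

BASES = "ACGT"

def cyclic_tetranucleotides(bases=BASES):
    """Return the distinct rings that contain each base exactly once.

    Two rings count as the same if one is a rotation of the other.
    Mirror images are counted separately (assumption: the backbone
    has a direction).
    """
    rings = set()
    for perm in permutations(bases):
        # Canonical form: rotate the ring so it starts at the first base.
        i = perm.index(bases[0])
        rings.add("".join(perm[i:] + perm[:i]))
    return sorted(rings)

def read_column(ring, quarter_turns, column=0):
    """Read one vertical axis through a stack of identical rings.

    quarter_turns[k] is the number of 90-degree turns applied to the
    k-th ring in the stack (assumption: only 90-degree steps).
    """
    return "".join(ring[(column + t) % len(ring)] for t in quarter_turns)

rings = cyclic_tetranucleotides()
print(len(rings), rings)

# One fixed ring, stacked six times with different turns, read along one axis.
print(read_column("ACGT", [0, 1, 1, 3, 2, 0]))
```

The first print lists the 6 rings; the second shows that a stack of identical ACGT rings can spell an arbitrary sequence along one axis, because each turn selects which base sits on that axis.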

What could cast doubt on the strict tetranucleotide hypothesis is experimental data suggesting unequal amounts of A, T, C, G in DNA samples (that is, a deviation from A:T:C:G = 1:1:1:1). At the time scientists accepted that the 4 bases occur in equal amounts in DNA. However, do equal amounts of A, T, C, G really kill information storage capacity? No, because A, T, C, G could be distributed in many different ways in a linear single-stranded sequence. For example, these two 8-base sequences



have the same number of the four bases: 2xA, 2xT, 2xC, 2xG, but they are obviously different sequences. Why didn't this occur to people (Levene) at the time? Could it be that they simply did not have the concepts of information storage and digital coding like we do nowadays with computers in every pocket [6]? If so, the problem was a conceptual problem. In that case Levene was not dogmatic, but a general conceptual barrier prevented Levene and others from solving the DNA puzzle. Maybe another conceptual barrier was the idea of a helix or a double helix, because these concepts had not yet been invented in chemistry [7]? Today we are used to the idea that DNA is a very long molecule. In fact the longest molecule in the universe. The shortest human chromosome is 46 million and the longest is 248 million base pairs [14]. This was far beyond the imagination in those days, I guess.
A technical problem in those days was the fact that, after analysis of the macromolecule DNA many breakdown products were to be found in the test-tube. There are many ways to reconstruct DNA from the breakdown products. That's the problem.
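
The point about equal base amounts can be made concrete with a small Python sketch. The two sequences below are illustrative assumptions chosen for this sketch, not taken from any source, and the function names are hypothetical. The sketch checks that both sequences have the same base composition and counts how many different 8-base sequences share that composition.

```python
from collections import Counter
from math import factorial

def base_counts(seq):
    """Count how often each base occurs in a sequence."""
    return dict(sorted(Counter(seq).items()))

def distinct_orderings(seq):
    """Number of different sequences with the same base composition.

    This is the multinomial coefficient n! / (n_A! * n_C! * n_G! * n_T!).
    """
    total = factorial(len(seq))
    for n in Counter(seq).values():
        total //= factorial(n)
    return total

# Illustrative sequences only (assumption): each has 2xA, 2xT, 2xC, 2xG.
seq1 = "AATTCCGG"
seq2 = "GATCCTAG"

print(base_counts(seq1) == base_counts(seq2))  # same composition
print(seq1 != seq2)                            # different sequences
print(distinct_orderings(seq1))                # 8! / (2! * 2! * 2! * 2!) = 2520
```

So even a strict 1:1:1:1 ratio leaves 2520 different sequences of only 8 bases, and the number grows very fast with length.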

The most beautiful molecule

All these possible barriers make the history of the discovery of DNA fascinating. Although that history has been written at length by others, including Watson, Crick and Wilkins [8], and I read them years ago, Gareth Williams managed to renew my interest in the subject. I started rereading the older books because I had questions and wanted answers.

Scientists of DNA timeline (source).

I found a timeline of the discovery of DNA on the internet. Amazingly, the period between Miescher (1869) and Chargaff (1950) is completely blank! As if nothing had been done. Gareth Williams did a great job filling in that empty space.

But Gareth Williams did more than that. For example, for me Miescher was just a name. But Friedrich Miescher made his discovery at a very young age and at first his boss was not interested in publishing it. Amazingly, Miescher did not believe nucleic acid had anything to do with heredity! He preferred to see nucleic acid as a store of phosphorus! Stories of this sort turn what was only the name of an author into a real person.

Another example is the famous Thomas Hunt Morgan, who at first was unimpressed by Mendelian genetics and chromosomes, but "in 1915 he publicly purged himself of his earlier sins in The Mechanism of Mendelian Heredity". It is like travelling back in time, with lessons to keep in mind when reading the scientific journals of today.
The most amazing aspect of the history of DNA is that researchers investigating DNA had no clue that DNA equals heredity.

DNA is by far the most beautiful, oldest and most informative molecule in the universe and Gareth Williams made it even more interesting. Such a molecule deserves another book. Williams wrote a friendly introduction to the history of that molecule. The merit of Williams' book is that he tells not only the hard scientific facts, but also the personal and conceptual struggles of the scientists involved, in such a way that one is intrigued and wants to know more.




  1. The tetranucleotide hypothesis is discussed very briefly in The Eighth Day of Creation, but a whole chapter is dedicated to the tetranucleotide hypothesis in Portugal and Cohen, and in Olby. But I must have forgotten it completely.
  2. Portugal and Cohen in A Century of DNA wrote: "Thus, in 1938, Levene and Schmidt were able to measure the molecular weight of native DNA at between 200,000 and 1 million." (p.88). And: Levene "proposed a linear combination of nucleotides with the correct internucleotide linkage." (p.89).
  3. The publication of Takahashi is in the list of publications of GW, but he does not include the Takahashi figure in his book.
  4. However, the Wikipedia figure must be a modern reconstruction of Takahashi's figure. This must be so because in F.H. Portugal, J.S. Cohen (1979) figure 4.10 shows the tetranucleotide in a different form, and the bases are indicated with 'pu' (purine) and 'py' (pyrimidine) and not with A, C, T, G.
  5. Williams informs me that "Levene never dreamed up a cyclic structure" (personal communication 17 June 2019). I had formed the view that Levene was dogmatic about the tetranucleotide and therefore obstructed the progress of science.
  6. After I wrote that, I found in Olby The Path to the Double Helix: "The number of Bausteine which can take part in the formation of the proteins is about as large as the number of letters in the alphabet." which was written by Kossel in 1911. So, the alphabet was used as a metaphor for information in proteins. Maybe this was not a general view? But there seems to be no conceptual barrier to viewing DNA as a carrier of information.
  7. The tetranucleotide hypothesis and especially the single-stranded DNA hypothesis prevented Chargaff from correctly interpreting his own data. He did not try a double-stranded structure. Apparently, that is a huge leap.
  8. Maurice Wilkins (2003) The Third Man of the Double Helix, wrote no more than one sentence about Phoebus Levene: " ... analytical chemist Phoebus Levene, who had argued for decades that DNA could not be gene material because it only contained four bases, which would, he claimed, make it far too simple a compound to contain genetic information." (p.152). That is really disappointing. What a terrible thing to say about the man who discovered the D-ribose sugar and the nucleotides and nucleosides in DNA! Even worse: Wilkins wrote Levene had a negative influence on Chargaff!
  9. In Francis Crick (1988) What Mad Pursuit: "Phoebus Levene, the leading expert on nucleic acid in the 1930s, had proposed that they had a regular repeating structure [the so-called tetranucleotide hypothesis]. This hardly suggested that they could easily carry genetic information." (p.33-34). But it is highly unfair to condemn Levene after 1953! What is needed is the data available in his time. Therefore the actual publications of that time should be investigated.
  10. James D. Watson (1982) The Double Helix. Penguin Books. Nothing about the history of DNA research! DNA has no history!
  11. Single-stranded DNA does not occur in bacteria, plants and animals, but there exist single-stranded DNA viruses. RNA is single-stranded.
  12. Levene said in 1916 about nucleic acids (DNA): "They are indispensable for life, but carry no individuality, no specificity, and it may be just to accept the conclusion of the biologist that they do not determine species specificity, nor are they carriers of the Mendelian characters" (source). This is probably a far more serious error than publishing a wrong model of DNA. [added 21 Jun 2019 ]
  13. This is what Andreas Wagner said in an interview: "Failure is key to success, and it should be embraced as a necessary part of the creative process. “If we are honest with ourselves, we understand that we are failing more often than we are succeeding, and that is a very Darwinian concept,” ..." source [ added 22 Jun 2019 ]
  14. See: Human genome in wikipedia. The smallest human chromosome is chromosome #21, the biggest is #1. The total length of the human genome is over 3 billion base pairs. [ added 10 Jul 2019 ]