13 September 2023

The true history of junk DNA (2)

Francis Crick
What Mad Pursuit.
paperback 1988

"The originator of the central dogma, Francis Crick, was well aware of genes that didn't encode protein. They don't figure into the central dogma." (Laurence Moran (2023) What's in your genome, chapter 8 paragraph 'Revising the central Dogma?') (my bold)

Exactly: They don't figure in the Central Dogma! That is precisely the problem! Crick omitted noncoding DNA from the Central Dogma. Had he included it in his scheme, a lot of confusion could have been prevented.

Central Dogma, from: Francis Crick, What Mad Pursuit, page 168.

Crick could have added an arrow from RNA to, for example, 'RNA genes'. He did not.

RNA genes added to Central Dogma (©GK)

In this blog I want to explore possible reasons for this omission. They have to do with the historical scientific context of the time that Crick proposed his central dogma. I hope this will show that scientists misinterpreting the Central Dogma are not fools and that Crick himself overlooked non-coding DNA when drawing his Central Dogma diagram. But first a second quote from Laurence Moran:

"Many scientists have a very different view of the central dogma. They were taught, incorrectly, that the real meaning of the central dogma is that DNA makes RNA makes protein and the only function of DNA is to encode protein ... They were somehow led to believe that there was only one kind of gene, namely, protein-coding genes." (Moran, 2023, What's in your genome, Chapter 8) (my bold)

Well, it is certainly not a mystery why many scientists were led to believe that there was only one kind of gene: there is only one kind of gene in Crick's illustration of the Central Dogma.

Why did Crick not add RNA genes to his diagram? It is important to try to understand the historical context at the time that Crick proposed his Central Dogma. Traveling back in time is not easy, so I rely on Crick's own account in What Mad Pursuit.

The central problem of biology at the time was: How could genes possibly construct all the elaborate and beautifully controlled parts of living things? It was known that each chemical reaction in the cell was catalyzed by enzymes. This is a defining property of life on earth. Furthermore, it was known before 1953 that enzymes are proteins. Crick realized that the key problem in biology was to explain how proteins were synthesized. In the 1940s a very influential hypothesis was proposed, the 'One gene - one enzyme' hypothesis. The next question was: How do genes control the synthesis of proteins? (Chapter 3 The Baffling Problem, page 33). Obvious today, but at the time it was a problem at the frontiers of science. Further, it was also known at the time that proteins were made of about 20 different amino acids. 

After the discovery that DNA consisted of a sequence of bases, the next question emerged: what is the precise relation between genes and proteins? Crick proposed the Sequence hypothesis: the sequence of bases in DNA is a necessary and sufficient condition for the sequence of amino acids in proteins. Crick:

"Rereading it, I see that I did not express myself very precisely, since I said "...it assumes that the specificity of a piece of nucleic acid is expressed solely by the sequence of its bases, and that this sequence is a (simple) code for the amino acid sequence of a particular protein." This rather implies that all nucleic acid sequences must code for protein which is certainly not what I meant." (Francis Crick, What Mad Pursuit, Chapter 10, page 108).

Then Crick explains that other parts of the DNA sequence could be used for control mechanisms (today: gene regulation) and he even mentions producing RNA for purposes other than coding (today: RNA genes). Crick concluded: "I don't believe anyone noticed my slip, so little harm was done." (page 109). Unfortunately, Crick underestimated the long lasting influence of the famous Central Dogma diagram.

According to Moran the meaning of the Central Dogma diagram is that the information in proteins cannot get out again. That, indeed, is what Crick himself says (page 109). Unfortunately, the Central Dogma diagram is a weird way to illustrate the non-existence of a specific type of information flow. It is as if one wants to illustrate the absence of something with the absence of something in an illustration. It isn't manifest. It seems rather impossible to me to do that [2].

In my view the point of the Central Dogma was to illustrate (albeit in a partial way) the solution of the central question of the time and indeed of all times: how can genes specify proteins? Crick himself expressed this clearly:

"I shall… argue that the main function of the genetic material is to control (not necessarily directly) the synthesis of proteins." [1] (my bold)

The Sequence hypothesis isn't a hypothesis anymore, and it isn't at the frontiers of science anymore, but 'The Sequence' is still and will always be one of the defining characteristics of life on earth [3]. This is certainly not an outdated idea from the past. Life as we know it is impossible without enzymes (=sequences) and without genes (=sequences) coding for them.

The 'protein universe' is still very much at the frontiers of science: new protein structures are being discovered today [4].



Appendix (1)

All the following concepts are about protein synthesis:  
  1. Mendelian genes specify discrete phenotypic characters (with the benefit of hindsight).
  2. The Sequence Hypothesis states that the sequences of DNA bases specify the sequence of amino acids in proteins. (Chapter 10 Theory in Molecular Biology)
  3. The Central Dogma states the direction of flow of information from DNA to RNA to protein and not back from proteins (chapter 10)
  4. The Genetic Code Table specifies which of the 61 base triplets ('sense' codons) code for which amino acid, and which 3 base triplets signal STOP chain ('nonsense' codons) (Chapter 8 and Appendix B).
  5. The Adaptor Hypothesis  (Crick, Chapter 8 page 95) (the implementation of the Genetic Code in specific molecules: tRNA) doesn't make sense without protein synthesis.
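The arithmetic behind point 4 can be checked with a few lines of Python. This is only a minimal sketch, not anything taken from Crick's book: it assumes the standard genetic code, writes the triplets as DNA coding-strand sequences, and the variable names are my own.

```python
from itertools import product

BASES = "TCAG"

# Assumption: stop codons of the standard genetic code,
# written as DNA coding-strand triplets.
STOP_CODONS = {"TAA", "TAG", "TGA"}

# All possible triplets over four bases: 4 * 4 * 4.
all_codons = ["".join(triplet) for triplet in product(BASES, repeat=3)]

# 'Sense' codons are the triplets that are not stop codons.
sense_codons = [codon for codon in all_codons if codon not in STOP_CODONS]

print(len(all_codons), len(sense_codons), len(STOP_CODONS))
```

Four bases read three at a time give 64 triplets, so the 61 sense codons and the 3 nonsense codons together fill the whole table.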


Appendix (2)

The terminology used to describe genes makes only sense (!) in relation to protein synthesis: 

  • sense, nonsense, missense
  • sense, anti-sense strand
  • positive-sense, negative-sense 
  • coding strand, template strand
  • coding, noncoding
  • translation
  • STOP/START codons
  • the Genetic Code Table
  • triplets
  • in-frame/out of frame
  • ORF: Open Reading Frame
  • mRNA: messenger RNA
  • tRNA: transfer RNA
  • rRNA: ribosomal RNA

Also, the concepts promoter (DNA sequence to which proteins bind) and enhancer (DNA sequence to which specific proteins bind) make only sense (!) in the context of protein synthesis, directly or indirectly, because they promote or enhance gene expression of (mainly) protein-coding genes. Using these concepts implies protein synthesis on the basis of DNA sequences.

However, there are concepts not (directly) related to protein synthesis: base pairing, double helix, transcription, directionality, replication.

Appendix (3)

I wonder whether there is a total absence of any coding signature in non-coding RNA genes. I found it difficult to find clear information about it. For example: do START and STOP codons occur in RNA genes? If so, do they have any effect? Do non-coding RNA genes have a triplet structure? Do single base deletions or insertions have similar effects on RNA genes as on protein-coding genes? (In an RNA gene they would not disturb a reading frame.) Are there functional RNAs completely independent of and unrelated to protein (synthesis)? How did RNA genes originate? Did they originate from coding sequences or from random sequences?
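To make the 'reading frame' question concrete, here is a small Python sketch. It is an illustration under stated assumptions, not an answer to the questions above: the sequence is an invented toy example, the stop codons are those of the standard genetic code, and the helper names `codons` and `has_in_frame_stop` are hypothetical. It shows what a single base deletion does to a sequence that is read in triplets.

```python
# Assumption: stop codons of the standard genetic code (DNA coding strand).
STOP_CODONS = {"TAA", "TAG", "TGA"}


def codons(seq, frame=0):
    """Split seq into complete triplets, starting at offset `frame`."""
    return [seq[i:i + 3] for i in range(frame, len(seq) - 2, 3)]


def has_in_frame_stop(seq, frame=0):
    """True if any triplet in the given frame is a stop codon."""
    return any(codon in STOP_CODONS for codon in codons(seq, frame))


# Invented toy sequence: a START codon, four sense codons, a STOP codon.
original = "ATGGCCAAATTTGGGTAA"

# Delete the single base at position 4 (0-based).
deleted = original[:4] + original[5:]

print(codons(original), has_in_frame_stop(original))
print(codons(deleted), has_in_frame_stop(deleted))
```

In the toy example every triplet downstream of the deletion changes and the STOP codon drops out of frame. That is the effect on a sequence read in triplets; whether anything comparable applies to a sequence that is never read in triplets is exactly the open question.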


  1. Matthew Cobb (2017) 60 years ago, Francis Crick changed the logic of biology, PLOS BIOLOGY. Please note "(not necessarily directly)", this is a very ingenious way of including the indirect way of controlling protein synthesis: via enhancers and promoters. (added 22 sep 2023).
  2. Elsewhere Crick designed another diagram which prevents the dilemma of illustrating the absence of something, see: Larry Moran (2007) Basic Concepts: The Central Dogma of Molecular Biology blog.
  3. See for example my review of Tibor Gánti (2003) 'The Principles of Life'. 
  4. ‘A Pandora’s box’: map of protein-structure families delights scientists,  Nature 13 Dec 2023.


Previous blogs

31 July 2023

The true history of junk DNA

"By the late 1960s, knowledgeable scientists were used to the idea that genes occupied only a small part of the genome, and in 1974 the editor of the journal Cell, Benjamin Lewin, was expressing the consensus view of the experts when he wrote that the C-value Paradox could be resolved by assuming that much of the genome is composed of nonfunctional repetitive DNA (junk DNA)." (Chapter 2 of Laurence Moran (2023) What's in your genome.)

It may be that 'knowledgeable scientists' in the late 1960s knew that much of the genome is composed of junk DNA, but the 'consensus view' was not widely known in all sub-disciplines of the biological research community. Maybe the journal Cell was not read in the evolutionary biologist community. Probably, those experts were experts in a different field with its own journals and conferences.

Eli C. Minkoff (1984) Evolutionary Biology.

I checked the oldest textbook I have, Minkoff (1984) Evolutionary biology. There is no 'junk DNA' and no 'non-coding' DNA in the index, despite the 'consensus view'. Yes, there are tRNA and rRNA (p.16), but these RNAs are not labeled as 'non-coding RNA' or 'non-coding DNA'. They are in the business of producing proteins. They are the very embodiment of the genetic code. Therefore, it would be somewhat counterintuitive to call them 'non-coding'. Yes, there is 'genetic drift', neutral mutations, 'neutralism versus selectionism', 'genetic load', 'mutational load' in his book, but Minkoff did not connect these concepts with 'non-coding DNA'. The concept is absent anyway. 'Centromere' is mentioned once casually (p.19), I could not find 'telomere'. Anyway, 'centromere' is not labeled as 'non-coding DNA'. Why is non-coding DNA absent from the book? 

I think I found part of the answer in the following passage:

"One of the fundamental tenets of modern synthetic theory of evolution is that natural selection operates on the phenotype rather than the genotype. No genetic change can be influenced by natural selection unless it first produces some phenotypic change. It is largely for this reason that modern evolutionary biologists must be aware of the manner in which phenotypes are controlled." (p.114)
This was an eye-opener for me. The phenotype is the most important, the genotype is important only in so far as it has an effect on the phenotype. Who cares about DNA that does nothing? Evolutionary biology has the task of explaining the organism.

A second foundational paradigm I found here:

"Proteins are among the most important of all biological molecules. (...) The great intricacies of living systems are all the result of enzyme-controlled activities (...) enzymes are therefore the chemical basis of life". (p.17).

Taken together these two principles explain the mindset of evolutionary biologists in those days. If they did know about non-coding DNA, it simply had no relevance to the goals of their daily research. On page 37 there is a table labeled as 'The Genetic Code for Translation of mRNA Codons into Amino Acid sequences'. The famous table. It makes sense in this context, because the Genetic Code is the link between DNA and proteins. The reason for the existence of the Genetic Code is to produce proteins. The Watson-Crick structure of DNA plus the chemical structure of the four bases is present in the book. Minkoff knows the necessary biochemistry. Unfortunate exception: introns and splicing are absent! Introns were discovered in 1977.

There is one isolated and thus mysterious remark which vaguely suggests something like 'non-coding DNA':

"Not all of the genotype is transcribed and translated into a portion of the epigenotype, nor are all the transcribable genes ever transcribed at the same time." (p.114) ['epigenotype' = "the polypeptides that result from the immediate transcription and translation of the genotype"]

That's all. Probably, Minkoff was vaguely aware of non-coding DNA. But why include it in his textbook? He did not elaborate the concept because in his opinion it was simply not relevant or nothing was known about it. DNA which is not transcribed and translated has nothing to contribute to the phenotype of the organism, consequently nothing to biology and evolution. It doesn't fit in the evolutionary biology paradigm of that time [1].

So, that is the 'true history' of non-coding DNA based upon Minkoff (1984) and that was taught to biology students at that time. He did not say that non-coding equals junk, but by omitting non-coding DNA, he implied that non-coding DNA is unimportant. If one makes statements about the history of junk DNA, one has to investigate the evolutionary biology textbooks, especially older ones. Minkoff was an eye-opener for me. I checked more evolutionary biology textbooks: 8 out of 17 do not have 'non-coding DNA' in the index.

"As Sandwalk readers know, there was never a time when knowledgeable scientists said that all non-coding DNA was junk. They always knew that there was functional DNA outside of coding regions." (Sandwalk)

I think one has to take into account that there are different scientific disciplines with their own paradigms, leaders, journals, conferences, and networks. 

Thanks for reading. Have a nice day!



  1. On page 18 he writes: "there are other sequences in each DNA molecule that do not appear to determine the amino acid sequence of any polypeptide. Some of these may function as "spacers", and others are believed to function as regulatory genes, which control the transcription of other genes." (page 18, chapter 2: Basic Principles of Genetics). Here he describes non-coding regulatory genes! He doesn't realize that these non-protein-coding DNA sequences must have indirect effects on the phenotype, and consequently are important for evolutionary biology! In the subsequent development of evolutionary biology, the evolutionary importance of regulatory genes became evident.  Added: 21 Aug 2023


Previous posts

  1. Junk DNA in the Evolution textbooks (2) from 1996 to 2023 26 Jul 23
  2. Junk DNA in the evolution textbooks. Bergstrom and Dugatkin 2023 12 Jul 23
  3. Periannan Senapathy (1994) claimed that the human genome consists of more than 90% junk DNA. 4 Jul 2023
  4. Scientists say: 90% of your genome is junk. Have a nice day! Biochemist Laurence Moran defends junk DNA theory 26 Jun 23

26 July 2023

Junk DNA in the Evolution textbooks (2) from 1996 to 2023


Futuyma, Kirkpatrick (2023) Evolution, fifth edition.

In the previous blog I discussed Bergstrom and Dugatkin Evolution, third and first edition. Today I continue my investigation of 'junk DNA' in the Evolution textbooks with a textbook by Douglas Futuyma and Mark Kirkpatrick (2023) Evolution, also published this year.

Although "junk DNA" does not occur in the index, on page 86 (chapter 'Mutation and variation'), the authors state: "In humans for example, 98% of the DNA does not code for any gene product." Note: they do not say 'protein product', but 'gene product'. However, the caption of figure 10.13 states: "Less than 2% is devoted to protein-coding sequences." (p.274). So, if that is what they mean by 'gene product' the 98% is OK. They explain these matters in an excellent and up-to-date chapter about genes and genomes (chapter 10). On page 281 the authors ask:

"Does that mean the 98% of our genome that is noncoding is actually junk? We are still far from having a clear answer to this fundamental question. (...) some of the resulting "junk" now plays key roles in regulating gene expression, and the cell's metabolism has coevolved with the total quantity of DNA in the nucleus. (...) Like an addict and his drug, eukaryotes may not be able to break their dependence on a bloated genome. But there is good news in this story. When ancient eukaryotes acquired large amounts of noncoding DNA, it opened new options for the evolution of gene regulation. That, in turn, may have enabled the origin of complex life-forms, including ourselves." (p.281)

In the end-of-chapter section called 'What We Don't Know' (by the way, a nice feature!):

"Also debated is the fraction of the eukaryotic genome that has a function. One large-scale study estimated that 80% of the human genome has a function." [the reference is the ENCODE publication in Nature, 2012]. "That estimate, however, has been criticized as far too high, and many evolutionary biologists would agree that perhaps only about 10% of our genome has a definite function" (p.281).

Figure 10.13. (p.274). Adapted from Gregory (2005)

This seems a fair and correct description of the status of the scientific evidence. What I miss in figure 10.13 is the difference between functional and non-functional DNA (perhaps an unreasonable demand!). The functional sequences outside protein coding sequences will be hidden in the 98%. 'Functional RNA' is not present in the book. One can find a comprehensive treatment of functional and nonfunctional RNA in Moran (2023) (see my blog 26 June).

"Alternative splicing is a major mechanism used by eukaryotes to increase organismal complexity " (p.274). 

Is it really 'a major mechanism'? This is a controversial statement because there is no quantitative estimate of its importance. Futuyma and Kirkpatrick do not mention non-coding tRNA (transfer RNA), but ribosomal RNA (rRNA) is present (p.269). However, neither is introduced as a good example of non-coding DNA (they do not code for proteins but are functional).

Note [3] about Futuyma, 1st edition, 1979


Nicholas Barton et al (2007) Evolution 






Nicholas Barton et al (2007) Evolution has a very good discussion of junk DNA, selfish DNA, C-value, non-coding DNA. Fortunately, they also pay attention to the disadvantages of a big genome (p.597). For example, in insects metamorphosis requires rapid cell division and is harder when massive amounts of DNA must be replicated. It shows that natural selection can downsize large genomes. Which is good to know! It brings the burden of large genomes back into focus. This is important: transposons can by accident acquire a new function. They do not give an estimate of how often this happens. "Introns are frequently considered to be junk DNA. However, comparative sequence analysis has revealed that the sequence of some introns is highly conserved, suggesting that functional constraints have played a role in evolution." (p.220). "Overall, about 18% of nucleotides are conserved in introns and intergenic regions, compared with 72% within exons" (p.546). "Such studies suggest that in multicellular eukaryotes, at least as much non-coding as coding sequence is maintained by selection." (p.547). "If all possible alternative splicing possibilities are taken into account, species such as humans can make millions of different proteins, even though they each have only 25,000 protein-coding genes. Alternative splicing provides a significant source of novelty for diversification" (p.221). This is regarded as controversial by some. Transposable elements have sometimes been co-opted to aid their host (p.598). Sometimes pseudogenes acquire new functions. These views contrast with those of Laurence Moran (2023). Further research is necessary.



  Freeman, Herron (2007) Evolutionary Analysis





The most recent edition of Freeman and Herron is the fifth edition (2013). I don't have that edition, I used Evolutionary Analysis 4th edition (2007). There is no 'junk DNA' and no 'non-coding DNA' in the index. Unexpectedly and paradoxically, transposons –a prime example of selfish genetic elements– are discussed in chapter 15 'Phylogenomics and the molecular basis of ADAPTATION'. 

But first, read this stunning remark (remember, this book was published before ENCODE 2012):

"In humans only about 1.2% of the genome codes for proteins." (p.576)

They do not comment on this remarkable statement. It is an isolated statement from an unknown source. However, they do state that the "extra" DNA responsible for the C-value paradox consists of transposable elements: "In the human genome, for example, over 44% of the DNA present is derived from transposable elements" (p.576). What about the remaining 54%? They do not say. Unknown?

Fortunately, the authors discuss the burden of these genomic parasites. It costs the cell time, energy and resources to replicate a genome with a lot of transposons (p.577). Funny remark: transposons are present in the genome in "often appallingly large numbers"! Such an emotional remark is really funny for a textbook! Good to know: transposons are not 100% non-coding DNA, because they encode the enzyme transposase. They have further important information: defense mechanisms against transposons (!), and: "work by John Moran (!) (1999) suggested that transposition events in eukaryotes may occasionally result in mutations that confer a fitness benefit." (p.581-583). They conclude:

"Even though most transposable elements function as genomic parasites and most transposition events result in deleterious mutations, it is increasingly clear that at least some transposition events result in important new genes or other changes that have a positive impact on the fitness of organisms." (p.584).

My conclusion: there is no 'junk DNA' and no 'non-coding DNA' in the index, and more puzzling, there is also no discussion of introns and splicing. That is a serious omission for an evolution textbook. The origin of introns is a longstanding evolutionary mystery. However, they have interesting things to say about transposable elements. According to Laurence Moran 37% of our genome consists of introns and according to Futuyma, Kirkpatrick (2023): 26% (see figure above). Obviously, without introns Freeman and Herron don't have a complete overview of non-coding DNA and can't calculate the sum total of functionless DNA in our genome. Yet, they know that 1.2% of the genome codes for proteins! I guess that Freeman and Herron are optimistic about the possibility of finding more useful elements in the uncharted parts of the human genome and therefore avoid the concept 'junk DNA'. Reasonable.


Strickberger's Evolution Fourth edition 2008


Strickberger's Evolution is a famous evolution textbook. The first edition was published in 1990. The fifth edition appeared in 2013. The most recent edition I have is the fourth edition (2008) authored by Brian Hall and Benedikt Hallgrimsson (I don't know whether Strickberger participated in this edition). 'Junk DNA' [1] and 'non-coding DNA' are not in the index. However, 'junk DNA', 'selfish DNA' and 'C-value paradox' are discussed in the text.

"According to some molecular biologists, many transposable elements and other forms of repeated sequences contribute little, if any, function to their host cells. Because the DNA replication process cannot discriminate between functional and nonfunctional sequences, it replicates any introduced sequence. Transposon DNA and repeated sequences may therefore perpetuate parasitically as either "junk" or "selfish" DNA." (p.221).

Important information is present in Box 12.1 'Quantitative DNA measurements'. In connection with the ENCODE project the following paragraph contains intriguing thoughts which I quote in full:

"According to Bird [1995], eukaryotes were able to circumvent such "noise" by a nuclear membrane that separates transcription from protein translation, allowing only translatable messenger RNA sequences to filter into the cytoplasm, and by tightly folding the DNA of functionally unnecessary genes into nontranscribable conformations, using nucleosomes and their histones. To these transcription-repressing mechanisms, Bird claims that vertebrates added DNA cytosine methylation, formerly used mostly to suppress genomic parasites such as transposons." (p.260). (my bold)

Especially the concept transcription-repressing is intriguing because according to the ENCODE project and Laurence Moran there is pervasive transcription in the cell and most of it is noise! This would disprove the success of transcription-repressing mechanisms? Apparently, the mechanism fails spectacularly. 

My own thought is that the cell can perhaps tolerate large-scale transcription because transcription is restricted to the nucleus: those RNA transcripts are not exported to the cytoplasm and consequently are not translated. This assumes that protein synthesis is more costly than transcription. I admit that it is still a burden, but the burden has been halved.

In contrast to Freeman, Herron (2007), in this book 'introns', the "Introns early - Introns late-hypothesis" and alternative splicing are present. Interestingly, they describe introns as mobile DNA sequences that can splice themselves out, acting like transposons.

Brian Hall and Benedikt Hallgrimsson do not favor the concept 'junk DNA'. It is obvious from this remark: "various biologists have been tempted to consider some or many such sequences as forms of "selfish DNA"." (p.262).



Stephen Stearns, Rolf Hoekstra (2005) Evolution, an introduction, second edition, paperback.

Relevant topics are 'jumping genes', 'transposons', 'introns', 'B-chromosomes'. Not found in other textbooks: B-chromosomes are not transcribed and do not contain information vital to the organism, they are genomic parasites (p.362). This fits the definition of junk DNA, although Stearns and Hoekstra do not use the concept. They make an interesting remark about transposons: several mechanisms have evolved to suppress the deleterious effects of active transposons (p.363). I would like to know more about them! "In humans they [transposons] may account for 45% of the genome". "Transposons illustrate genomic conflict between selection favoring mutants that increase the replication rate of transposons and selection favoring the suppression of transposons through stronger replication control." (p.363). They have a chapter about Genomic Conflict. There is no new edition of this textbook.

Mark Ridley (2004) Evolution, 3rd Edition

Non-coding DNA is listed in the index under 'DNA, non-coding' [2]. One relevant section is 2.4: 'Large amounts of non-coding DNA exist in some species'. The human genome consists of 5%, maybe up to 10%, genes. "The function of non-coding DNA is uncertain. Some biologists argue that it has no function and refer to it as "junk DNA". Others argue that it has structural or regulatory functions." "Most non-coding DNA is repetitive." (p.27). Alternative splicing is mentioned (gene slo), but there is no diagram of exon-intron structure of a gene (!). The existence of genes coding for RNA (rRNA, tRNA) is mentioned in a footnote ("some genes code for RNA" ! p.25). Further information in Chapter 19 'Evolutionary Genomics' is about the evolutionary history of transposable elements ("About 45% of the human genome is derived from transposable elements", p.567). I am a little disappointed, I expected more of Ridley. Please note that the draft Human Genome sequence was published in 2001. No definitive conclusions were possible at that time. There is no new edition.

Finally (for now...), John Archibald (2018) 'Genomics: A Very Short Introduction', Oxford University Press, 135 pages, has a succinct description of the ENCODE project in the paragraph "Jumping genes and 'junk' DNA" (p.50-53): "The ENCODE project's broadest and most controversial claim is that 80 per cent or more of the human genome has a biochemical function. "

Peter Skelton 'EVOLUTION. A biological and palaeontological approach' (1993, 1994, 1996)


[ added 29 July ]



I ignored this book because I did not expect it to discuss junk DNA. Surprise. In chapter 3: Heredity and Variation, the C-value paradox is explained and illustrated with the well-known genome size diagram of various groups of organisms (Fig. 3.5). It immediately stands out that all salamanders and lungfish have bigger genomes than all mammals, birds and reptiles. The C-value paradox is explained by differences in the amount of various repetitive sequences, and polyploidy. Bats and birds have a high metabolic rate and their genomes are smaller than those of other mammals (p.84). In 1977 introns were discovered. There is a diagram (Fig. 3.9) of gene structure (intron-exon structure). This figure shows introns with smaller sizes than exons. Unfortunately, students get the false impression that these are the right proportions. However, introns are generally larger than exons. Alternative splicing is described (p.90). Transposons are explained (p.92). Conclusion: although this book appeared before the publication of the human genome in 2001, the ingredients of 'junk DNA' are present. So, it doesn't matter that the term 'junk DNA' isn't used.


Conclusion: The term 'junk DNA' is absent from the index and from the text of 4 of the 9 textbooks I investigated. However, if 'junk DNA' is not in the index of a textbook, it always pays to search for transposons, introns, jumping genes, selfish DNA, pseudogenes or C-value paradox. This review of evolution textbooks is not exhaustive (I could not check all editions of all textbooks). Those listed here are the most interesting and sometimes give additional useful insights and different points of view. Some are pre-2012, some post-2012, but all except Skelton are post-2001. In general authors know that less than 2% of our DNA codes for proteins, but are not sure about the rest. I agree. Nobody can claim to know exactly how much of our genome is useless junk. Even assuming that 90% of our genome is junk, there is no solid answer to the question why there is so much junk in our genome, and why it hasn't been eliminated.


Not discussed here is John Parrington (2017) 'The Deeper Genome. Why there is more to the human genome than meets the eye'  (OUP paperback). This is a must read. He gives very interesting examples of beneficial non-coding DNA derived from transposons, and has a point of view other than that of Moran (2023). I hope to blog about it in the future.

Thank you for reading!


  1. Later I found 'junk DNA' listed under 'Deoxyribonucleic acid' - "junk DNA" in the index! [30 Jul 23]
  2. Non-coding DNA is listed in the index under 'DNA, non-coding' [30 Jul 23]
  3. Futuyma (1979) Evolutionary Biology (1st edition). page 439: "Nonetheless, there is an enormous amount of redundancy in the genome, and its significance is obscure. In many animals as much as 60% of the genome seems to consist of short (less than 300 nucleotide pairs) repeated sequences, some present in thousands of, even a million, copies". That's all. (personal communication Gerdien de Jong) [1 Aug 23]