26 June 2023

Scientists say: 90% of your genome is junk. Have a nice day! Biochemist Laurence Moran defends junk DNA theory.

Larry Moran (2023)
'What's in Your Genome?
90% of Your Genome Is Junk'

"The scientist says: science has explained many things about the universe. Your life has no meaning. Have a nice day." [1]

Biochemist Larry Moran brings more bad news: 90% of your genome has no meaning! Even worse: the remaining 10% is not a finely tuned Swiss watch but prone to mistakes and errors! Have a nice day! 

This is really not good for your self-esteem: The Deflated Ego Problem, as Moran calls it. He himself doesn't seem to be worried at all. He, like many molecular biochemists and geneticists, has had many years to get used to the idea and to try to understand it. They are comfortable with the idea that large genomes are full of junk DNA. Moran eliminated God and natural selection as explanations for 90% of the genome. Read the book and decide for yourself if he is right. This is not the usual popular science book. Moran is not a science journalist. The book has a high density of facts and arguments and no anecdotes. It is a book-length, systematic defense of the junk DNA view. Whether you agree with Moran or not, after reading the book your view of the human genome (and some other things) will never be the same again.

In 2012 a large international group of scientists, called the ENCODE consortium, claimed that 80% of our genome has some biochemical function. Junk is still there, but they reduced it to 20%. The top science journals Nature and Science published their results and opinions and seemed to agree. Moran strongly disagrees. He argues that long before scientists were able to sequence the first whole genome (that is, before 2000) there was evidence that much of our genome simply must be junk. I was not aware of some of those facts (or have I forgotten them?). Amazingly, the ENCODE scientists didn't know those facts either and accused the 'junk DNA scientists' of having based their claims on ignorance, because you cannot assume that all that DNA is junk without a thorough investigation. But according to Moran, the opposite is true: the anti-junk scientists are ignorant of a lot of evidence. Clearly it is necessary to know the history of the field. Moran explains that history. Read it.

 

Natural selection

Evolutionary biology was forced to adapt to a constant stream of new discoveries in biochemistry and genomics. The author of What's in Your Genome? rejects the Darwinian view that every property of an organism must have a function and must have been produced by natural selection (a view called adaptationism or pan-adaptationism). Remarkably, he follows paleontologist S. J. Gould in his criticism of pan-adaptationism. Moran has the rather extreme view that "most of evolution has nothing to do with natural selection". I hope he meant to say: "most of evolution at the molecular level has nothing to do with natural selection." But I am not sure. In defense of Darwin, I must say that Darwin was concerned with organisms, and the goal of his theory was to explain adaptations of organisms. In Darwin's time next to nothing was known about genetics, let alone DNA. So, Darwinism explains features of organisms (beaks, claws, eyes, wings, brains) as adaptations created by natural selection. Potential readers of What's in Your Genome? must be aware that the book is not about the precious 10% of our DNA that makes us human. It is not surprising that a biochemist focuses on non-adaptive features of organisms, especially a biochemist who defends the view that 90% of your DNA is non-functional. There is so much noise in biochemical reactions, they are very messy, and there is so much needless complexity at the genomic and cellular level.

Again unsurprisingly, Moran has sympathy for the idea that mutation is a prime mover in evolution, an idea called 'mutationism' and argued for by Masatoshi Nei [3]. It is reasonable to claim that the 90% junk is explained by mutation and random drift, and further that natural selection is powerless to remove the junk. However, it is unreasonable to extrapolate that to a general rule in evolution and call it mutationism. Let me illustrate that with a thought experiment of Masatoshi Nei, approvingly quoted by Moran:

"Nei emphasizes the importance of mutation by asking us to consider a parallel universe - one with natural selection but no mutation and one with mutations but no natural selection."

In the first universe there will be no evolution; in the second there will still be evolution, since new genes and new variants will appear and their frequency in the populations can increase by random genetic drift. This is Nei's explanation. But it is nonsense. It is a false dilemma. It is just as silly as asking which of the legs of a table is the most important. Evolution needs both mutation and natural selection. Secondly, Nei simply assumes organisms with functional genes at the start of his thought experiment. The thought experiment will not work if there are no organisms in the first place. How did those organisms originate? By random genetic drift? Genomes will mutate to extinction if the fitter genomes are not amplified relative to the less fit genomes. Any existing life form is proof that natural selection has created functional genes. In both hypothetical universes there would be no living organisms to start with. Conclusion: mutationism cannot possibly be a general theory of evolution. Despite his endorsement of mutationism, Moran knows that natural selection is indispensable as an indicator of function of a DNA or RNA sequence. The fact that specific sequences are conserved between species shows they are functional and must have been selected. Moran knows that.
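To make the drift part of the thought experiment concrete, here is a minimal sketch of a Wright-Fisher simulation in Python. It is my own illustration, not something taken from Nei or Moran. The function name, the population of 100 gene copies and the selection coefficient are all assumptions chosen for the example. Note that the sketch, just like the thought experiment, starts from an already existing population.

```python
import random

def wright_fisher(copies, start_count, s=0.0, seed=0):
    """Follow one variant until it is lost (frequency 0) or fixed (frequency 1).

    copies      -- number of gene copies in the population (hypothetical value)
    start_count -- how many copies carry the variant at the start
    s           -- selection coefficient; s = 0.0 means pure random drift
    No new mutations arise during the run.
    Returns (final_frequency, generations).
    """
    rng = random.Random(seed)
    count = start_count
    generations = 0
    while 0 < count < copies:
        p = count / copies
        # Selection shifts the expected frequency; with s = 0 it stays at p.
        p_selected = p * (1 + s) / (1 + s * p)
        # Drift: every copy in the next generation is a random draw.
        count = sum(rng.random() < p_selected for _ in range(copies))
        generations += 1
    return count / copies, generations

# Pure drift: a variant starting in 1 of 100 copies, 1000 independent runs.
fixed = sum(wright_fisher(100, 1, s=0.0, seed=i)[0] == 1.0 for i in range(1000))
print("fixed by drift alone in", fixed, "of 1000 runs")
```

With s = 0.0 a new variant is usually lost, but now and then it drifts to fixation without any help from selection. That is the only point the sketch makes; it says nothing about where the functional genes came from in the first place.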

 

Is junk DNA a burden?

Genome size distribution (Figure 2.1, Chapter 2).
Note: birds (green) have smaller genomes than mammals (red)

A valid question is: why didn't natural selection remove most of the junk from our genomes? The amount of junk seems disproportionate. The standard explanation (also adopted by Moran) is: 'small population size'. Species with large population sizes (bacteria) have less junk. Animals with smaller population sizes have lots of junk. Is that a universal law? Is it the correct explanation? What about birds? "Avian genomes are small and streamlined compared with those of other amniotes by virtue of having fewer repetitive elements and less non-coding DNA" [4]. Is that because they have larger population sizes, or could it be that the energetic burden of large amounts of junk is too much for birds? Moran writes: "The pufferfish genome is only one-eighth the size of our genome, and the lungfish genome is 40 times larger than the human genome". Can this be explained by different population sizes? Are effective population sizes known? Are there 8 times as many pufferfishes as humans? Prokaryotic genomes range from about 500 kb to about 12 Mb [8]: does that correlate with population sizes?
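The population size argument is usually made quantitative with Kimura's diffusion approximation for the fixation probability of a new mutation. The sketch below is my own illustration in Python, not taken from Moran's book. It assumes a diploid population in which the effective size equals the census size n, and the selection coefficient of -0.000001 for a slightly harmful piece of junk is a made-up number.

```python
import math

def fixation_probability(s, n):
    """Kimura's approximation for a new mutation in a diploid population.

    s -- selection coefficient (negative = slightly deleterious), assumed value
    n -- population size, assumed equal to the effective population size
    The mutation starts as a single copy, so its initial frequency is 1/(2n).
    """
    if s == 0:
        return 1.0 / (2 * n)      # neutral case: pure drift
    exponent = -4.0 * n * s
    if exponent > 700:            # math.expm1 would overflow; probability is ~0
        return 0.0
    # (1 - exp(-2s)) / (1 - exp(-4ns)), written with expm1 for accuracy
    return math.expm1(-2.0 * s) / math.expm1(exponent)

s = -1e-6  # hypothetical cost of carrying one extra piece of junk
for n in (10_000, 1_000_000, 100_000_000):
    relative = fixation_probability(s, n) / fixation_probability(0, n)
    print(n, relative)
```

The printed number is the chance of fixation relative to a strictly neutral mutation. Under these assumptions it stays close to 1 for a population of ten thousand and collapses to practically zero for a population of a hundred million. That is the logic behind the population size explanation; whether real effective population sizes and real costs of junk fit this picture is exactly the question asked above.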

The existence of huge amounts of useless DNA conflicts with the evolutionary principle "use it or lose it" (cavefish lost their eyes). Useless DNA is copied at every cell division and maintained by a complex repair system. The fact that "most of our genome is transcribed" also adds to the burden. These spurious transcripts must be removed by RNA quality control, surveillance and degradation machinery. Again, this adds to the burden. Introns have to be degraded. "Incorrectly spliced RNAs are rapidly degraded by enzymes that clean up mistakes": this adds to the burden as well. Furthermore, "Some pseudogenes will be transcribed and still produce a protein. This does not mean they have a biological function." Again this adds to the burden. I wish Moran had discussed these questions. He accepts the population size argument too easily and doesn't take the burden problem seriously. To me it seems appropriate to discuss the burden problem in a book-length defense of junk DNA [12].


Null hypothesis or null dogma?

It's clear this is a polemical book. It is a very forceful criticism of ENCODE and of everyone who uncritically accepts and spreads their views, including Nature and Science. I agree that this criticism is necessary. However, there is a downside. Moran writes that the ENCODE research goal of documenting all transcripts in the human genome was a waste of money. Only a relatively small group of transcripts have a proven biological function ("only 1000 lncRNAs out of 60,000 were conserved in mammals"; "the number with a proven function is less than 500 in humans"; "The correct null hypothesis is that these long noncoding RNAs are examples of noisy transcription", or junk RNA). Furthermore, Moran also thinks it is a waste of time and money to identify the functions of the thousands of transcripts that have been found, because he knows it's all junk. I disagree. The null hypothesis is a hypothesis, not a fact [9]. One cannot assume it is true. That would be the 'null dogma'. He writes: "We don't know how many lncRNA genes there are in the human genome." Exactly! We simply must know all the functional elements in the human genome. We can't afford to be ignorant about that. Otherwise we will never understand disease. We will never know the difference between humans and chimps if we don't know all the functional DNA in both genomes [7]. In the year 2000 the human genome sequence was published. That was also a huge project costing millions of dollars [10]. But we should have an inventory not only of protein coding genes, but of all non-coding genes, all regulatory sequences and simply all functional DNA in the human genome. It would be irresponsible and unacceptable to stop that kind of research simply because ENCODE claims too much. We should find a way to do it efficiently.

A book like this comes with a bias. For example, the concept of "functional pseudogenes" is absent! "There are only 200 human pseudogenes that produce a protein". Only! Are they not important or interesting? John Avise thinks functional pseudogenes are important enough to be counted as a "conceptual breakthrough" [5]. Whereas ENCODE is too optimistic about the importance of 90% of our genome, Moran is perhaps a little bit too pessimistic. [11]

 

Coping with a sloppy genome?

The Deflated Ego Problem is not solved by claiming it isn't a problem. We need more than that. Let me try: there could be no junk DNA if there were not enough functional DNA. It is the non-junk DNA that keeps us alive. The mere fact that there is 9 times more junk than functional DNA does not prove that the 90% is the most important part. It may be that natural selection has no power over the 90%, but it rules over the 10%. And the 10% is the most important part of our genome. Thanks to the 10% we have rather unique brains. We are the only species that invented science and writes books and blogs. Thanks to the 10%. Apparently, the junk burden is compatible with life. For now. We have learned to live with introns. We are not alone: salamanders and lungfishes have much more junk DNA in their genomes. Junk DNA is not cancer. Junk DNA doesn't make us sick, it doesn't hurt. Even Einstein must have had 90% junk DNA in his genome! Nobody ever died of junk DNA, etc. etc. etc.

In my own scientific education I learned about polyploidy in plants (huge redundancy!), constitutive heterochromatin in chromosomes, introns (unexpected discovery!) and repetitive DNA (Alu). But at the time it wasn't called junk DNA. In 2011 I wrote against the ID-view of introns. Although I knew these facts, I was not aware of the magnitude. I knew bits and pieces, but Moran made the sum total and put a number on the amount of junk. In that sense Moran created a 'conceptual breakthrough' (in the words of John Avise). In the past, authors who published about junk DNA never made the 90% claim explicitly. Moran places it prominently in the title of his book. In my next blog I will discuss junk DNA in Evolution textbooks. I discovered that I have been a victim of the ENCODE propaganda [6] and it seems this happened to John Avise too. Moran succeeded, after a prolonged struggle against a tribe of Wikipedia geniuses, in creating a Junk DNA Wikipedia page. His book is not (yet) listed on that page. The Dutch Wikipedia page Junk DNA says that 'junk' is an inappropriate word.


Note: I have used the KOBO e-book version of the book which has full-text search capability (invaluable!).


Appendix 1

After reading the book I did a 'junk DNA' search in Nature and the results were surprising. The search turned up a seemingly endless number of articles, published years before the ENCODE publications in 2012, attacking the junk DNA concept:

All these articles attack or even ridicule the 'junk DNA' concept, and all were published before the ENCODE publications of 2012. That Nature published the ENCODE results in 2012 did not quite come out of the blue. The problem is that Nature did not give its readers a balanced presentation of different views in the scientific community. It is a very one-sided point of view. Why? Finding functional DNA in junk DNA is newsworthy, and finding no function is a negative result which will not get published. It seems a natural imbalance. The people who discovered treasures in the junk are the heroes and revolutionaries and get published and interviewed. Just as Moran describes in his book. Notwithstanding severe criticism, Nature continued to propagate the anti-junk worldview after 2012. That is even more amazing!


Appendix 2


After reading Moran's book, I found it helpful to compare it with John Avise's Conceptual Breakthroughs in Evolutionary Genetics [2] because Avise emphasizes different scientific discoveries and interprets them differently. Moran has a rather extreme point of view and is dismissive of the results of other scientists who argue there is less junk in our genome and more functional DNA. Below are the discoveries that Avise counts as conceptual breakthroughs. I selected the relevant ones.

Before the discovery of jumping genes scientists viewed the genome as a stable entity. Mutation did change the sequence of a gene, but not the order of the genes on a chromosome or the total number of genes. Jumping genes changed that view. A stable genome became a dynamic genome. Next is the discovery of repetitive elements. They originate by copy and paste, but it is unclear what their function is, if any. Then came the subversive idea that many small mutations in DNA are ignored by natural selection because they do not affect fitness (the neutral theory). Not everything is useful.

At that time it was thought that changes in protein coding genes are the most important drivers of evolution. But evidence began to accumulate that regulation of the protein coding genes was important in the evolution of eukaryotic species. The idea of gene regulation by RNA was a paradigm shift in molecular genetics. "Nearly all evolutionary geneticists now fully accept the idea that changes in the regulatory apparatus of eukaryotic genomes are central to much of adaptive evolution." Next revolution: human genomic uniqueness. Before this revolution, humans were considered extraordinarily distinct from all other creatures (Inflated Ego!). So, our genome must be unique. But then came the discovery that genes and proteins of humans and chimpanzees are remarkably similar. A shock for many people. The standard paradigm at that time was that most genes within any genome collaborate harmoniously for the benefit of the organism. But Selfish genes multiply within our genome much like viruses, without benefit to the organism. We knew about selfish genes, but what was new after the complete sequence of the human genome in 2000 was the sheer amount of selfish DNA.

The view of a protein coding gene before the next revolution was an uninterrupted sequence of the 4 bases from a START to a STOP codon. The discovery of Split genes came as a shock. I remember this very well. It looked like a crazy idea. Protein coding genes were interrupted by introns: pieces of DNA that are spliced out of the RNA transcript before producing a protein. Completely unnecessary complexity. Introns are junk DNA. They are not the exception, they are the rule. Even worse: introns are larger than the protein coding sequences themselves! The total amount of useless introns in our genome is larger than that of the useful genes. It takes some time to get used to it. Now, they are in every textbook. The next unanticipated discovery was that RNA could catalyze biochemical reactions: Ribozymes. A task that enzymes (proteins) had a monopoly on. The point is that RNA wasn't just an intermediate between DNA and proteins anymore; it can be a functional end product.

Additionally it was discovered that non-coding RNAs play an important role in gene regulation: Regulatory RNAs. According to Avise the discovery of regulatory RNAs is a conceptual breakthrough. They come in two types: short micro-RNAs and long RNAs. According to Moran there are not many regulatory RNAs, but according to the ENCODE people there are thousands. Remember their claim: 80% of our genome is functional, and long non-coding RNAs (lncRNAs) are a major player (?). Obviously, one cannot estimate the percentage of junk in a genome if one doesn't know the complete sequence. The Whole-Genome sequence, published in 2000, made it possible to calculate the percentage of junk in the complete genome. One last breakthrough I want to mention is Functional pseudogenes (a paradoxical concept!). Until that discovery pseudogenes were considered damaged copies of functional protein coding genes. Just junk. The conceptual revolution was that some pseudogenes appeared to regulate gene expression. 'Pseudogenes are not pseudo any more.' At least not all.


Appendix 3

There is one scientist who back in 1994 published a book completely based upon the hypothesis that much of our genome is random DNA. He used the concept 'junk DNA'. His name is Periannan Senapathy and the full title of his book is: "Independent Birth of Organisms. A New Theory That Distinct Organisms Arose Independently From The Primordial Pond Showing That Evolutionary Theories Are Fundamentally Incorrect". In his own words:

"the genome would be mostly random DNA sequence with only small “islands” of genes scattered in an ocean of meaningless DNA. Such an architecture actually exists in the genomes of all living multicellular organisms, with the intergenic sequences termed “junk DNA.” "

Beautiful words! (with hindsight!). I reviewed the book here on my WDW website. I will blog about him soon with a focus on junk DNA:

Periannan Senapathy (1994) claimed that the human genome consists of more than 90% junk DNA 4 July 2023

 


Postscript 8 Nov 2023

Lesson from a 50% synthetic yeast genome:

"One such source [of instability] is large stretches of repetitive DNA that don’t code for anything, but that can recombine with each other through natural processes, causing major structural changes in the genome. The synthetic biologists want to have complete control of their engineered yeast, so the team combed through S. cerevisiae’s genome with computer programs to find highly repetitive regions — and then deleted them. These sequences are effectively “genome parasites”, Boeke says.". Nature 8 Nov 2023

So, they are talking about removing junk DNA because it is causing major structural changes in the genome. Those sequences must be highly non-adaptive or deleterious. If they are not neutral, why are they not removed from the genome by natural selection? Low population size can't be an explanation in yeast.

 

 

Notes

  1. Jonathan Marks (2003) "What it means to be 98% chimpanzee. Apes, people, and their genes". Listed on my WDW page here.
  2. John Avise (2014) Conceptual Breakthroughs in Evolutionary Genetics: A Brief History of Shifting Paradigms.
  3. Masatoshi Nei (2013) 'Mutation-driven Evolution', page 107. However, Nei himself writes: "Recent studies have shown that a substantial portion of noncoding DNA has some roles in the regulation" and he refers to the ENCODE Project Consortium 2012!
  4. Chris Organ et al (2007) Origin of avian genome size and structure in non-avian dinosaurs, Nature.
  5. John Avise (2014): "Beginning in the 1980s, examples gradually came to light in which some "pseudogenes" played active roles, such as in regulating gene expression", Chapter 68 'Functional Pseudogenes',  (p.143).
  6. In my review of Senapathy's book Independent Origin I quoted ENCODE results several times without knowing how controversial they were!
  7. Moran writes only about differences in junk DNA between humans and chimps: "almost all the differences between the human and chimp genomes are due to the fixation of neutral mutations by random genetic drift." (chapter 4)
  8. Genome size
  9. It is not a fact, but "90% of your genome is junk" is stated as a fact: it is in the title of his book and is repeated many times in the text. [12 Jul 2023]
  10. "But sequencing the full genome was widely considered to be pointless, partly because around 95 percent of it was thought to be junk DNA.".
    Michel Morange (2020) 'The Black Box of Biology. A History of the Molecular Revolution', p.341.
  11. For example, this publication is absent from his book: Introns: The Functional Benefits of Introns in Genomes, Genomics & Informatics, 2015. Ideally, one should discuss publications that argue against one's own theory. [23 Jul 2023]
  12. Later I found a remarkable passage that shows Moran accepts the idea that junk could be a burden: "This evidence suggests that there has been selection for removing excess junk DNA from these introns in order to speed up gene expression." (chapter 6). So, Moran concludes from this that introns are mostly junk, and at the same time accepts that intron-junk is costly for highly expressed genes. Apparently, in this case natural selection is powerful enough to get rid of junk despite the small effective population size. But if small population size does not prevent the removal of junk DNA from introns, then the population size argument doesn't seem to be valid anymore. The question arises: why hasn't most of the junk been removed from our genome? Is intron-junk more costly than intergenic junk? [11 Aug 2023]. See also: Selection for short introns in highly expressed genes.