Larry Moran (2023) 'What's in Your Genome? 90% of Your Genome Is Junk'
"The scientist says: science has explained many things about the universe.
Your life has no meaning. Have a nice day." [1]
Biochemist Larry Moran brings more bad news: 90% of your genome has no meaning! Even worse: the remaining 10% is not a finely tuned Swiss watch but prone to mistakes and errors! Have a nice day!
This is really not good for your self-esteem: The Deflated Ego Problem, as Moran calls it. He himself doesn't seem to be worried at all. He, like many molecular biochemists and geneticists, has had many years to get used to the idea and to try to understand it. They are comfortable with the idea that large genomes are full of junk DNA. Moran eliminates both God and natural selection as explanations for 90% of our genome. Read the book and decide for yourself if he is right. This book is not the usual popular science book. Moran is not a science journalist. The book has a high density of facts and arguments and no anecdotes.
It is a book-length, systematic defense of the junk DNA view. Whether or not you agree with Moran, after reading the book your view of the human genome (and some other things) will never be the same again.
In 2012 a large international group of scientists, called the ENCODE consortium, claimed that 80% of our genome has some biochemical function. Junk is still there, but they reduced it to 20%. Top science journals Nature and Science published their results and opinions and seemed to agree. Moran strongly disagrees. He argues that long before scientists were able to sequence the first whole genome (that is, before 2000) there was evidence that much of our genome simply must be junk. I was not aware of some of those facts (or had I forgotten them?). Amazingly, the ENCODE scientists didn't know those facts either, and they accused the 'junk DNA scientists' of having based their claims on ignorance, because you cannot assume that all that DNA is junk without a thorough investigation. But according to Moran, the opposite is true: the anti-junk scientists are ignorant of a lot of evidence. Clearly it is necessary to know the history of the field. Moran explains that history. Read it.
Natural selection
Evolutionary biology has been forced to adapt to a constant stream of new discoveries in biochemistry and genomics. The author of What's in Your Genome? rejects the Darwinian view that every property of an organism must have a function and must have arisen by natural selection (a view called adaptationism or pan-adaptationism). Remarkably, he follows paleontologist S. J. Gould in his criticism of pan-adaptationism. Moran has the rather extreme view that "most of evolution has nothing to do with natural selection". I hope he meant to say: "most of evolution at the molecular level has nothing to do with natural selection." But I am not sure. In defense of Darwin, I must say that Darwin was concerned with organisms, and the goal of his theory was to explain the adaptations of organisms. In Darwin's time next to nothing was known about genetics, let alone DNA. So Darwinism explains features of organisms (beaks, claws, eyes, wings, brains) as adaptations created by natural selection. Potential readers of What's in Your Genome? must be aware that the book is not about the precious 10% of our DNA that makes us human. It is not surprising that a biochemist focuses on non-adaptive features of organisms, especially a biochemist who argues that 90% of your DNA is non-functional. Biochemical reactions are noisy and messy, and there is a great deal of needless complexity at the genomic and cellular level.
Again unsurprisingly, Moran has sympathy for the idea that mutation is a prime mover in evolution, an idea called 'mutationism' and advocated by Masatoshi Nei [3]. It is reasonable to claim that the 90% junk is explained by mutation and random drift, and further that natural selection is powerless to remove the junk. However, it is unreasonable to extrapolate that to a general rule in evolution and call it mutationism. Let me illustrate that with a thought experiment of Masatoshi Nei, approvingly quoted by Moran:
"Nei emphasizes the importance of mutation by asking us to consider a parallel universe - one with natural selection but no mutation and one with mutations but no natural selection."
In the first universe there will be no evolution; in the second there will still be evolution, since new genes and new variants will appear and their frequency in populations can increase by random genetic drift. This is Nei's explanation. But it is nonsense. It is a false dilemma. It is just as silly as asking which of the legs of a table is the most important. Evolution needs both mutation and natural selection. Secondly, Nei simply assumes organisms with functional genes at the start of his thought experiment. The thought experiment will not work if there are no organisms in the first place. How did those organisms originate? By random genetic drift? Genomes will mutate to extinction if the fitter genomes are not amplified relative to the less fit genomes. Any existing life form is proof that natural selection has created functional genes. In both hypothetical universes there would be no living organisms to start with. Conclusion: mutationism cannot possibly be a general theory of evolution. Despite his endorsement of mutationism, Moran knows that natural selection is indispensable as an indication of the function of a DNA or RNA sequence: the fact that specific sequences are conserved between species shows that they are functional and must have been selected.
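The drift half of Nei's second universe (new variants but no natural selection) can be made concrete with a toy simulation. What follows is only a minimal sketch of a haploid Wright-Fisher model, and it is my own illustration, not something taken from Moran's or Nei's book. The function names, the population sizes and the seed are all hypothetical. Note that the sketch, like Nei, simply assumes that a viable population already exists.

```python
import random


def neutral_variant_fixes(pop_size, rng):
    """Follow one new neutral variant until it is lost or fixed.

    Haploid Wright-Fisher model: every individual in the next generation
    picks its parent at random from the current one. There is no selection.
    """
    copies = 1  # the variant starts as a single new mutation
    while 0 < copies < pop_size:
        freq = copies / pop_size
        # Binomial sampling of the next generation, done with plain coin flips.
        copies = sum(1 for _ in range(pop_size) if rng.random() < freq)
    return copies == pop_size


def fixation_fraction(pop_size, trials, seed=1):
    """Fraction of independent new neutral variants that drift to fixation."""
    rng = random.Random(seed)
    fixed = sum(neutral_variant_fixes(pop_size, rng) for _ in range(trials))
    return fixed / trials


if __name__ == "__main__":
    for n in (10, 50):
        # Standard theory predicts roughly 1 / n for a haploid population.
        print(n, fixation_fraction(n, trials=2000), "theory:", 1 / n)
```

In this toy model a useless variant does occasionally take over the whole population by chance alone, and the smaller the population, the more often that happens. It says nothing about where the functional genes came from in the first place, which is exactly my objection above.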
Is junk DNA a burden?
Genome size distribution (Figure 2.1, Chapter 2). Note: birds (green) have smaller genomes than mammals (red).
A valid question is: why didn't natural selection remove most of the junk from our genomes? The amount of junk seems disproportionate. The standard explanation (also adopted by Moran) is 'small population size'. Species with large population sizes (bacteria) have less junk. Animals with smaller population sizes have lots of junk. Is that a universal law? Is it the correct explanation? And what about birds? "Avian genomes are small and streamlined compared with those of other amniotes by virtue of having fewer repetitive elements and less non-coding DNA" [4]. Is that because they have larger population sizes, or could it be that the energetic burden of large amounts of junk is too much for birds? Moran writes: "The pufferfish genome is only one-eighth the size of our genome, and the lungfish genome is 40 times larger than the human genome". Can this be explained by different population sizes? Are effective population sizes known? Are there 8 times as many pufferfishes as humans? Prokaryotic genomes range from about 500 kb to about 12 Mb [8]: does that correlate with population sizes?
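For readers who want to see the logic behind the population size argument, here is a small sketch based on Kimura's classic diffusion approximation for the fixation probability of a new mutation. It is my own illustration and does not come from Moran's book. The selection coefficient used in the example is purely hypothetical, since nobody knows the real cost of a piece of junk DNA, and the formula assumes an idealized diploid population in which the actual size equals the effective size.

```python
import math


def fixation_probability(s, n_e):
    """Kimura's approximation for a new mutation in a diploid population.

    s   : selection coefficient (negative means slightly harmful)
    n_e : effective population size (assumed equal to the census size)
    The mutation starts as one copy, so its initial frequency is 1 / (2 * n_e).
    """
    if s == 0:
        return 1.0 / (2 * n_e)  # neutral case: pure drift
    exponent = -4.0 * n_e * s
    if exponent > 700:
        # e**exponent would overflow; the probability is effectively zero.
        return 0.0
    # expm1(x) = e**x - 1, accurate even when x is tiny.
    return math.expm1(-2.0 * s) / math.expm1(exponent)


def relative_to_neutral(s, n_e):
    """Fixation probability as a multiple of the neutral expectation."""
    return fixation_probability(s, n_e) * 2 * n_e


if __name__ == "__main__":
    hypothetical_cost = -1e-5  # assumed tiny burden of one junk insertion
    for n_e in (10_000, 1_000_000):
        print(n_e, relative_to_neutral(hypothetical_cost, n_e))
```

With the same tiny cost, the insertion behaves almost like a neutral mutation in the small population, while in the large population it has practically no chance of becoming fixed. That is the standard argument in a nutshell. Whether the real numbers for birds, pufferfish and lungfish fit it is the question I am asking here.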
The existence of huge amounts of useless DNA conflicts with the evolutionary principle "use it or lose it" (cavefish lost their eyes). Useless DNA is copied and repaired by a complex repair system at every cell division. That "most of our genome is transcribed" also adds to the burden. These spurious transcripts must be removed by RNA quality control, surveillance and degradation machinery. Again, this adds to the burden. Introns have to be degraded. "Incorrectly spliced RNAs are rapidly degraded by enzymes that clean up mistakes": this adds to the burden as well. Furthermore, "Some pseudogenes will be transcribed and still produce a protein. This does not mean they have a biological function." Again this adds to the burden. I wish Moran had discussed these questions. He accepts the population size argument too easily and doesn't take the burden problem seriously. To me it seems appropriate to discuss the burden problem in a book-length defense of junk DNA [12].
Null hypothesis or null dogma?
It's clear this is a polemical book. It is a very forceful criticism of ENCODE and of everyone who uncritically accepts and spreads their views, including Nature and Science. I agree that this criticism is necessary. However, there is a downside. Moran writes that the ENCODE research goal of documenting all transcripts in the human genome was a waste of money. Only a relatively small group of transcripts have a proven biological function ("only 1000 lncRNAs out of 60,000 were conserved in mammals"; "the number with a proven function is less than 500 in humans"; "The correct null hypothesis is that these long noncoding RNAs are examples of noisy transcription", or junk RNA). Furthermore, Moran also thinks it is a waste of time and money to identify the functions of the thousands of transcripts that have been found, because he knows it's all junk. I disagree. The null hypothesis is a hypothesis, not a fact [9]. One cannot assume it is true. That would be the 'null dogma'. He writes: "We don't know how many lncRNA genes there are in the human genome."! Exactly! We simply must know all the functional elements in the human genome. We can't afford to be ignorant about that. Otherwise we'll never understand disease, and we'll never know the difference between humans and chimps if we don't know all the functional DNA in both genomes [7]. In the year 2000 the human genome sequence was published. That was also a huge project costing millions of dollars [10]. But we should have an inventory not only of protein coding genes, but of all non-coding genes, all regulatory sequences and simply all functional DNA in the human genome. It would be irresponsible and unacceptable to stop that kind of research simply because ENCODE claims too much. We should find a way to do it efficiently.
A book like this comes with a bias. For example, the concept of "functional pseudogenes" is absent! "There are only 200 human pseudogenes that produce a protein". Only! Are they not important or interesting? John Avise thinks functional pseudogenes are important enough to be counted as a "conceptual breakthrough" [5]. Whereas ENCODE is too optimistic about the importance of 90% of our genome, Moran is perhaps a little bit too pessimistic. [11]
Coping with a sloppy genome?
The Deflated Ego Problem is not solved by claiming it isn't a problem. We need more than that. Let me try: there could be no junk DNA if there were not enough functional DNA. It is the non-junk DNA that keeps us alive. The mere fact that there is 9 times more junk than functional DNA does not prove that the 90% is the most important part. It may be that natural selection has no power over the 90%, but it rules over the 10%. And the 10% is the most important part of our genome. Thanks to the 10% we have rather unique brains. We are the only species that invented science and writes books and blogs. Thanks to the 10%. Apparently, the junk burden is compatible with life. For now. We have learned to live with introns. We are not alone: salamanders and lungfishes have much more junk DNA in their genomes. Junk DNA is not cancer. Junk DNA doesn't make us sick, it doesn't hurt. Even Einstein must have had 90% junk DNA in his genome! Nobody ever died of junk DNA, etc. etc. etc.
In my own scientific education I learned about polyploidy in plants (huge redundancy!), constitutive heterochromatin in chromosomes, introns (unexpected discovery!) and repetitive DNA (Alu). But at the time it wasn't called junk DNA. In 2011 I wrote against the ID view of introns. Although I knew these facts, I was not aware of the magnitude. I knew bits and pieces, but Moran added them all up and put a number on the amount of junk. In that sense Moran created a 'conceptual breakthrough' (in the words of John Avise). In the past, authors who published about junk DNA never made the 90% claim explicitly. Moran places it prominently in the title of his book. In my next blog I will discuss junk DNA in Evolution textbooks. I discovered that I have been a victim of the ENCODE propaganda [6], and it seems this happened to John Avise too. Moran succeeded, after a prolonged struggle against a tribe of Wikipedia geniuses, in creating a Junk DNA Wikipedia page. His book is not (yet) listed on that page. The Dutch Wikipedia page Junk DNA says that 'junk' is an inappropriate word.
Note: I have used the KOBO e-book version of the book which has full-text search capability (invaluable!).
Appendix 1
After reading the book I did a 'junk DNA' search in Nature and the results were surprising. A seemingly endless number of articles showed up, published years before the ENCODE publications in 2012, attacking the junk DNA concept:
- When the junk isn't junk, 1996
- The meaning of junk, 2001
- A forage in the junkyard, 2002
- 'Junk' DNA reveals vital role, 2004
- A use for junk, 2004
- Fruitfly genome is not junk, 2005
- What's in the 'junk' of the genome? 2005
- Junk DNA as an evolutionary force, 2006
- It's the junk that makes us human, 2006
- Junk, 2007
- Marsupial genome reveals the treasure hidden in junk DNA, 2007
- Rethinking junk DNA, 2009
- Junk DNA promotes sex chromosome evolution, 2009
- Junk DNA holds clues to heart disease, 2010
All of these articles attack or even ridicule the 'junk DNA' concept, and all appeared before the ENCODE publications of 2012. That Nature published the ENCODE results in 2012 did not quite come out of the blue. The problem is that Nature did not give its readers a balanced presentation of the different views in the scientific community. It is a very one-sided point of view. Why? Finding functional DNA in junk DNA is newsworthy, while finding no function is a negative result which will not get published. It seems a natural imbalance. The people who discovered treasures in the junk are the heroes and revolutionaries, and they get published and interviewed. Just as Moran describes in his book. Notwithstanding severe criticism, Nature continued to propagate the anti-junk worldview after 2012. That is even more amazing!
Appendix 2
After reading Moran's book, I found it helpful to compare it with John Avise's Conceptual Breakthroughs in Evolutionary Genetics [2] because Avise emphasizes different scientific discoveries and interprets them differently. Moran has a rather extreme point of view and is dismissive of the results of other scientists who argue there is less junk in our genome and more functional DNA. According to Avise the following discoveries are conceptual breakthrough discoveries. I selected the relevant breakthroughs.
Before the discovery of jumping genes, scientists viewed the genome as a stable entity. Mutation did change the sequence of a gene, but not the order of the genes on a chromosome or the total number of genes. Jumping genes changed that view. A stable genome became a dynamic genome. Next is the discovery of repetitive elements. They originate by copy and paste, but it is unclear what their function is, if any. Then came the subversive idea that many small mutations in DNA are ignored by natural selection because they do not affect fitness (neutralism). Not everything is useful. At that time it was thought that changes in protein coding genes are the most important drivers of evolution. But evidence began to accumulate that regulation of the protein coding genes was important in the evolution of eukaryotic species. The idea of gene regulation by RNA was a paradigm shift in molecular genetics. "Nearly all evolutionary geneticists now fully accept the idea that changes in the regulatory apparatus of eukaryotic genomes are central to much of adaptive evolution." Next revolution: human genomic uniqueness. Before this revolution, humans were considered extraordinarily distinct from all other creatures (Inflated Ego!). So our genome had to be unique. But then came the discovery that the genes and proteins of humans and chimpanzees are remarkably similar. A shock for many people. The standard paradigm at that time was that most genes within any genome collaborate harmoniously for the benefit of the organism. But Selfish genes multiply within our genome much like viruses, without benefit to the organism. We knew about selfish genes, but what was new after the complete sequence of the human genome in 2000 was the sheer amount of selfish DNA. The view of a protein coding gene before the next revolution was an uninterrupted sequence of the 4 bases from a START to a STOP codon. The discovery of Split genes came as a shock. I remember this very well. It looked like a crazy idea.
Protein coding genes were interrupted by introns: stretches of sequence that are spliced out of the RNA transcript before a protein is produced. Completely unnecessary complexity. Introns are junk DNA. They are not the exception, they are the rule. Even worse: introns are larger than the protein-coding parts of the genes themselves! The total amount of useless intron sequence in our genome is larger than that of the useful coding sequence. It takes some time to get used to it. Now, they are in every textbook. The next unanticipated discovery was that RNA could catalyze biochemical reactions: Ribozymes. That was a task on which enzymes (proteins) had been thought to have a monopoly. The point is that RNA wasn't just an intermediate between DNA and proteins anymore; it can also be a functional end product.
Additionally, it was discovered that non-coding RNAs play an important role in gene regulation: Regulatory RNAs. According to Avise, the discovery of regulatory RNAs is a conceptual breakthrough. They come in two types: short micro-RNAs and long RNAs. According to Moran there are not many regulatory RNAs, but according to the ENCODE people there are thousands. Remember their claim: 80% of our genome is functional and long non-coding RNAs (lncRNAs) are a major player (?). Obviously, one cannot estimate the percentage of junk in a genome if one doesn't know the complete sequence. The Whole-Genome sequence, published in 2000, made it possible to calculate the percentage of junk in the complete genome. One last breakthrough I want to mention is Functional pseudogenes (a paradoxical concept!). Until that discovery, pseudogenes were considered damaged copies of functional protein coding genes. Just junk. The conceptual revolution was that some pseudogenes appeared to regulate gene expression. 'Pseudogenes are not pseudo any more.' At least not all.
Appendix 3
There is one scientist who back in 1994 published a book completely based upon the hypothesis that much of our genome is random DNA. He used the concept 'junk DNA'. His name is Periannan Senapathy and the full title of his book is: "Independent Birth of Organisms. A New Theory That Distinct Organisms Arose Independently From The Primordial Pond Showing That Evolutionary Theories Are Fundamentally Incorrect". In his own words:
"the genome would be mostly random DNA sequence with only small “islands” of genes scattered in an ocean of meaningless DNA. Such an architecture actually exists in the genomes of all living multicellular organisms, with the intergenic sequences termed “junk DNA.” "
Beautiful words (with hindsight)! I reviewed the book on my WDW website. I will blog about him soon, with a focus on junk DNA.
Postscript 8 Nov 2023
Lesson from a 50% synthetic yeast genome:
"One such source [of instability] is large stretches of repetitive DNA that don’t code for anything, but that can recombine with each other through natural processes, causing major structural changes in the genome. The synthetic biologists want to have complete control of their engineered yeast, so the team combed through S. cerevisiae’s genome with computer programs to find highly repetitive regions — and then deleted them. These sequences are effectively “genome parasites”, Boeke says.". Nature 8 Nov 2023
So, they are talking about removing junk DNA because it is causing major structural changes in the genome. Those sequences must be highly non-adaptive or deleterious. If they are not neutral, why are they not removed from the genome by natural selection? Low population size can't be an explanation in yeast.
Notes
- Jonathan Marks (2003) 'What It Means to Be 98% Chimpanzee: Apes, People, and Their Genes'. Listed on my WDW page.
- John Avise (2014) Conceptual Breakthroughs in Evolutionary Genetics: A Brief History of Shifting Paradigms.
- Masatoshi Nei (2013) 'Mutation-driven Evolution', page 107. However, Nei himself writes: "Recent studies have shown that a substantial portion of noncoding DNA has some roles in the regulation" and he refers to the ENCODE Project Consortium 2012!
- Chris Organ et al. (2007) Origin of avian genome size and structure in non-avian dinosaurs, Nature.
- John Avise (2014): "Beginning in the 1980s, examples gradually came to light in which some "pseudogenes" played active roles, such as in regulating gene expression", Chapter 68 'Functional Pseudogenes', (p.143).
- In my review of Senapathy's book Independent Birth of Organisms I quoted ENCODE results several times without knowing how controversial they were!
- Moran writes only about differences in junk DNA between humans and chimps: "almost all the differences between the human and chimp genomes are due to the fixation of neutral mutations by random genetic drift." (chapter 4)
- Genome size
- It is not a fact, but "90% of your genome is junk" is stated as a fact: it is in the title of his book and is repeated many times throughout. [12 Jul 2023]
- "But sequencing the full genome was widely considered to be pointless, partly because around 95 percent of it was thought to be junk DNA." Michel Morange (2020) 'The Black Box of Biology. A History of the Molecular Revolution', p.341.
- For example, this publication is absent from his book: Introns: The Functional Benefits of Introns in Genomes, Genomics & Informatics, 2015. Ideally, one should discuss publications that argue against one's own theory. [23 Jul 2023]
- Later I found a remarkable passage that shows Moran accepts the idea that junk could be a burden: "This evidence suggests that there has been selection for removing excess junk DNA from these introns in order to speed up gene expression." (chapter 6). So Moran concludes from this that introns are mostly junk, and at the same time accepts that intron junk is costly for highly expressed genes. Apparently, in this case natural selection is powerful enough to get rid of junk despite the small effective population size. But if population size is not an obstacle to removing junk DNA from introns, then the population size argument doesn't seem to be valid anymore. The question arises: why hasn't most of the junk been removed from our genome? Is intron junk more costly than intergenic junk? [11 Aug 2023]. See also: Selection for short introns in highly expressed genes.