Are Giant Viruses a Fourth Domain of Life?
Chris King


Offset against both the uniqueness of the mitochondrial endosymbiosis and the closely linked but independent question of the origin of the nucleus and nuclear envelope is the discovery of mimi-, mama-, mega- and pandoraviruses infecting amoebae (Raoult et al., Philippe et al.) and related very large aquatic viruses such as CroV, which infects single-celled plankton species (Fisher et al.). Despite their recent discovery, ocean gene analyses suggest that these viruses are widespread in the oceans and may play a crucial role in regulating atmospheric-oceanic pathways such as carbon sequestration.

Fig 6 Left: Bacterium Gemmata obscuriglobus with internal nuclear envelope and vacuoles (Rachel Melwig & Christine Panagiotidis / EMBL). Right: Ultrathin EM section of a mimivirus in amoeba (Jean-Michel Claverie). Inset: Mamavirus infected by Sputnik virophage.

These viruses occupy an intermediate genetic position between viruses and cells. They have the largest viral genomes, which encode extensive cellular machinery, including components of protein translation, and which are larger than the smallest completely autonomous bacterial and archaeal genomes.

Megavirus chilensis, for example, is 10 to 20 times wider than the average virus. The particle measures about 0.7 micrometres (thousandths of a millimetre) in diameter. It just beats the previous record holder, Mimivirus, which was found in a water cooling tower in the UK in 1992. A study of the megavirus's DNA shows it to have more than a thousand genes. The Mimivirus genome is a linear, double-stranded DNA molecule 1.18 Mbp in length; that of Megavirus is 1.25 Mbp. Like Mimivirus, Megavirus has hair-like structures, or fibrils, on the exterior of its shell, or capsid, which probably attract unsuspecting amoebae looking to prey on bacteria displaying similar features. These viruses show many characteristics at the boundary between living and non-living. They are as large as several bacterial species, such as Rickettsia conorii and Tropheryma whipplei, possess genomes of comparable size to those of several bacteria, including those above, and code for products not previously thought to be encoded by viruses. Mimivirus has genes coding for nucleotide and amino acid synthesis, which even some small obligate intracellular bacteria lack. However, it lacks genes for ribosomal proteins, making it dependent on a host cell for protein translation and energy metabolism.
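To make the size comparisons above concrete, the following is a minimal Python sketch of the arithmetic they imply. It uses only the figures quoted in the text (0.7 micrometres, "10 to 20 times wider", 1.18 Mbp and 1.25 Mbp); the function and constant names are hypothetical, chosen for illustration only.

```python
# Figures quoted in the text above; all names here are illustrative.
MEGAVIRUS_DIAMETER_UM = 0.7      # particle diameter in micrometres
WIDTH_FACTOR_RANGE = (10, 20)    # "10 to 20 times wider than the average virus"
MIMIVIRUS_GENOME_MBP = 1.18
MEGAVIRUS_GENOME_MBP = 1.25


def implied_average_diameter_nm(diameter_um, factor_range):
    """Return the (smallest, largest) 'average virus' diameter in nanometres
    implied by a particle that is factor_range times wider than average."""
    diameter_nm = diameter_um * 1000  # 1 micrometre = 1000 nanometres
    low_factor, high_factor = factor_range
    return diameter_nm / high_factor, diameter_nm / low_factor


def percent_larger(larger, smaller):
    """Return how much bigger `larger` is than `smaller`, as a percentage."""
    return (larger - smaller) / smaller * 100


if __name__ == "__main__":
    low, high = implied_average_diameter_nm(MEGAVIRUS_DIAMETER_UM, WIDTH_FACTOR_RANGE)
    print(f"Implied average virus diameter: {low:.0f} to {high:.0f} nm")
    gain = percent_larger(MEGAVIRUS_GENOME_MBP, MIMIVIRUS_GENOME_MBP)
    print(f"Megavirus genome is about {gain:.1f}% larger than Mimivirus")
```

On these figures, the "average virus" is taken to be some tens of nanometres across, and the Megavirus genome exceeds that of Mimivirus by only a few percent, which is why it "just beats" the previous record holder.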

As of mid-2013, an even larger virus with a 2.5 Mb genome, without morphological or genomic resemblance to any previously defined virus family, has been discovered by the same researchers that found mimivirus, in both a marine sample off the coast of Chile and a freshwater pond in Australia. It was named pandoravirus, reflecting its lack of similarity with previously described microorganisms and the surprises expected from its future study. The researchers suspect that giant viruses evolved from cells. They think that at some point, the dynasty of life on Earth was much bigger than the three domains of bacteria, archaea and eukaryotes. Some cells gave rise to modern life, and others survived by parasitizing them and evolving into viruses. Pandoravirus might thus provide a complementary relic of the genomes of this wider founding group (Philippe et al.). Using the Global Ocean Sampling (GOS) Expedition data to explore variants of recA (the universal DNA repair enzyme) and rpoB (the beta subunit of bacterial RNA polymerase), a team associated with Craig Venter has discovered branches which may also point to a fourth domain (Wu et al.).

Fig 6c: Evolutionary tree of B-family DNA polymerase showing the relationship of pandoravirus to other viruses and eukaryotes. Inset: pandoraviruses invading Acanthamoeba (Philippe et al.).

As an illustration of genes in mimivirus that normally appear only in cellular genomes, the mimivirus has genes for central protein-translation components, including four aminoacyl transfer RNA synthetases, peptide release factor 1, translation elongation factor EF-Tu, and translation initiation factor 1. The genome also exhibits six tRNAs. Other notable features include the presence of both type I and type II topoisomerases (although the topoisomerase 1B has a different header structure from the eukaryote form; Brochier-Armanet, Gribaldo & Forterre 2008), components of all DNA repair pathways, many polysaccharide synthesis enzymes, and one intein-containing gene. Inteins are protein-splicing domains encoded by mobile intervening sequences (IVSs). They self-catalyze their excision from the host protein, ligating their former flanks by a peptide bond. They have been found in all domains of life (Eukarya, Archaea, and Eubacteria), but their distribution is highly sporadic. Only a few instances of viral inteins have been described. Self-splicing type I introns are a different type of mobile IVS, self-excising at the mRNA level. They are rare in viruses. Mimivirus exhibits four instances of self-excising introns, all in RNA polymerase genes.

Fig 6d: Evolutionary diversification of Mimiviruses from nucleocytoplasmic large DNA viruses (Fisher et al.) and in relation to the three domains of cellular life, based on the concatenated sequences of seven universally conserved protein sequences (Raoult et al.).

Mamaviruses also host parasitic virophages, affectionately named Sputnik (Pearson 2008). These act as viral satellites, piggybacking on the metabolism of the large viral factories set up by the giant viral genomes and causing the mimiviruses to sicken. The virophages also contain genes that are linked to viruses infecting each of the three domains of life, Eukarya, Archaea and Bacteria (La Scola et al.). It has thus been suggested that they have a primary role in the establishment of cellular life and that they may have been instrumental in the emergence of the nuclear envelope.

Fig 6e: Evolutionary relationships between histone complexes and topoisomerase II of marseilleviruses place their incorporation, to or from an ancestor of LECA, before eukaryote histone structure had fully evolved (Erives 2017).

Yet another group of nucleocytoplasmic large DNA viruses (NCLDVs) of eukaryotes is typified by Marseillevirus (Boyer et al. 2009), a giant virus of amoebae and the prototype of the family Marseilleviridae (MV). It has been found to harbor core histone doublets consistent with incorporation from an ancient precursor of LECA, the last common ancestor of eukaryotes. The genome of the virus is composed of typical NCLDV core genes and genes apparently obtained from eukaryotic hosts and their parasites or symbionts, both bacterial and viral. The virions of Marseillevirus encompass a 368-kb genome, a minimum of 49 proteins, and some messenger RNAs. The genetic sequences of the histone doublets place them at the root of the eukaryote tree, and the DNA topoisomerases also present in the virus are likewise consistent with an origin close to the divergence of euarchaeota and eukaryotes (Erives 2017).

Fig 6e: Polintons (Mavericks) are large DNA transposons widespread in the genomes of eukaryotes, which also encode virus capsid proteins, suggesting they might form virions. The figure delineates relationships among bacterial tectiviruses, Polintons, adenoviruses, virophages, large and giant DNA viruses of eukaryotes of the proposed order 'Megavirales', and linear mitochondrial and cytoplasmic plasmids. The authors hypothesize that Polintons were the first group of eukaryotic double-stranded DNA viruses to evolve from bacteriophages and that they gave rise to most large DNA viruses of eukaryotes and various other selfish genetic elements (Krupovic M, Koonin E (2015) Polintons: a hotbed of eukaryotic virus, transposon and plasmid evolution Nature Reviews Microbiology 13 105-115).

The origins and evolutionary path of giant viruses remain unsettled. One side holds that the giant viruses evolved from smaller viruses over 2 billion years by adding genes, through processes such as horizontal gene transfer and gene duplication. The other maintains that the viruses started out large from the very beginning – and may even have been autonomous organisms – before losing genes they no longer needed and diversifying into the strains we see today.


Fig 6e: (Left) Genome bins of the Klosneuviruses. From outside to inside: In the first ring, solid circles indicate genes exclusively shared with nucleocytoplasmic large DNA viruses (NCLDVs) (blue), genes specific for Klosneuviruses (white), genes shared with eukaryotes (red), genes shared with Bacteria (green), genes represented in all three domains of cellular life (yellow), and singletons (gray). The second ring displays positions of genes (gray) either on the minus or the plus strand. The next track depicts GC content in shades of gray ranging from 20% (white) to 50% (dark gray). Links connect paralogs (gray) and nearly identical repeats (orange). (Right) Genome evolution of Klosneuviruses. A maximum likelihood tree from a concatenated alignment of five core nucleocytoplasmic virus orthologous genes (Schulz F. et al. (2017) Giant viruses with an expanded complement of translation system components. Science 356, 82-85).

An investigation of a newly discovered group of extremely large viruses, the Klosneuviruses, found in metagenomic data from Austrian sewage and with genome sizes up to 3 Mb (Schulz et al. 2017), shows that they have arisen through multiple aggregation events. Compared with other giant viruses, the Klosneuviruses encode an expanded translation machinery, including aminoacyl transfer RNA synthetases with specificities for all 20 amino acids. Notwithstanding the prevalence of translation system components, comprehensive phylogenomic analysis of these genes indicates that Klosneuviruses did not evolve from a cellular ancestor but rather are derived from a much smaller virus through extensive gain of host genes.

This picture is supported by the giant Bodo saltans virus (BsV), the first characterized representative of the most abundant giant viruses in the oceans and the first klosneuvirus isolate. BsV infects an ecologically important microzooplankton, the kinetoplastid Bodo saltans, and its 1.39 Mb genome encodes 1227 predicted ORFs, including pathways for host-independent replication. Yet much of its translational machinery has been lost, including all tRNAs. Essential genes are invaded by homing endonuclease-encoding self-splicing introns that may defend against competing viruses. Putative anti-host factors show extensive gene duplication via a genomic accordion, indicating an ongoing evolutionary arms race and highlighting the rapid evolution and genomic plasticity that has led to genome gigantism and the enigma that is giant viruses (doi:10.1101/214536).

Fig 6f: Two newly discovered giant viruses, Tupanvirus deep ocean and Tupanvirus soda lake, are the longest-tailed Mimiviridae members isolated in amoebae. Their genomes are 1.44–1.51 Mb of linear double-stranded DNA coding for 1276–1425 predicted proteins. Tupanviruses share the same ancestors with mimivirus lineages, and these giant viruses present the largest translational apparatus within the known virosphere, with up to 70 tRNAs, 20 aaRS, 11 factors for all translation steps, and factors related to tRNA/mRNA maturation and ribosome protein modification. Moreover, two sequences with significant similarity to intronic regions of 18S rRNA genes are encoded by the tupanviruses and highly expressed. In this translation-associated gene set, only the ribosome is lacking. Tupanviruses can infect a wide range of hosts, such as protists and amoebas, but pose no threat to humans (J. Abrahão et al. 2018 Tailed giant Tupanvirus possesses the most complete translational apparatus of the known virosphere. Nature Communications. doi:10.1038/s41467-018-03168-1).
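The genome sizes and predicted gene counts quoted above for BsV and the tupanviruses can be put on a common footing by computing gene density. The Python sketch below does this under one stated assumption: the text gives the tupanvirus figures only as ranges, so the lower genome size is paired here with the lower protein count and the upper with the upper. All names in the code are hypothetical and for illustration only.

```python
# (genome size in Mb, predicted genes/ORFs), taken from the text above.
# ASSUMPTION: the tupanvirus ranges are paired low-with-low and high-with-high.
GENOMES = {
    "Bodo saltans virus": (1.39, 1227),
    "Tupanvirus (lower bounds)": (1.44, 1276),
    "Tupanvirus (upper bounds)": (1.51, 1425),
}


def genes_per_mb(size_mb, gene_count):
    """Predicted genes per megabase of genome."""
    return gene_count / size_mb


def mean_bp_per_gene(size_mb, gene_count):
    """Average genome span per predicted gene, in base pairs."""
    return size_mb * 1_000_000 / gene_count


if __name__ == "__main__":
    for name, (size_mb, gene_count) in GENOMES.items():
        density = genes_per_mb(size_mb, gene_count)
        spacing = mean_bp_per_gene(size_mb, gene_count)
        print(f"{name}: {density:.0f} genes/Mb, about {spacing:.0f} bp per gene")
```

On these figures all three genomes come out at roughly 900 predicted genes per megabase, or a little over 1 kb of genome per gene, so the larger genomes carry proportionally more genes instead of more non-coding sequence.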

However, the picture with tupanviruses, which have many translation enzymes, is mixed: some genes were incorporated from other branches of life, but a large number are shared only with other mimiviruses. Considering that tupanviruses comprise a sister group to amoebal mimiviruses, we can hypothesize that the ancestors of these clades of Mimiviridae could have had a more generalist lifestyle and were able to infect a wide variety of hosts. In this view, the ancestors of tupanviruses (and maybe of amoebal mimiviruses) might have already been giant viruses that underwent reductive evolution, although some genes could have been acquired over time, as previously hypothesized for other mimiviruses. A reductive evolution pattern is typical among obligate intracellular parasites. In these cases, the organisms lose genes related to energy production, which is one of the main reasons for their obligate parasitic lifestyle. In an alternative scenario, a simpler ancestor could have acquired a substantial number of genes over time and become more resourceful, able to infect a broader host range. Nevertheless, tupanvirus presents the most complete translational apparatus among viruses.

Yet another study, using reference genomes, has found that numerous cultivated and uncultivated viruses encode ribosomal proteins (doi:10.1101/174177), putting these, as well as translation enzymes, in the viral orbit.