Abstract [eng] |
The ability to replicate and the ability to evolve are two defining features of all living entities. Duplication of genetic information is carried out by replication proteins. The composition of the DNA replication machinery is similar in all free-living cellular organisms. In contrast, replication in double-stranded (ds) DNA viruses is very diverse. It is well studied in T7 and T4 phages and in herpes, polyoma and papilloma viruses; however, these groups make up only about 10% of known dsDNA viruses. How do lesser-known viruses replicate? Do they use variations of already known replication systems, or do they use novel replication strategies? DsDNA viruses are not only diverse in how they replicate, they also vary widely in genome size. For example, the genomes of the smallest dsDNA viruses (polyoma) are 500 times smaller than those of the largest Pandora viruses (genome size: 2500 kbp). Genome size in free-living cellular organisms also varies. For example, the difference in genome size between human and the smallest eukaryote (Ostreococcus tauri) is ~260-fold. Nevertheless, both have the same components of the replication machinery. Is this also true for dsDNA viruses? Or do the diversity and genomic distribution of viral DNA replication proteins depend on virus genome size? We attempted to answer these questions by performing the first large-scale computational study of all replication proteins from dsDNA viruses. Using current state-of-the-art computational methods, we identified and characterized replication proteins (DNA polymerases, processivity factors, clamp loaders, primases, helicases, single-stranded DNA binding proteins, primer removal proteins, DNA ligases and topoisomerases) and analyzed their distribution patterns in the genomes of dsDNA viruses. The analysis revealed a dependency between the DNA replicase components (DNA polymerases, processivity factors, clamp loaders) and viral genome size. We found that small viruses (<40 kbp) use protein-primed DNA replication or rely on replication proteins from the host. 
Large viruses (>140 kbp) have their own RNA-primed replication apparatus, often supplemented with processivity factors and sometimes with clamp loaders to increase replication speed and efficiency. The only apparent exception to this general pattern was eliminated by the finding of B-family DNA polymerases in large phiKZ phages. Next, we asked whether the distribution of other viral DNA replication proteins depends on genome size. It turned out that as genome size increases, viruses tend to encode their own replication proteins more frequently. This insight led us to search for "missing" replication components in large genomes, which resulted in the discovery of single-stranded DNA binding (SSB) proteins in the largest eukaryotic viruses. Surprisingly, these proteins turned out to be homologs of SSB proteins previously thought to be specific to T7-like phages. Another surprise came from the analysis of DNA helicases: we found that replicative helicases are the most common replication proteins in dsDNA viruses. In addition, our analysis revealed that a component of the herpesviral helicase-primase complex (UL8) is a highly diverged and inactivated B-family DNA polymerase. |