Transposon activity, local duplications and propagation of structural variants across haplotypes drive the evolution of the Drosophila S2 cell line.

Lewerentz J, Johansson AM, Larsson J, Stenberg P

BMC Genomics 23 (1) 276 [2022-04-07; online 2022-04-07]

Immortalized cell lines are widely used model systems whose genomes are often highly rearranged and polyploid. However, their genome structure is seldom deciphered and is thus not accounted for during analyses. We therefore used linked short- and long-read sequencing to perform haplotype-level reconstruction of the genome of a Drosophila melanogaster cell line (S2-DRSC) with a complex genome structure. Using a custom implementation (that is designed to use ultra-long reads in complex genomes with nested rearrangements) to call structural variants (SVs), we found that the most common SV was repetitive sequence insertion or deletion (> 80% of SVs), with Gypsy retrotransposon insertions dominating. The second most common SV was local sequence duplication. SNPs and other SVs were rarer, but several large chromosomal translocations and mitochondrial genome insertions were observed. Haplotypes were highly similar at the nucleotide level but structurally very different. Insertion SVs existed at various haplotype frequencies and were unlinked on chromosomes, demonstrating that haplotypes have different structures and suggesting the existence of a mechanism that allows SVs to propagate across haplotypes. Finally, using public short-read data, we found that transposable element insertions and local duplications are common in other D. melanogaster cell lines. The S2-DRSC cell line evolved through retrotransposon activity and vast local sequence duplications, that we hypothesize were the products of DNA re-replication events. Additionally, mutations can propagate across haplotypes (possibly explained by mitotic recombination), which enables fine-tuning of mutational impact and prevents accumulation of deleterious events, an inherent problem of clonal reproduction. We conclude that traditional linear homozygous genome representation conceals the complexity when dealing with rearranged and heterozygous clonal cells.

Bioinformatics Compute and Storage [Service]

National Genomics Infrastructure [Service]

PubMed 35392795

DOI 10.1186/s12864-022-08472-1

Crossref 10.1186/s12864-022-08472-1

pmc: PMC8991648
pii: 10.1186/s12864-022-08472-1


Publications 7.2.7