Base Composition, Codon Usage, and Patterns of Gene Sequence Evolution in Butterflies.

Näsvall K, Boman J, Talla V, Backström N

Genome Biol Evol 15 (8) - [2023-08-01; online 2023-08-11]

Coding sequence evolution is influenced by both natural selection and neutral evolutionary forces. In many species, the effects of mutation bias, codon usage, and GC-biased gene conversion (gBGC) on gene sequence evolution have not been detailed. Quantification of how these forces shape substitution patterns is therefore necessary to understand the strength and direction of natural selection. Here, we used comparative genomics to investigate the association between base composition and codon usage bias on gene sequence evolution in butterflies and moths (Lepidoptera), including an in-depth analysis of underlying patterns and processes in one species, Leptidea sinapis. The data revealed significant G/C to A/T substitution bias at third codon position with some variation in the strength among different butterfly lineages. However, the substitution bias was lower than expected from previously estimated mutation rate ratios, partly due to the influence of gBGC. We found that A/T-ending codons were overrepresented in most species, but there was a positive association between the magnitude of codon usage bias and GC-content in third codon positions. In addition, the tRNA-gene population in L. sinapis showed higher GC-content at third codon positions compared to coding sequences in general and less overrepresentation of A/T-ending codons. There was an inverse relationship between synonymous substitutions and codon usage bias indicating selection on synonymous sites. We conclude that the evolutionary rate in Lepidoptera is affected by a complex interaction between underlying G/C -> A/T mutation bias and partly counteracting fixation biases, predominantly conferred by overall purifying selection, gBGC, and selection on codon usage.

Bioinformatics Compute and Storage [Service]

NGI Short read [Service]

NGI Stockholm (Genomics Production) [Service]

National Genomics Infrastructure [Service]

PubMed 37565492

DOI 10.1093/gbe/evad150

Crossref 10.1093/gbe/evad150

pmc: PMC10462419
pii: 7241093

Publications 9.5.0