smartPARE: An R Package for Efficient Identification of True mRNA Cleavage Sites.

Persson Hodén K, Hu X, Martinez G, Dixelius C

Int J Mol Sci 22 (8) 4267 [2021-04-20; online 2021-04-20]

Degradome sequencing is commonly used to generate high-throughput information on mRNA cleavage sites mediated by small RNAs (sRNAs). In our datasets of potato (Solanum tuberosum, St) and Phytophthora infestans (Pi), initial predictions yielded high numbers of candidate cleavage sites, which highlighted the need for improved analytical tools. Here, we present an R package based on a deep learning convolutional neural network (CNN) in a machine learning environment to optimize discrimination of false from true cleavage sites. When smartPARE was applied to our datasets on potato during infection by the late blight pathogen, 7.3% of all cleavage windows represented true cleavages, distributed over 214 sites in P. infestans and 444 sites in potato. The sRNA landscape of the two organisms is complex, with uneven sRNA production and cleavage regions widespread in the two genomes. Multiple targets and several cases of complex regulatory cascades, particularly in potato, were revealed. We conclude that our new analytical approach is useful for anyone working on complex biological systems who is interested in identifying cleavage sites, particularly those inferred by sRNA classes beyond miRNAs.

NGI Uppsala (SNP&SEQ Technology Platform) [Service]

National Genomics Infrastructure [Service]

PubMed 33924042

DOI 10.3390/ijms22084267

Crossref 10.3390/ijms22084267

pii: ijms22084267
pmc: PMC8073297

