{"entity": "journal", "iuid": "c996366f4ad943d1a677e96e156c558e", "timestamp": "2026-06-09T23:32:25.819Z", "links": {"self": {"href": "https://publications.scilifelab.se/journal/PLoS%20Comput.%20Biol..json"}, "display": {"href": "https://publications.scilifelab.se/journal/PLoS%20Comput.%20Biol."}}, "title": "PLoS Comput. Biol.", "issn": "1553-7358", "issn-l": "1553-734X", "publications_count": 14, "publications": [{"entity": "publication", "iuid": "394d345f587e4a8fa382227d3718c80e", "links": {"self": {"href": "https://publications.scilifelab.se/publication/394d345f587e4a8fa382227d3718c80e.json"}, "display": {"href": "https://publications.scilifelab.se/publication/394d345f587e4a8fa382227d3718c80e"}}, "title": "HAPP: High-accuracy pipeline for processing deep metabarcoding data.", "authors": [{"family": "Sundh", "given": "John", "initials": "J"}, {"family": "Granqvist", "given": "Emma", "initials": "E", "orcid": "0000-0002-1513-1674", "researcher": {"href": "https://publications.scilifelab.se/researcher/95b07f15f8724fdbbcdf34e6d6837147.json"}}, {"family": "Iwaszkiewicz-Eggebrecht", "given": "Ela", "initials": "E", "orcid": "0000-0003-1412-1711", "researcher": {"href": "https://publications.scilifelab.se/researcher/53c085bb455d44ceac2f050f5c38f683.json"}}, {"family": "Manoharan", "given": "Lokeshwaran", "initials": "L", "orcid": "0000-0001-9751-5745", "researcher": {"href": "https://publications.scilifelab.se/researcher/000321fd81b9457db66140246bbd9066.json"}}, {"family": "van Dijk", "given": "Laura J A", "initials": "LJA"}, {"family": "Goodsell", "given": "Robert", "initials": "R"}, {"family": "Godeiro", "given": "Nerivania N", "initials": "NN", "orcid": "0000-0002-1669-6124", "researcher": {"href": "https://publications.scilifelab.se/researcher/990e5c3362f94d76af29742ab5876a8a.json"}}, {"family": "Bellini", "given": "Bruno C", "initials": "BC"}, {"family": "Orsholm", "given": "Johanna", "initials": "J"}, {"family": "\u0141ukasik", "given": "Piotr", "initials": "P"}, 
{"family": "Miraldo", "given": "Andreia", "initials": "A"}, {"family": "Roslin", "given": "Tomas", "initials": "T"}, {"family": "Tack", "given": "Ayco J M", "initials": "AJM"}, {"family": "Andersson", "given": "Anders F", "initials": "AF", "orcid": "0000-0002-3627-6899", "researcher": {"href": "https://publications.scilifelab.se/researcher/caa76ee4438d4b4aad386ba8a90448c2.json"}}, {"family": "Ronquist", "given": "Fredrik", "initials": "F", "orcid": "0000-0002-3929-251X", "researcher": {"href": "https://publications.scilifelab.se/researcher/440662f277ea4756a08a7f5925b3f485.json"}}], "type": "journal article", "published": "2025-11-00", "journal": {"title": "PLoS Comput. Biol.", "issn": "1553-7358", "issn-l": "1553-734X", "volume": "21", "issue": "11", "pages": "e1013558"}, "abstract": "Deep metabarcoding offers an efficient and reproducible approach to biodiversity monitoring, but noisy data and incomplete reference databases challenge accurate diversity estimation and taxonomic annotation. Here, we introduce a novel algorithm, NEEAT, for removing spurious operational taxonomic units (OTUs) originating from nuclear-embedded mitochondrial DNA sequences (NUMTs) or sequencing errors. It integrates 'echo' signals across samples with the identification of unusual evolutionary patterns among similar DNA sequences. We also extensively benchmark current tools for chimera removal, taxonomic annotation and OTU clustering of deep metabarcoding data. The best performing tools/parameter settings are integrated into HAPP, a high-accuracy pipeline for processing deep metabarcoding data. 
Tests using CO1 data from BOLD and large-scale metabarcoding data on insects demonstrate that HAPP significantly outperforms existing methods, while enabling efficient analysis of extensive datasets by parallelizing computations across taxonomic groups.", "doi": "10.1371/journal.pcbi.1013558", "pmid": "41202092", "labels": {"National Genomics Infrastructure": "Service", "NGI Stockholm (Genomics Production)": "Service", "NGI Short read": "Service", "Bioinformatics (NBIS)": "Collaborative", "Bioinformatics Support and Infrastructure": "Collaborative", "Bioinformatics Support, Infrastructure and Training": "Collaborative"}, "xrefs": [{"db": "pmc", "key": "PMC12622834"}, {"db": "pii", "key": "PCOMPBIOL-D-25-00687"}], "notes": [], "created": "2025-11-21T11:45:11.315Z", "modified": "2025-11-21T12:27:09.804Z"}, {"entity": "publication", "iuid": "c04937c70b4a447b91d3d2aec9bf40bc", "links": {"self": {"href": "https://publications.scilifelab.se/publication/c04937c70b4a447b91d3d2aec9bf40bc.json"}, "display": {"href": "https://publications.scilifelab.se/publication/c04937c70b4a447b91d3d2aec9bf40bc"}}, "title": "CGG toolkit: Software components for computational genomics.", "authors": [{"family": "Vasileiou", "given": "Dimitrios", "initials": "D"}, {"family": "Karapiperis", "given": "Christos", "initials": "C"}, {"family": "Baltsavia", "given": "Ismini", "initials": "I"}, {"family": "Chasapi", "given": "Anastasia", "initials": "A"}, {"family": "Ahr\u00e9n", "given": "Dag", "initials": "D"}, {"family": "Janssen", "given": "Paul J", "initials": "PJ", "orcid": "0000-0002-7877-0270", "researcher": {"href": "https://publications.scilifelab.se/researcher/bbeadef1d79643c49619e484d1773adf.json"}}, {"family": "Iliopoulos", "given": "Ioannis", "initials": "I"}, {"family": "Promponas", "given": "Vasilis J", "initials": "VJ"}, {"family": "Enright", "given": "Anton J", "initials": "AJ"}, {"family": "Ouzounis", "given": "Christos A", "initials": "CA", "orcid": "0000-0002-0086-8657", 
"researcher": {"href": "https://publications.scilifelab.se/researcher/6e7eab67d5134be89b5f7cea5208e860.json"}}], "type": "journal article", "published": "2023-11-00", "journal": {"title": "PLoS Comput. Biol.", "issn": "1553-7358", "volume": "19", "issue": "11", "pages": "e1011498", "issn-l": "1553-734X"}, "abstract": "Public-domain availability for bioinformatics software resources is a key requirement that ensures long-term permanence and methodological reproducibility for research and development across the life sciences. These issues are particularly critical for widely used, efficient, and well-proven methods, especially those developed in research settings that often face funding discontinuities. We re-launch a range of established software components for computational genomics, as legacy version 1.0.1, suitable for sequence matching, masking, searching, clustering and visualization for protein family discovery, annotation and functional characterization on a genome scale. These applications are made available online as open source and include MagicMatch, GeneCAST, support scripts for CoGenT-like sequence collections, GeneRAGE and DifFuse, supported by centrally administered bioinformatics infrastructure funding. The toolkit may also be conceived as a flexible genome comparison software pipeline that supports research in this domain. 
We illustrate basic use by examples and pictorial representations of the registered tools, which are further described with appropriate documentation files in the corresponding GitHub release.", "doi": "10.1371/journal.pcbi.1011498", "pmid": "37934729", "labels": {"Bioinformatics Support and Infrastructure": "Collaborative", "Bioinformatics Support, Infrastructure and Training": "Collaborative", "Bioinformatics (NBIS)": "Collaborative"}, "xrefs": [{"db": "pmc", "key": "PMC10629618"}, {"db": "pii", "key": "PCOMPBIOL-D-23-00660"}], "notes": [], "created": "2023-12-01T10:26:42.813Z", "modified": "2023-12-01T10:26:42.887Z"}, {"entity": "publication", "iuid": "6e75301d789743ad907f297d66dc7bee", "links": {"self": {"href": "https://publications.scilifelab.se/publication/6e75301d789743ad907f297d66dc7bee.json"}, "display": {"href": "https://publications.scilifelab.se/publication/6e75301d789743ad907f297d66dc7bee"}}, "title": "Multienzyme deep learning models improve peptide de novo sequencing by mass spectrometry proteomics.", "authors": [{"family": "Gueto-Tettay", "given": "Carlos", "initials": "C", "orcid": "0000-0002-8496-6341", "researcher": {"href": "https://publications.scilifelab.se/researcher/b0e4a5377a88414588686e5630b0be66.json"}}, {"family": "Tang", "given": "Di", "initials": "D", "orcid": "0000-0001-6323-9375", "researcher": {"href": "https://publications.scilifelab.se/researcher/e2a27467c4094d9f8d8f8402efa333f4.json"}}, {"family": "Happonen", "given": "Lotta", "initials": "L"}, {"family": "Heusel", "given": "Moritz", "initials": "M", "orcid": "0000-0002-8506-530X", "researcher": {"href": "https://publications.scilifelab.se/researcher/c404e5fdd701412093882c1e48d0bc93.json"}}, {"family": "Khakzad", "given": "Hamed", "initials": "H", "orcid": "0000-0002-8556-0650", "researcher": {"href": "https://publications.scilifelab.se/researcher/a7b030c1452c4dc9b430deffdf493e6f.json"}}, {"family": "Malmstr\u00f6m", "given": "Johan", "initials": "J"}, {"family": 
"Malmstr\u00f6m", "given": "Lars", "initials": "L", "orcid": "0000-0001-9885-9312", "researcher": {"href": "https://publications.scilifelab.se/researcher/42e99f34fb854beb8d810b05fe941057.json"}}], "type": "journal article", "published": "2023-01-00", "journal": {"title": "PLoS Comput. Biol.", "issn": "1553-7358", "volume": "19", "issue": "1", "pages": "e1010457", "issn-l": "1553-734X"}, "abstract": "Generating and analyzing overlapping peptides through multienzymatic digestion is an efficient procedure for de novo protein using from bottom-up mass spectrometry (MS). Despite improved instrumentation and software, de novo MS data analysis remains challenging. In recent years, deep learning models have represented a performance breakthrough. Incorporating that technology into de novo protein sequencing workflows require machine-learning models capable of handling highly diverse MS data. In this study, we analyzed the requirements for assembling such generalizable deep learning models by systemcally varying the composition and size of the training set. We assessed the generated models' performances using two test sets composed of peptides originating from the multienzyme digestion of samples from various species. The peptide recall values on the test sets showed that the deep learning models generated from a collection of highly N- and C-termini diverse peptides generalized 76% more over the termini-restricted ones. Moreover, expanding the training set's size by adding peptides from the multienzymatic digestion with five proteases of several species samples led to a 2-3 fold generalizability gain. Furthermore, we tested the applicability of these multienzyme deep learning (MEM) models by fully de novo sequencing the heavy and light monomeric chains of five commercial antibodies (mAbs). MEMs extracted over 10000 matching and overlapped peptides across six different proteases mAb samples, achieving a 100% sequence coverage for 8 of the ten polypeptide chains. 
We foretell that the MEMs' proven improvements to de novo analysis will positively impact several applications, such as analyzing samples of high complexity, unknown nature, or the peptidomics field.", "doi": "10.1371/journal.pcbi.1010457", "pmid": "36668672", "labels": {"Structural Proteomics": "Technology development"}, "xrefs": [{"db": "pmc", "key": "PMC9891523"}, {"db": "pii", "key": "PCOMPBIOL-D-22-01182"}], "notes": [], "created": "2023-05-31T19:03:01.330Z", "modified": "2023-05-31T19:03:36.693Z"}, {"entity": "publication", "iuid": "415f1dd282a54f7c8bad566f70b7a805", "links": {"self": {"href": "https://publications.scilifelab.se/publication/415f1dd282a54f7c8bad566f70b7a805.json"}, "display": {"href": "https://publications.scilifelab.se/publication/415f1dd282a54f7c8bad566f70b7a805"}}, "title": "Combining biomarker and virus phylogenetic models improves HIV-1 epidemiological source identification.", "authors": [{"family": "Lundgren", "given": "Erik", "initials": "E", "orcid": "0000-0001-8840-8067", "researcher": {"href": "https://publications.scilifelab.se/researcher/68e46041cafc4984aa386bdf9ebf8648.json"}}, {"family": "Romero-Severson", "given": "Ethan", "initials": "E"}, {"family": "Albert", "given": "Jan", "initials": "J", "orcid": "0000-0001-9020-0521", "researcher": {"href": "https://publications.scilifelab.se/researcher/d23f55d392eb411d90a53c2fb22aced3.json"}}, {"family": "Leitner", "given": "Thomas", "initials": "T", "orcid": "0000-0001-8160-2588", "researcher": {"href": "https://publications.scilifelab.se/researcher/5f385fa284074e44a6d0b999bdb6d174.json"}}], "type": "journal article", "published": "2022-08-00", "journal": {"title": "PLoS Comput. Biol.", "issn": "1553-7358", "volume": "18", "issue": "8", "pages": "e1009741", "issn-l": "1553-734X"}, "abstract": "To identify and stop active HIV transmission chains new epidemiological techniques are needed. 
Here, we describe the development of a multi-biomarker augmentation to phylogenetic inference of the underlying transmission history in a local population. HIV biomarkers are measurable biological quantities that have some relationship to the amount of time someone has been infected with HIV. To train our model, we used five biomarkers based on real data from serological assays, HIV sequence data, and target cell counts in longitudinally followed, untreated patients with known infection times. The biomarkers were modeled with a mixed effects framework to allow for patient specific variation and general trends, and fit to patient data using Markov Chain Monte Carlo (MCMC) methods. Subsequently, the density of the unobserved infection time conditional on observed biomarkers was obtained by integrating out the random effects from the model fit. This probabilistic information about infection times was incorporated into the likelihood function for the transmission history and phylogenetic tree reconstruction, informed by the HIV sequence data. To critically test our methodology, we developed a coalescent-based simulation framework that generates phylogenies and biomarkers given a specific or general transmission history. Testing on many epidemiological scenarios showed that biomarker augmented phylogenetics can reach 90% accuracy under idealized situations. Under realistic within-host HIV-1 evolution, involving substantial within-host diversification and frequent transmission of multiple lineages, the average accuracy was at about 50% in transmission clusters involving 5-50 hosts. Realistic biomarker data added on average 16 percentage points over using the phylogeny alone. Using more biomarkers improved the performance. Shorter temporal spacing between transmission events and increased transmission heterogeneity reduced reconstruction accuracy, but larger clusters were not harder to get right. More sequence data per infected host also improved accuracy. 
We show that the method is robust to incomplete sampling and that adding biomarkers improves reconstructions of real HIV-1 transmission histories. The technology presented here could allow for better prevention programs by providing data for locally informed and tailored strategies.", "doi": "10.1371/journal.pcbi.1009741", "pmid": "36026480", "labels": {"Clinical Genomics Stockholm": "Service", "Bioinformatics Support for Computational Resources": "Service", "Clinical Genomics": "Service"}, "xrefs": [{"db": "pmc", "key": "PMC9455879"}, {"db": "pii", "key": "PCOMPBIOL-D-21-02230"}], "notes": [], "created": "2022-11-27T21:34:59.121Z", "modified": "2024-01-16T13:48:35.565Z"}, {"entity": "publication", "iuid": "6be7e5cf01494556bde5a433532cc836", "links": {"self": {"href": "https://publications.scilifelab.se/publication/6be7e5cf01494556bde5a433532cc836.json"}, "display": {"href": "https://publications.scilifelab.se/publication/6be7e5cf01494556bde5a433532cc836"}}, "title": "De novo spatiotemporal modelling of cell-type signatures in the developmental human heart using graph convolutional neural networks.", "authors": [{"family": "Marco Salas", "given": "Sergio", "initials": "S"}, {"family": "Yuan", "given": "Xiao", "initials": "X", "orcid": "0000-0001-9788-7717", "researcher": {"href": "https://publications.scilifelab.se/researcher/b115f1a0c1a844e89becb86996dbdf41.json"}}, {"family": "Sylven", "given": "Christer", "initials": "C"}, {"family": "Nilsson", "given": "Mats", "initials": "M", "orcid": "0000-0001-9985-0387", "researcher": {"href": "https://publications.scilifelab.se/researcher/197cf8ba83ba430f9712b2f4d94dc3e5.json"}}, {"family": "W\u00e4hlby", "given": "Carolina", "initials": "C", "orcid": "0000-0002-4139-7003", "researcher": {"href": "https://publications.scilifelab.se/researcher/c50194fbc8524d95b7152663ccf17f29.json"}}, {"family": "Partel", "given": "Gabriele", "initials": "G", "orcid": "0000-0002-4482-3119", "researcher": {"href": 
"https://publications.scilifelab.se/researcher/cca1e7c3e70a4d45aacc3a46d14b9cbe.json"}}], "type": "journal article", "published": "2022-08-00", "journal": {"title": "PLoS Comput. Biol.", "issn": "1553-7358", "issn-l": "1553-734X", "volume": "18", "issue": "8", "pages": "e1010366"}, "abstract": "With the emergence of high throughput single cell techniques, the understanding of the molecular and cellular diversity of mammalian organs have rapidly increased. In order to understand the spatial organization of this diversity, single cell data is often integrated with spatial data to create probabilistic cell maps. However, targeted cell typing approaches relying on existing single cell data achieve incomplete and biased maps that could mask the true diversity present in a tissue slide. Here we applied a de novo technique to spatially resolve and characterize cellular diversity of in situ sequencing data during human heart development. We obtained and made accessible well defined spatial cell-type maps of fetal hearts from 4.5 to 9 post conception weeks, not biased by probabilistic cell typing approaches. With our analysis, we could characterize previously unreported molecular diversity within cardiomyocytes and epicardial cells and identified their characteristic expression signatures, comparing them with specific subpopulations found in single cell RNA sequencing datasets. We further characterized the differentiation trajectories of epicardial cells, identifying a clear spatial component on it. 
All in all, our study provides a novel technique for conducting de novo spatial-temporal analyses in developmental tissue samples and a useful resource for online exploration of cell-type differentiation during heart development at sub-cellular image resolution.", "doi": "10.1371/journal.pcbi.1010366", "pmid": "35960757", "labels": {"BioImage Informatics": "Service", "Bioinformatics (NBIS)": "Service", "In Situ Sequencing": "Collaborative"}, "xrefs": [{"db": "pmc", "key": "PMC9401155"}, {"db": "pii", "key": "PCOMPBIOL-D-22-00047"}], "notes": [], "created": "2022-08-30T08:59:56.894Z", "modified": "2025-10-17T13:02:17.438Z"}, {"entity": "publication", "iuid": "ed41e1c9dfb3440fb7b5e40627afbbf6", "links": {"self": {"href": "https://publications.scilifelab.se/publication/ed41e1c9dfb3440fb7b5e40627afbbf6.json"}, "display": {"href": "https://publications.scilifelab.se/publication/ed41e1c9dfb3440fb7b5e40627afbbf6"}}, "title": "Machine learning-based investigation of the cancer protein secretory pathway.", "authors": [{"family": "Saghaleyni", "given": "Rasool", "initials": "R", "orcid": "0000-0003-0956-039X", "researcher": {"href": "https://publications.scilifelab.se/researcher/ebd08b713a894a6986d9101453f5ecd9.json"}}, {"family": "Sheikh Muhammad", "given": "Azam", "initials": "A", "orcid": "0000-0001-6037-7019", "researcher": {"href": "https://publications.scilifelab.se/researcher/a456bb9e432b4882814638c9019acbd2.json"}}, {"family": "Bangalore", "given": "Pramod", "initials": "P", "orcid": "0000-0002-5308-7061", "researcher": {"href": "https://publications.scilifelab.se/researcher/62e6651ffc4d4675bcec21d3754ad7b3.json"}}, {"family": "Nielsen", "given": "Jens", "initials": "J", "orcid": "0000-0002-9955-6003", "researcher": {"href": "https://publications.scilifelab.se/researcher/7a596e289be4438a8a2653b1f25fea8b.json"}}, {"family": "Robinson", "given": "Jonathan L", "initials": "JL", "orcid": "0000-0001-8567-5960", "researcher": {"href": 
"https://publications.scilifelab.se/researcher/b70b6d9b64fd45e882c4108aded013d4.json"}}], "type": "journal article", "published": "2021-04-00", "journal": {"title": "PLoS Comput. Biol.", "issn": "1553-7358", "volume": "17", "issue": "4", "pages": "e1008898", "issn-l": "1553-734X"}, "abstract": "Deregulation of the protein secretory pathway (PSP) is linked to many hallmarks of cancer, such as promoting tissue invasion and modulating cell-cell signaling. The collection of secreted proteins processed by the PSP, known as the secretome, is often studied due to its potential as a reservoir of tumor biomarkers. However, there has been less focus on the protein components of the secretory machinery itself. We therefore investigated the expression changes in secretory pathway components across many different cancer types. Specifically, we implemented a dual approach involving differential expression analysis and machine learning to identify PSP genes whose expression was associated with key tumor characteristics: mutation of p53, cancer status, and tumor stage. Eight different machine learning algorithms were included in the analysis to enable comparison between methods and to focus on signals that were robust to algorithm type. The machine learning approach was validated by identifying PSP genes known to be regulated by p53, and even outperformed the differential expression analysis approach. Among the different analysis methods and cancer types, the kinesin family members KIF20A and KIF23 were consistently among the top genes associated with malignant transformation or tumor stage. However, unlike most cancer types which exhibited elevated KIF20A expression that remained relatively constant across tumor stages, renal carcinomas displayed a more gradual increase that continued with increasing disease severity. 
Collectively, our study demonstrates the complementary nature of a combined differential expression and machine learning approach for analyzing gene expression data, and highlights key PSP components relevant to features of tumor pathophysiology that may constitute potential therapeutic targets.", "doi": "10.1371/journal.pcbi.1008898", "pmid": "33819271", "labels": {"Bioinformatics Support, Infrastructure and Training": "Collaborative", "Systems Biology": "Collaborative", "Bioinformatics (NBIS)": "Collaborative"}, "xrefs": [{"db": "pii", "key": "PCOMPBIOL-D-20-01453"}, {"db": "pmc", "key": "PMC8049480"}], "notes": [], "created": "2021-08-30T07:32:34.516Z", "modified": "2021-11-10T12:26:03.440Z"}, {"entity": "publication", "iuid": "03a9bb0f8fa14a829220610e1ac822c4", "links": {"self": {"href": "https://publications.scilifelab.se/publication/03a9bb0f8fa14a829220610e1ac822c4.json"}, "display": {"href": "https://publications.scilifelab.se/publication/03a9bb0f8fa14a829220610e1ac822c4"}}, "title": "Structural determination of Streptococcus pyogenes M1 protein interactions with human immunoglobulin G using integrative structural biology.", "authors": [{"family": "Khakzad", "given": "Hamed", "initials": "H", "orcid": "0000-0002-8556-0650", "researcher": {"href": "https://publications.scilifelab.se/researcher/a7b030c1452c4dc9b430deffdf493e6f.json"}}, {"family": "Happonen", "given": "Lotta", "initials": "L", "orcid": "0000-0002-5922-4549", "researcher": {"href": "https://publications.scilifelab.se/researcher/a7af0a6faf6a48e7937a644be472edbe.json"}}, {"family": "Karami", "given": "Yasaman", "initials": "Y", "orcid": "0000-0001-8413-2665", "researcher": {"href": "https://publications.scilifelab.se/researcher/a71f08b3f3794e66b7adae400c14f142.json"}}, {"family": "Chowdhury", "given": "Sounak", "initials": "S"}, {"family": "Bergdahl", "given": "Gizem Ert\u00fcrk", "initials": "GE", "orcid": "0000-0001-9719-6609", "researcher": {"href": 
"https://publications.scilifelab.se/researcher/f3671f82049b4e8e96fd45b0e0351642.json"}}, {"family": "Nilges", "given": "Michael", "initials": "M", "orcid": "0000-0002-1451-8092", "researcher": {"href": "https://publications.scilifelab.se/researcher/ebfd5336137d4eaabde8f4f9baed470a.json"}}, {"family": "Tran Van Nhieu", "given": "Guy", "initials": "G"}, {"family": "Malmstr\u00f6m", "given": "Johan", "initials": "J", "orcid": "0000-0002-2889-7169", "researcher": {"href": "https://publications.scilifelab.se/researcher/ad3c999da10c41e4a3afda2718815083.json"}}, {"family": "Malmstr\u00f6m", "given": "Lars", "initials": "L", "orcid": "0000-0001-9885-9312", "researcher": {"href": "https://publications.scilifelab.se/researcher/42e99f34fb854beb8d810b05fe941057.json"}}], "type": "journal article", "published": "2021-01-00", "journal": {"title": "PLoS Comput. Biol.", "issn": "1553-7358", "volume": "17", "issue": "1", "pages": "e1008169", "issn-l": "1553-734X"}, "abstract": "Streptococcus pyogenes (Group A streptococcus; GAS) is an important human pathogen responsible for mild to severe, life-threatening infections. GAS expresses a wide range of virulence factors, including the M family proteins. The M proteins allow the bacteria to evade parts of the human immune defenses by triggering the formation of a dense coat of plasma proteins surrounding the bacteria, including IgGs. However, the molecular level details of the M1-IgG interaction have remained unclear. Here, we characterized the structure and dynamics of this interaction interface in human plasma on the surface of live bacteria using integrative structural biology, combining cross-linking mass spectrometry and molecular dynamics (MD) simulations. We show that the primary interaction is formed between the S-domain of M1 and the conserved IgG Fc-domain. In addition, we show evidence for a so far uncharacterized interaction between the A-domain and the IgG Fc-domain. 
Both these interactions mimic the protein G-IgG interface of group C and G streptococcus. These findings underline a conserved scavenging mechanism used by GAS surface proteins that block the IgG-receptor (Fc\u03b3R) to inhibit phagocytic killing. We additionally show that we can capture Fab-bound IgGs in a complex background and identify XLs between the constant region of the Fab-domain and certain regions of the M1 protein engaged in the Fab-mediated binding. Our results elucidate the M1-IgG interaction network involved in inhibition of phagocytosis and reveal important M1 peptides that can be further investigated as future vaccine targets.", "doi": "10.1371/journal.pcbi.1008169", "pmid": "33411763", "labels": {"Structural Proteomics": "Collaborative"}, "xrefs": [{"db": "pii", "key": "PCOMPBIOL-D-20-01263"}, {"db": "pmc", "key": "PMC7817036"}], "notes": [], "created": "2021-12-03T12:29:48.339Z", "modified": "2021-12-03T12:29:48.505Z"}, {"entity": "publication", "iuid": "a5a1963fd516402bbc91e16b77379601", "links": {"self": {"href": "https://publications.scilifelab.se/publication/a5a1963fd516402bbc91e16b77379601.json"}, "display": {"href": "https://publications.scilifelab.se/publication/a5a1963fd516402bbc91e16b77379601"}}, "title": "A framework to assess the quality and impact of bioinformatics training across ELIXIR.", "authors": [{"family": "Gurwitz", "given": "Kim T", "initials": "KT", "orcid": "0000-0003-1992-5073", "researcher": {"href": "https://publications.scilifelab.se/researcher/3a5bf8910c22409fa3c8121755388d0f.json"}}, {"family": "Singh Gaur", "given": "Prakash", "initials": "P", "orcid": "0000-0003-3272-628X", "researcher": {"href": "https://publications.scilifelab.se/researcher/6e1090dcab7e4216960cf7798523179f.json"}}, {"family": "Bellis", "given": "Louisa J", "initials": "LJ", "orcid": "0000-0001-9581-870X", "researcher": {"href": "https://publications.scilifelab.se/researcher/865db60afe8b400aaf33d1c9c8648634.json"}}, {"family": "Larcombe", "given": 
"Lee", "initials": "L", "orcid": "0000-0003-3150-6445", "researcher": {"href": "https://publications.scilifelab.se/researcher/79426101140d4588bb672e154e5590e2.json"}}, {"family": "Alloza", "given": "Eva", "initials": "E", "orcid": "0000-0001-8385-9336", "researcher": {"href": "https://publications.scilifelab.se/researcher/a3c0c2cdf63e4276a9d843565613e40a.json"}}, {"family": "Balint", "given": "Balint Laszlo", "initials": "BL"}, {"family": "Botzki", "given": "Alexander", "initials": "A", "orcid": "0000-0001-6691-4233", "researcher": {"href": "https://publications.scilifelab.se/researcher/79448fc432054ea6af1a0c559b28567d.json"}}, {"family": "Dimec", "given": "Jure", "initials": "J", "orcid": "0000-0002-9525-9028", "researcher": {"href": "https://publications.scilifelab.se/researcher/aec05bd8a2734f6aa26fb489fce9c630.json"}}, {"family": "Dominguez Del Angel", "given": "Victoria", "initials": "V", "orcid": "0000-0002-5514-6651", "researcher": {"href": "https://publications.scilifelab.se/researcher/9a708fa6e9a44453b97cfc2089e06bec.json"}}, {"family": "Fernandes", "given": "Pedro L", "initials": "PL", "orcid": "0000-0003-2124-0241", "researcher": {"href": "https://publications.scilifelab.se/researcher/e3ca6e16ecb5403f8e344f1df671fa4a.json"}}, {"family": "Korpelainen", "given": "Eija", "initials": "E"}, {"family": "Krause", "given": "Roland", "initials": "R", "orcid": "0000-0001-9938-7126", "researcher": {"href": "https://publications.scilifelab.se/researcher/86196693e4fd402ab0493246d07d5ccb.json"}}, {"family": "Kuzak", "given": "Mateusz", "initials": "M", "orcid": "0000-0003-0087-6021", "researcher": {"href": "https://publications.scilifelab.se/researcher/821ca0fb14794a349ac838ab4a7abc1c.json"}}, {"family": "Le Pera", "given": "Loredana", "initials": "L", "orcid": "0000-0002-0076-9878", "researcher": {"href": "https://publications.scilifelab.se/researcher/0371f0fbbbbe4c0d8491e17d8aa8424a.json"}}, {"family": "Lesko\u0161ek", "given": "Brane", "initials": "B", "orcid": 
"0000-0001-5202-2349", "researcher": {"href": "https://publications.scilifelab.se/researcher/4d638a63af2043d2b7f8495efbd6b078.json"}}, {"family": "Lindvall", "given": "Jessica M", "initials": "JM", "orcid": "0000-0002-5042-8481", "researcher": {"href": "https://publications.scilifelab.se/researcher/78debae1bc714b11a97ecf9e9656f1eb.json"}}, {"family": "Marek", "given": "Diana", "initials": "D", "orcid": "0000-0002-9812-6351", "researcher": {"href": "https://publications.scilifelab.se/researcher/f1717147a0594eb6868f3f58a4b849e5.json"}}, {"family": "Martinez", "given": "Paula A", "initials": "PA", "orcid": "0000-0002-8990-1985", "researcher": {"href": "https://publications.scilifelab.se/researcher/25c618ae300f484bb3f589fd55eea782.json"}}, {"family": "Muyldermans", "given": "Tuur", "initials": "T", "orcid": "0000-0002-3926-7293", "researcher": {"href": "https://publications.scilifelab.se/researcher/be84fdc9e5d5439484f038bd28e0b97a.json"}}, {"family": "Nyg\u00e5rd", "given": "St\u00e5le", "initials": "S"}, {"family": "Palagi", "given": "Patricia M", "initials": "PM", "orcid": "0000-0001-9062-6303", "researcher": {"href": "https://publications.scilifelab.se/researcher/855d6f5c9a994ae49495522dfaf36281.json"}}, {"family": "Peterson", "given": "Hedi", "initials": "H", "orcid": "0000-0001-9951-5116", "researcher": {"href": "https://publications.scilifelab.se/researcher/be512162189d47059f1bd0755d6c16f8.json"}}, {"family": "Psomopoulos", "given": "Fotis", "initials": "F", "orcid": "0000-0002-0222-4273", "researcher": {"href": "https://publications.scilifelab.se/researcher/b4fd426018ab421c99e9044b31066cf9.json"}}, {"family": "Spiwok", "given": "Vojtech", "initials": "V", "orcid": "0000-0001-8108-2033", "researcher": {"href": "https://publications.scilifelab.se/researcher/1e89722bc45a4656906c79ba958cbc03.json"}}, {"family": "van Gelder", "given": "Celia W G", "initials": "CWG", "orcid": "0000-0002-0223-2329", "researcher": {"href": 
"https://publications.scilifelab.se/researcher/2533ac3cd3274486b5609ac643dc436b.json"}}, {"family": "Via", "given": "Allegra", "initials": "A"}, {"family": "Vidak", "given": "Marko", "initials": "M", "orcid": "0000-0001-7901-3936", "researcher": {"href": "https://publications.scilifelab.se/researcher/be1d0da74ff64f4a8ec7393577f9ba81.json"}}, {"family": "Wibberg", "given": "Daniel", "initials": "D", "orcid": "0000-0002-1331-4311", "researcher": {"href": "https://publications.scilifelab.se/researcher/771c66b20cb448e481140f476c6244ca.json"}}, {"family": "Morgan", "given": "Sarah L", "initials": "SL", "orcid": "0000-0001-9528-8323", "researcher": {"href": "https://publications.scilifelab.se/researcher/062588ade1a843ae846c4e2b96904849.json"}}, {"family": "Rustici", "given": "Gabriella", "initials": "G", "orcid": "0000-0003-3085-1271", "researcher": {"href": "https://publications.scilifelab.se/researcher/82710a270ab04a09af1550709db32b1c.json"}}], "type": "journal article", "published": "2020-07-00", "journal": {"title": "PLoS Comput. Biol.", "issn": "1553-7358", "volume": "16", "issue": "7", "pages": "e1007976", "issn-l": "1553-734X"}, "abstract": "ELIXIR is a pan-European intergovernmental organisation for life science that aims to coordinate bioinformatics resources in a single infrastructure across Europe; bioinformatics training is central to its strategy, which aims to develop a training community that spans all ELIXIR member states. In an evidence-based approach for strengthening bioinformatics training programmes across Europe, the ELIXIR Training Platform, led by the ELIXIR EXCELERATE Quality and Impact Assessment Subtask in collaboration with the ELIXIR Training Coordinators Group, has implemented an assessment strategy to measure quality and impact of its entire training portfolio. 
Here, we present ELIXIR's framework for assessing training quality and impact, which includes the following: specifying assessment aims, determining what data to collect in order to address these aims, and our strategy for centralised data collection to allow for ELIXIR-wide analyses. In addition, we present an overview of the ELIXIR training data collected over the past 4 years. We highlight the importance of a coordinated and consistent data collection approach and the relevance of defining specific metrics and answer scales for consortium-wide analyses as well as for comparison of data across iterations of the same course.", "doi": "10.1371/journal.pcbi.1007976", "pmid": "32702016", "labels": {"Bioinformatics Support and Infrastructure": null, "Bioinformatics Support, Infrastructure and Training": null, "Bioinformatics (NBIS)": null}, "xrefs": [{"db": "pii", "key": "PCOMPBIOL-D-19-01778"}, {"db": "pmc", "key": "PMC7377377"}], "notes": [], "created": "2020-12-15T09:16:34.028Z", "modified": "2021-11-10T12:49:57.221Z"}, {"entity": "publication", "iuid": "55e2c58b441b4c1aa01a4628da14281d", "links": {"self": {"href": "https://publications.scilifelab.se/publication/55e2c58b441b4c1aa01a4628da14281d.json"}, "display": {"href": "https://publications.scilifelab.se/publication/55e2c58b441b4c1aa01a4628da14281d"}}, "title": "Transcriptomic correlates of electrophysiological and morphological diversity within and across excitatory and inhibitory neuron classes.", "authors": [{"family": "Bomkamp", "given": "Claire", "initials": "C", "orcid": "0000-0002-7259-5248", "researcher": {"href": "https://publications.scilifelab.se/researcher/02eeffa98ae44ba7bd50a684b8b8374e.json"}}, {"family": "Tripathy", "given": "Shreejoy J", "initials": "SJ", "orcid": "0000-0002-1007-9061", "researcher": {"href": "https://publications.scilifelab.se/researcher/90d74d11894e45a8ad5dea1417ce7cc3.json"}}, {"family": "Bengtsson Gonzales", "given": "Carolina", "initials": "C", "orcid": 
"0000-0003-4084-2125", "researcher": {"href": "https://publications.scilifelab.se/researcher/11606ef2d7bb4c9789b5e17e66a78cf7.json"}}, {"family": "Hjerling-Leffler", "given": "Jens", "initials": "J", "orcid": "0000-0002-4539-1776", "researcher": {"href": "https://publications.scilifelab.se/researcher/51675f0ff9aa47d89d6b2eb84a14820a.json"}}, {"family": "Craig", "given": "Ann Marie", "initials": "AM"}, {"family": "Pavlidis", "given": "Paul", "initials": "P", "orcid": "0000-0002-0426-5028", "researcher": {"href": "https://publications.scilifelab.se/researcher/b3b8b29d6ff84d6eb75b37739405dbed.json"}}], "type": "journal article", "published": "2019-06-00", "journal": {"title": "PLoS Comput. Biol.", "issn": "1553-7358", "volume": "15", "issue": "6", "pages": "e1007113", "issn-l": "1553-734X"}, "abstract": "In order to further our understanding of how gene expression contributes to key functional properties of neurons, we combined publicly accessible gene expression, electrophysiology, and morphology measurements to identify cross-cell type correlations between these data modalities. Building on our previous work using a similar approach, we distinguished between correlations which were \"class-driven,\" meaning those that could be explained by differences between excitatory and inhibitory cell classes, and those that reflected graded phenotypic differences within classes. Taking cell class identity into account increased the degree to which our results replicated in an independent dataset as well as their correspondence with known modes of ion channel function based on the literature. We also found a smaller set of genes whose relationships to electrophysiological or morphological properties appear to be specific to either excitatory or inhibitory cell types. 
Next, using data from PatchSeq experiments, allowing simultaneous single-cell characterization of gene expression and electrophysiology, we found that some of the gene-property correlations observed across cell types were further predictive of within-cell type heterogeneity. In summary, we have identified a number of relationships between gene expression, electrophysiology, and morphology that provide testable hypotheses for future studies.", "doi": "10.1371/journal.pcbi.1007113", "pmid": "31211786", "labels": {"Eukaryotic Single Cell Genomics (ESCG)": "Service"}, "xrefs": [{"db": "pii", "key": "PCOMPBIOL-D-19-00128"}, {"db": "pmc", "key": "PMC6599125"}], "notes": [], "created": "2019-08-23T08:14:17.272Z", "modified": "2021-06-21T10:00:09.444Z"}, {"entity": "publication", "iuid": "98d213251a244250a7b81bdc1ffb591c", "links": {"self": {"href": "https://publications.scilifelab.se/publication/98d213251a244250a7b81bdc1ffb591c.json"}, "display": {"href": "https://publications.scilifelab.se/publication/98d213251a244250a7b81bdc1ffb591c"}}, "title": "RAVEN 2.0: A versatile toolbox for metabolic network reconstruction and a case study on Streptomyces coelicolor.", "authors": [{"family": "Wang", "given": "Hao", "initials": "H", "orcid": "0000-0001-7475-0136", "researcher": {"href": "https://publications.scilifelab.se/researcher/836b4fbf7ebd4f80abc84465c8f29a2e.json"}}, {"family": "Marci\u0161auskas", "given": "Simonas", "initials": "S", "orcid": "0000-0003-0770-6383", "researcher": {"href": "https://publications.scilifelab.se/researcher/fc39e10944224b08b91ba4f0c3382be5.json"}}, {"family": "S\u00e1nchez", "given": "Benjam\u00edn J", "initials": "BJ", "orcid": "0000-0001-6093-4110", "researcher": {"href": "https://publications.scilifelab.se/researcher/def9067db1624ff49e94cc67ba5d4208.json"}}, {"family": "Domenzain", "given": "Iv\u00e1n", "initials": "I", "orcid": "0000-0002-5322-2040", "researcher": {"href": 
"https://publications.scilifelab.se/researcher/3793e87625584ee2a31301297263a12a.json"}}, {"family": "Hermansson", "given": "Daniel", "initials": "D"}, {"family": "Agren", "given": "Rasmus", "initials": "R"}, {"family": "Nielsen", "given": "Jens", "initials": "J", "orcid": "0000-0002-9955-6003", "researcher": {"href": "https://publications.scilifelab.se/researcher/7a596e289be4438a8a2653b1f25fea8b.json"}}, {"family": "Kerkhoven", "given": "Eduard J", "initials": "EJ", "orcid": "0000-0002-3593-5792", "researcher": {"href": "https://publications.scilifelab.se/researcher/0df361f8014144e79479631fcbffad53.json"}}], "type": "journal article", "published": "2018-10-00", "journal": {"volume": "14", "issn": "1553-7358", "issue": "10", "pages": "e1006541", "title": "PLoS Comput. Biol.", "issn-l": "1553-734X"}, "abstract": "RAVEN is a commonly used MATLAB toolbox for genome-scale metabolic model (GEM) reconstruction, curation and constraint-based modelling and simulation. Here we present RAVEN Toolbox 2.0 with major enhancements, including: (i) de novo reconstruction of GEMs based on the MetaCyc pathway database; (ii) a redesigned KEGG-based reconstruction pipeline; (iii) convergence of reconstructions from various sources; (iv) improved performance, usability, and compatibility with the COBRA Toolbox. Capabilities of RAVEN 2.0 are here illustrated through de novo reconstruction of GEMs for the antibiotic-producing bacterium Streptomyces coelicolor. Comparison of the automated de novo reconstructions with the iMK1208 model, a previously published high-quality S. coelicolor GEM, exemplifies that RAVEN 2.0 can capture most of the manually curated model. The generated de novo reconstruction is subsequently used to curate iMK1208 resulting in Sco4, the most comprehensive GEM of S. coelicolor, with increased coverage of both primary and secondary metabolism. 
This increased coverage allows the use of Sco4 to predict novel genome editing targets for optimized secondary metabolites production. As such, we demonstrate that RAVEN 2.0 can be used not only for de novo GEM reconstruction, but also for curating existing models based on up-to-date databases. Both RAVEN 2.0 and Sco4 are distributed through GitHub to facilitate usage and further development by the community (https://github.com/SysBioChalmers/RAVEN and https://github.com/SysBioChalmers/Streptomyces_coelicolor-GEM).", "doi": "10.1371/journal.pcbi.1006541", "pmid": "30335785", "labels": {"Systems Biology": "Technology development", "Bioinformatics Support, Infrastructure and Training": "Technology development", "Bioinformatics (NBIS)": "Technology development"}, "xrefs": [{"db": "pii", "key": "PCOMPBIOL-D-18-00741"}, {"db": "pmc", "key": "PMC6207324"}, {"db": "figshare", "key": "10.6084/m9.figshare.6236903"}], "notes": [], "created": "2019-01-10T09:15:57.610Z", "modified": "2021-06-21T14:02:31.713Z"}, {"entity": "publication", "iuid": "b95add923913483c89ea2e12850ab207", "links": {"self": {"href": "https://publications.scilifelab.se/publication/b95add923913483c89ea2e12850ab207.json"}, "display": {"href": "https://publications.scilifelab.se/publication/b95add923913483c89ea2e12850ab207"}}, "title": "Fast and general tests of genetic interaction for genome-wide association studies", "authors": [{"family": "Fr\u00e5nberg", "given": "Mattias", "initials": "M"}, {"family": "Strawbridge", "given": "Rona J", "initials": "RJ"}, {"family": "Hamsten", "given": "Anders", "initials": "A"}, {"family": "de Faire", "given": "Ulf", "initials": "U"}, {"family": "Lagergren", "given": "Jens", "initials": "J"}, {"family": "Sennblad", "given": "Bengt", "initials": "B"}, {"family": null, "given": "", "initials": ""}], "type": "journal-article", "published": "2017-06-06", "journal": {"volume": "13", "issn": "1553-7358", "issue": "6", "pages": "e1005556", "title": "PLoS Comput. 
Biol.", "issn-l": "1553-734X"}, "abstract": null, "doi": "10.1371/journal.pcbi.1005556", "pmid": "28586362", "labels": {"National Genomics Infrastructure": "Service", "NGI Uppsala (SNP&SEQ Technology Platform)": "Service"}, "xrefs": [], "notes": [], "created": "2018-01-09T13:58:10.227Z", "modified": "2020-01-21T13:56:11.767Z"}, {"entity": "publication", "iuid": "9d31db38ff3e46ff85f710dfcf4c086c", "links": {"self": {"href": "https://publications.scilifelab.se/publication/9d31db38ff3e46ff85f710dfcf4c086c.json"}, "display": {"href": "https://publications.scilifelab.se/publication/9d31db38ff3e46ff85f710dfcf4c086c"}}, "title": "High GC content causes orphan proteins to be intrinsically disordered.", "authors": [{"family": "Basile", "given": "Walter", "initials": "W"}, {"family": "Sachenkova", "given": "Oxana", "initials": "O"}, {"family": "Light", "given": "Sara", "initials": "S"}, {"family": "Elofsson", "given": "Arne", "initials": "A"}], "type": "journal article", "published": "2017-03-00", "journal": {"volume": "13", "issn": "1553-7358", "issue": "3", "pages": "e1005375", "title": "PLoS Comput. Biol.", "issn-l": "1553-734X"}, "abstract": "De novo creation of protein coding genes involves the formation of short ORFs from noncoding regions; some of these ORFs might then become fixed in the population. These orphan proteins need to, at the bare minimum, not cause serious harm to the organism, meaning that they should for instance not aggregate. Therefore, although the creation of short ORFs could be truly random, the fixation should be subjected to some selective pressure. The selective forces acting on orphan proteins have been elusive, and contradictory results have been reported. In Drosophila young proteins are more disordered than ancient ones, while the opposite trend is present in yeast. To the best of our knowledge no valid explanation for this difference has been proposed. 
To solve this riddle we studied structural properties and age of proteins in 187 eukaryotic organisms. We find that, with the exception of length, there are only small differences in the properties between proteins of different ages. However, when we take the GC content into account we noted that it could explain the opposite trends observed for orphans in yeast (low GC) and Drosophila (high GC). GC content is correlated with codons coding for disorder promoting amino acids. This leads us to propose that intrinsic disorder is not a strong determining factor for fixation of orphan proteins. Instead these proteins largely resemble random proteins given a particular GC level. During evolution the properties of a protein change faster than the GC level causing the relationship between disorder and GC to gradually weaken.", "doi": "10.1371/journal.pcbi.1005375", "pmid": "28355220", "labels": {"Bioinformatics Support, Infrastructure and Training": "Collaborative", "Bioinformatics Support and Infrastructure": "Collaborative", "Bioinformatics (NBIS)": "Collaborative"}, "xrefs": [{"db": "pii", "key": "PCOMPBIOL-D-16-01242"}, {"db": "pmc", "key": "PMC5389847"}], "notes": [], "created": "2017-11-10T13:08:08.198Z", "modified": "2020-01-21T13:53:21.871Z"}, {"entity": "publication", "iuid": "4bfa34257a69416590d769c5a7af7e95", "links": {"self": {"href": "https://publications.scilifelab.se/publication/4bfa34257a69416590d769c5a7af7e95.json"}, "display": {"href": "https://publications.scilifelab.se/publication/4bfa34257a69416590d769c5a7af7e95"}}, "title": "Transcriptome profiling of Giardia intestinalis using strand-specific RNA-seq.", "authors": [{"family": "Franz\u00e9n", "given": "Oscar", "initials": "O"}, {"family": "Jerlstr\u00f6m-Hultqvist", "given": "Jon", "initials": "J"}, {"family": "Einarsson", "given": "Elin", "initials": "E"}, {"family": "Ankarklev", "given": "Johan", "initials": "J", "orcid": "0000-0003-3170-8493", "researcher": {"href": 
"https://publications.scilifelab.se/researcher/dcf5386930e34157bf76970b16bffc02.json"}}, {"family": "Ferella", "given": "Marcela", "initials": "M"}, {"family": "Andersson", "given": "Bj\u00f6rn", "initials": "B"}, {"family": "Sv\u00e4rd", "given": "Staffan G", "initials": "SG"}], "type": "journal article", "published": "2013-03-28", "journal": {"volume": "9", "issn": "1553-7358", "issue": "3", "pages": "e1003000", "title": "PLoS Comput. Biol.", "issn-l": "1553-734X"}, "abstract": "Giardia intestinalis is a common cause of diarrheal disease and it consists of eight genetically distinct genotypes or assemblages (A-H). Only assemblages A and B infect humans and are suggested to represent two different Giardia species. Correlations exist between assemblage type and host-specificity and to some extent symptoms. Phenotypical differences have been documented between assemblages and genome sequences are available for A, B and E. We have characterized and compared the polyadenylated transcriptomes of assemblages A, B and E. Four genetically different isolates were studied (WB (AI), AS175 (AII), P15 (E) and GS (B)) using paired-end, strand-specific RNA-seq. Most of the genome was transcribed in trophozoites grown in vitro, but at vastly different levels. RNA-seq confirmed many of the present annotations and refined the current genome annotation. Gene expression divergence was found to recapitulate the known phylogeny, and uncovered lineage-specific differences in expression. Polyadenylation sites were mapped for over 70% of the genes and revealed many examples of conserved and unexpectedly long 3' UTRs. 28 open reading frames were found in a non-transcribed gene cluster on chromosome 5 of the WB isolate. Analysis of allele-specific expression revealed a correlation between allele-dosage and allele expression in the GS isolate. Previously reported cis-splicing events were confirmed and global mapping of cis-splicing identified only one novel intron. 
These observations can possibly explain differences in host-preference and symptoms, and it will be the basis for further studies of Giardia pathogenesis and biology.", "doi": "10.1371/journal.pcbi.1003000", "pmid": "23555231", "labels": {"National Genomics Infrastructure": null, "NGI Uppsala (SNP&SEQ Technology Platform)": null}, "xrefs": [{"db": "pii", "key": "PCOMPBIOL-D-12-01572"}, {"db": "pmc", "key": "PMC3610916"}, {"db": "GEO", "key": "GSE36490"}], "notes": [], "created": "2017-05-04T15:01:26.084Z", "modified": "2021-07-07T13:58:48.946Z"}, {"entity": "publication", "iuid": "a63099587354485e85c31f1d7da135b0", "links": {"self": {"href": "https://publications.scilifelab.se/publication/a63099587354485e85c31f1d7da135b0.json"}, "display": {"href": "https://publications.scilifelab.se/publication/a63099587354485e85c31f1d7da135b0"}}, "title": "Integrative approach to pain genetics identifies pain sensitivity loci across diseases.", "authors": [{"family": "Ruau", "given": "David", "initials": "D"}, {"family": "Dudley", "given": "Joel T", "initials": "JT"}, {"family": "Chen", "given": "Rong", "initials": "R"}, {"family": "Phillips", "given": "Nicholas G", "initials": "NG"}, {"family": "Swan", "given": "Gary E", "initials": "GE"}, {"family": "Lazzeroni", "given": "Laura C", "initials": "LC"}, {"family": "Clark", "given": "J David", "initials": "JD"}, {"family": "Butte", "given": "Atul J", "initials": "AJ"}, {"family": "Angst", "given": "Martin S", "initials": "MS"}], "type": "journal article", "published": "2012-06-07", "journal": {"volume": "8", "issn": "1553-7358", "issue": "6", "pages": "e1002538", "title": "PLoS Comput. Biol.", "issn-l": "1553-734X"}, "abstract": "Identifying human genes relevant for the processing of pain requires difficult-to-conduct and expensive large-scale clinical trials. 
Here, we examine a novel integrative paradigm for data-driven discovery of pain gene candidates, taking advantage of the vast amount of existing disease-related clinical literature and gene expression microarray data stored in large international repositories. First, thousands of diseases were ranked according to a disease-specific pain index (DSPI), derived from Medical Subject Heading (MESH) annotations in MEDLINE. Second, gene expression profiles of 121 of these human diseases were obtained from public sources. Third, genes with expression variation significantly correlated with DSPI across diseases were selected as candidate pain genes. Finally, selected candidate pain genes were genotyped in an independent human cohort and prospectively evaluated for significant association between variants and measures of pain sensitivity. The strongest signal was with rs4512126 (5q32, ABLIM3, P\u200a=\u200a1.3\u00d710\u207b\u00b9\u2070) for the sensitivity to cold pressor pain in males, but not in females. Significant associations were also observed with rs12548828, rs7826700 and rs1075791 on 8q22.2 within NCALD (P\u200a=\u200a1.7\u00d710\u207b\u2074, 1.8\u00d710\u207b\u2074, and 2.2\u00d710\u207b\u2074 respectively). Our results demonstrate the utility of a novel paradigm that integrates publicly available disease-specific gene expression data with clinical data curated from MEDLINE to facilitate the discovery of pain-relevant genes. This data-derived list of pain gene candidates enables additional focused and efficient biological studies validating additional candidates.", "doi": "10.1371/journal.pcbi.1002538", "pmid": "22685391", "labels": {"Mutation Analysis Facility (MAF)": null}, "xrefs": [{"db": "pii", "key": "PCOMPBIOL-D-12-00028"}, {"db": "pmc", "key": "PMC3369906"}], "notes": [], "created": "2017-05-04T15:03:34.063Z", "modified": "2017-05-30T14:52:27.480Z"}], "created": "2017-05-09T09:12:51.417Z", "modified": "2020-11-27T13:14:05.369Z"}