Lindman K, Rose JF, Lindvall M, Lundström C, Treanor D
J Pathol Inform 10 (-) 22 [2019-07-23; online 2019-07-23]
Digital pathology is today a widely used technology, and the digitization of microscopic slides into whole slide images (WSIs) allows machine learning algorithms to be used as a tool in the diagnostic process. In recent years, "deep learning" algorithms for image analysis have been applied to digital pathology with great success. Training these algorithms requires a large volume of high-quality images and image annotations. Such large image collections are a potent source of information, and using and sharing that information requires standardizing the content through a consistent terminology. The aim of this project was to develop a pilot dataset of exhaustively annotated WSIs of normal and abnormal human tissue and to link the annotations to appropriate ontological information. Several biomedical ontologies and controlled vocabularies were investigated in order to select the one most suitable for this project. The selection criteria required an ontology that covered anatomical locations, histological subcompartments, histopathologic diagnoses, histopathologic terms, and generic terms such as normal, abnormal, and artifact. WSIs of normal and abnormal tissue from 50 colon resections and 69 skin excisions, diagnosed 2015-2016 at the Department of Clinical Pathology in Linköping, were randomly collected. These images were manually and exhaustively annotated at the level of major subcompartments, including normal or abnormal findings and artifacts. Systematized Nomenclature of Medicine Clinical Terms (SNOMED CT) was chosen, and the annotations were linked to its codes and terms. Two hundred WSIs were collected and annotated, resulting in 17,497 annotations covering a total area of 302.19 cm², equivalent to 107.7 gigapixels. Ninety-five unique SNOMED CT codes were used. The time taken to annotate a WSI varied from 45 s to over 360 min, with a total time of approximately 360 h.
This work resulted in a dataset of 200 exhaustively annotated WSIs of normal and abnormal tissue from the colon and skin, and it has informed plans to build a comprehensive library of annotated WSIs. SNOMED CT was found to be the best ontology for annotation labeling. This project also demonstrates the need for further development of annotation tools to make the annotation process more efficient.
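The linkage between annotations and ontology codes described above can be illustrated with a minimal Python sketch. It is not the authors' data format: the `Annotation` fields, the `summarize` helper, the slide identifiers, and the `CODE-A`/`CODE-B` values are all hypothetical placeholders (real records would carry SNOMED CT concept IDs and terms). It only shows how summary figures of the kind reported here, such as annotation count, number of unique codes, and total annotated area, could be derived from such records.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    """One annotated region on a whole slide image (hypothetical schema)."""
    slide_id: str      # hypothetical slide identifier
    concept_code: str  # placeholder standing in for a SNOMED CT concept ID
    concept_term: str  # human-readable term paired with the code
    area_mm2: float    # area of the annotated region in square millimetres


def summarize(annotations):
    """Return count, unique-code count, total area (cm^2), and per-code counts."""
    per_code = Counter(a.concept_code for a in annotations)
    total_area_cm2 = sum(a.area_mm2 for a in annotations) / 100.0  # 100 mm^2 = 1 cm^2
    return {
        "count": len(annotations),
        "unique_codes": len(per_code),
        "total_area_cm2": total_area_cm2,
        "per_code": dict(per_code),
    }


# Placeholder records; the codes and terms are illustrative, not real SNOMED CT entries.
example = [
    Annotation("slide-001", "CODE-A", "normal tissue (placeholder)", 150.0),
    Annotation("slide-001", "CODE-B", "artifact (placeholder)", 50.0),
    Annotation("slide-002", "CODE-A", "normal tissue (placeholder)", 100.0),
]

summary = summarize(example)
```

Storing the code and the term together on each annotation keeps the label readable while leaving the code as the stable key for counting and sharing.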