Abstract
Background: Cell lines and cell types are extensively studied in biomedical research yielding to a significant amount of publications each year. Identifying cell lines and cell types precisely in publications is crucial for science reproducibility and knowledge integration. There are efforts for standardisation of the cell nomenclature based on ontology development to support FAIR principles of the cell knowledge. However, it is important to analyse the usage of cell nomenclature in publications at a large scale for understanding the level of uptake of cell nomenclature in literature by scientists. In this study, we analyse the usage of cell nomenclature, both in Vivo, and in Vitro in biomedical literature by using text mining methods and present our results. Results: We identified 59% of the cell type classes in the Cell Ontology and 13% of the cell line classes in the Cell Line Ontology in the literature. Our analysis showed that cell line nomenclature is much more ambiguous compared to the cell type nomenclature. However, trends indicate that standardised nomenclature for cell lines and cell types are being increasingly used in publications by the scientists. Conclusions: Our findings provide an insight to understand how experimental cells are described in publications and may allow for an improved standardisation of cell type and cell line nomenclature as well as can be utilised to develop efficient text mining applications on cell types and cell lines. All data generated in this study is available at https://github.com/shenay/CellNomenclatureStudy.
Original language | English (US) |
---|---|
Article number | 561 |
Journal | BMC BIOINFORMATICS |
Volume | 18 |
DOIs | |
State | Published - Dec 21 2017 |
Keywords
- Cell lines
- Cell nomenclature
- Cell types
- Ontologies
- Text mining
ASJC Scopus subject areas
- Structural Biology
- Biochemistry
- Molecular Biology
- Computer Science Applications
- Applied Mathematics