Single-cell profiling techniques, such as single-cell sequencing and cytometry, enable comprehensive, high-resolution characterization of the cellular heterogeneity observed in tumors, brain, and other tissues. Identifying and assigning cell types from the pool of profiled cells is the first step in analyzing scRNA-seq or cytometry data. To this end, we developed the SCINA algorithm, short for Semi-supervised Category Identification and Assignment. SCINA was originally designed to assign cell types based on single-cell RNA-seq data. However, it is general and can be applied in other scenarios where data of a similar format are available, such as bulk tumor RNA-seq data from patients.
SCINA leverages prior reference information and simultaneously performs cell type detection and assignment for known cell types.
SCINA can also identify novel, unknown cell types, whose exact identities can be determined in follow-up studies.
SCINA represents a “signature-to-category” approach, which is complementary to traditional “category-to-signature” approaches like t-SNE and unsupervised clustering.
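As a rough illustration of the signature-to-category idea, the Python sketch below scores each cell against marker-gene signatures supplied up front and labels cells that match no signature as "unknown". It is a simplified stand-in and does not reproduce SCINA's actual statistical model or its interface: the function name `assign_by_signature`, the `min_score` threshold, the marker lists, and the toy expression values are all hypothetical and chosen only for illustration.

```python
import numpy as np


def assign_by_signature(expr, genes, signatures, min_score=0.5):
    """Assign each cell to the best-matching signature, or to "unknown".

    expr       : genes x cells array of expression values
    genes      : list of gene names, one per row of expr
    signatures : dict mapping cell type -> list of marker genes
    min_score  : hypothetical cutoff below which a cell stays unassigned
    """
    expr = np.asarray(expr, dtype=float)

    # Standardize each gene across cells so markers are on a common scale.
    mean = expr.mean(axis=1, keepdims=True)
    std = expr.std(axis=1, keepdims=True)
    std[std == 0] = 1.0  # avoid division by zero for constant genes
    z = (expr - mean) / std

    row = {g: i for i, g in enumerate(genes)}
    types = list(signatures)

    # One score per (cell type, cell): mean z-score over that type's markers.
    scores = np.vstack(
        [z[[row[g] for g in signatures[t]]].mean(axis=0) for t in types]
    )

    best = scores.argmax(axis=0)
    best_score = scores.max(axis=0)
    return [
        types[b] if s >= min_score else "unknown"
        for b, s in zip(best, best_score)
    ]


# Toy data (assumed values): 4 genes x 6 cells.
genes = ["CD3E", "CD2", "CD19", "MS4A1"]
expr = [
    [9, 8, 0, 1, 0, 1],  # CD3E
    [8, 9, 1, 0, 1, 0],  # CD2
    [0, 1, 9, 8, 1, 0],  # CD19
    [1, 0, 8, 9, 0, 1],  # MS4A1
]
# Illustrative marker lists, not a curated reference.
signatures = {"T cell": ["CD3E", "CD2"], "B cell": ["CD19", "MS4A1"]}

labels = assign_by_signature(expr, genes, signatures)
print(labels)
```

In this toy input, the first two cells express the T cell markers, the next two express the B cell markers, and the last two express neither, so they fall below the cutoff and remain "unknown". The direction of inference is the point: the signatures come first and the categories follow from them, whereas clustering finds groups first and derives signatures afterwards.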