single-cell data
Single-cell data analysis and visualization can be separated into two levels: 1) the cell level and 2) the cluster or pseudo-bulk level. The second level is much like bulk-seq analysis, while the first level is technology-specific. In this book, I provide a summary table from my own project, in which we sequenced >100,000 single nuclei from 45 adult mouse brain regions using our single-nucleus methylome sequencing technology (Original paper and updated version).
I will not go through the analysis details because it's a rather large topic that is beyond the scope of this book. Check this manuscript (ADD LINK) if you are interested. For the purposes of this book, you can treat this dataset as equivalent to any other single-cell dataset, such as scRNA-seq or scATAC-seq.
The single-cell data I use here is a reduced dataset that is basically one large table, where each row holds one cell's information and each column is a kind of metadata or computed variable. This dataset by no means covers all aspects of single-cell data analysis; it is only meant to illustrate data visualization principles.
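As a rough illustration of that layout, here is a minimal sketch of such a table in pandas. The column names (`umap_0`, `umap_1`, `cluster`, `region`, `mCH_rate`) and all values are made up for the example; they are not the actual columns of the dataset used in this book.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n_cells = 6

# Hypothetical cell-level table: one row per cell, one column per variable.
# Every name and value below is invented for illustration only.
cells = pd.DataFrame(
    {
        "umap_0": rng.normal(size=n_cells),               # 2D embedding, x coordinate
        "umap_1": rng.normal(size=n_cells),               # 2D embedding, y coordinate
        "cluster": ["Exc", "Exc", "Inh", "Inh", "Glia", "Glia"],  # categorical label
        "region": ["MOp", "SSp", "MOp", "HIP", "HIP", "SSp"],     # categorical metadata
        "mCH_rate": rng.uniform(0.0, 0.05, size=n_cells),  # continuous computed variable
    },
    index=[f"cell_{i}" for i in range(n_cells)],
)
cells.index.name = "cell_id"

print(cells.shape)   # (number of cells, number of variables)
print(cells.dtypes)  # numeric columns vs. object (categorical) columns
```

Whether a column is numeric or categorical matters later, because it decides how the variable can be mapped to color, size, or position in a plot.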
To do great data visualization, you first need to be skilled at data cleaning, because the first step of any data visualization is to reformat or summarize your data into a shape that suits your visualization purpose. Therefore, I use this complex single-cell table to show you, step by step, how those beautiful figures in single-cell papers are made from scratch.
The single-cell analysis part, that is, how this table is generated, is not covered, but I provide some resources if you are interested (warning: it is a large topic).
I will remake all these figures from scratch; all of them are plotted purely with Python.
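To make "reformat or summarize" a bit more concrete, the sketch below goes from a cell-level table to two cluster-level tables with pandas. The columns (`cluster`, `region`, `mCH_rate`) and values are hypothetical, and this is only one of many ways to do such a summary, not necessarily the one used for the figures in this book.

```python
import pandas as pd

# Hypothetical cell-level table (made-up columns and values).
cells = pd.DataFrame(
    {
        "cluster": ["Exc", "Exc", "Exc", "Inh", "Inh", "Glia"],
        "region": ["MOp", "MOp", "SSp", "MOp", "HIP", "HIP"],
        "mCH_rate": [0.02, 0.04, 0.03, 0.01, 0.03, 0.005],
    }
)

# Summarize: one row per cluster, with the cell count and the mean of a
# numeric variable (cell level -> cluster level).
cluster_summary = cells.groupby("cluster").agg(
    n_cells=("mCH_rate", "size"),
    mean_mCH=("mCH_rate", "mean"),
)

# Reformat: count cells per (cluster, region) pair as a wide table, then
# normalize each row so it holds the regional composition of one cluster.
region_counts = pd.crosstab(cells["cluster"], cells["region"])
region_fraction = region_counts.div(region_counts.sum(axis=1), axis=0)

print(cluster_summary)
print(region_fraction)
```

A table like `region_fraction`, with one row per cluster and fractions that sum to 1, is the kind of shape a composition plot such as a pie chart needs as input.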
How to visually map a variable
How to control color, size, and other aspects of the scatter plot
How a simple plot type (pie chart) can be combined into a fancy new plot
This plot needs a lot of data cleaning techniques
How to control panel layout and combine multiple plots into the same panel
How to build a tree (or any other structure) from scratch
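As a small preview of the first few topics above, here is a minimal matplotlib sketch that maps a categorical variable to color, maps a continuous variable to both color and size, and places the two scatter plots side by side. The data are random numbers and the variable names (`cluster`, `value`) are invented for the example; the actual figures in this book are built from the real table.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
n_cells = 300

# Made-up 2D coordinates and per-cell variables.
x = rng.normal(size=n_cells)
y = rng.normal(size=n_cells)
cluster = rng.integers(0, 3, size=n_cells)  # categorical variable, coded 0/1/2
value = rng.uniform(0, 1, size=n_cells)     # continuous variable in [0, 1)

# Two panels in one figure; constrained_layout keeps labels from overlapping.
fig, (ax_cat, ax_num) = plt.subplots(
    ncols=2, figsize=(8, 4), constrained_layout=True
)

# Panel 1: categorical variable -> one fixed color per category.
palette = ["tab:blue", "tab:orange", "tab:green"]
for code, color in enumerate(palette):
    mask = cluster == code
    ax_cat.scatter(x[mask], y[mask], s=5, color=color, label=f"cluster {code}")
ax_cat.legend(markerscale=3, frameon=False)
ax_cat.set_title("Categorical variable as color")

# Panel 2: continuous variable -> colormap and marker size.
points = ax_num.scatter(
    x, y, c=value, s=2 + 20 * value, cmap="viridis", linewidths=0
)
fig.colorbar(points, ax=ax_num, label="value")
ax_num.set_title("Continuous variable as color and size")

# Embedding axes carry no units, so hide the ticks.
for ax in (ax_cat, ax_num):
    ax.set_xticks([])
    ax.set_yticks([])

# fig.savefig("scatter_sketch.png", dpi=150)  # uncomment to write the figure to disk
```

Drawing each category in its own `scatter` call is a deliberate choice here: it gives one legend entry per category for free, whereas a continuous variable is better served by a single call plus a colorbar.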
All files are included in the GitHub repository of this book.
Although single-cell analysis is not covered in this book, I recommend some resources here for learning more about single-cell RNA-seq data analysis. For epigenomic data, such as DNA methylation, I plan to write a detailed book in the future.
Scanpy, the most popular Python package for single-cell data analysis
Single-cell RNA-seq data analysis tutorial from the Theis lab
Seurat, the most popular R package for single-cell data analysis