Chapter Contents
Scope of Writing
I want to write a practical coding book that mainly covers how to prepare data for analysis and how to handle analysis results for visualization. I will not explain the theory behind the analyses in detail (e.g., statistics, machine learning, or biological aspects), but I will point to reference materials when I mention them. I hope to keep this brief and straightforward, not to write an encyclopedia of genome science analysis 🧐.
Chapter 1. Understand Raw Data
In this chapter, I will talk about the process of generating the "raw" data from Illumina sequencing, with an emphasis on the universal principles shared by different technologies. I will also introduce the datasets used throughout the book:
A bulk RNA-seq dataset from ENCODE
A single-cell RNA-seq dataset from the Allen Institute for Brain Science
A single-cell snmC-seq2 dataset from my research project
Chapter 2. Work environment
In this chapter, I will walk through the steps of setting up my work environment, as if I had just received a new computer or server account.
How to install Python and all the genome science tools/packages/software?
How to do data analysis in Jupyter Notebook/JupyterLab?
Some tips on the system/shell level
Chapter 3. Data Cleaning
Chapter 4. Genome Science Data
In this chapter, I will explain the genome science data formats in detail. I will also introduce the essential tools associated with each data format, including their Python versions!
Chapter 5. Python Basics
In this chapter, I will summarize critical concepts of the Python language, such as "pointer" and "everything is an object". I will also list some language skills that make your code cleaner and significantly improve your efficiency.
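As a small taste of what "pointer" and "everything is an object" mean here, below is a minimal sketch. The variable and function names are only illustrative assumptions, not taken from the chapter itself.

```python
# Names in Python are references ("pointers") to objects, not boxes holding values.
a = [1, 2, 3]
b = a              # b refers to the same list object as a
b.append(4)        # mutating through b is visible through a as well

c = list(a)        # an explicit copy creates a new, independent list
c.append(5)        # this does not affect a or b

same_object = a is b            # two names, one object
copy_is_distinct = a is not c   # the copy is a different object


# "Everything is an object": functions and integers have a type too.
def double(x):
    return 2 * x


func_type_name = type(double).__name__   # name of the function's type
int_is_object = isinstance(1, object)    # even a plain integer is an object
```

The practical takeaway is that assigning a list (or a DataFrame, or an array) to a new name does not copy it, so an explicit copy is needed whenever the original must stay unchanged.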
Chapter 6. Data Visualization
In this chapter, I will mainly talk about using the matplotlib and seaborn packages to make publication-quality figures. I will explain the matplotlib package in detail and reproduce complex figures line by line from scratch.
Chapter 7. Use R in Python
In this chapter, I will talk about using rpy2 to integrate useful R packages into Python.