Summary

Input

The {output_dir} after mapping.

Generate MappingSummary

Once all the snakemake commands have finished successfully, you can run a single command to generate the final mapping summary of your library.

$ yap summary -h
usage: yap summary [-h] --output_dir OUTPUT_DIR [--notebook NOTEBOOK]

optional arguments:
  -h, --help            show this help message and exit

Required inputs:
  --output_dir OUTPUT_DIR, -o OUTPUT_DIR
                        Pipeline output directory. (default: None)
  --notebook NOTEBOOK, -nb NOTEBOOK
                        Notebook template for mapping summary, if not
                        provided, will use yap default template. (default:
                        None)

This command does two things:

  1. It creates a total mapping summary file that contains all the mapping metrics for all cells in your library. This file is saved to {output_dir}/stats/MappingSummary.csv.gz.

  2. It runs a notebook template (the yap default, or your own, passed with --notebook) to generate plots of some of the key mapping metrics.

You can customize the default notebook template (the one generated by yap summary) and provide it to yap summary the next time you run it.
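If you want to work with the summary table outside the notebook, the file can be read with the Python standard library alone. The following is a minimal sketch, assuming only the path stated above; the function name load_mapping_summary is hypothetical and not part of yap.

```python
import csv
import gzip
from pathlib import Path


def load_mapping_summary(output_dir):
    """Read {output_dir}/stats/MappingSummary.csv.gz into a list of dicts.

    Hypothetical helper for illustration; it is not part of yap.
    """
    summary_path = Path(output_dir) / "stats" / "MappingSummary.csv.gz"
    # "rt" opens the gzip stream in text mode so csv can parse it directly.
    with gzip.open(summary_path, "rt", newline="") as handle:
        return list(csv.DictReader(handle))
```

Each row is a dict keyed by column name, and all values are strings until you convert them. If pandas is installed, pandas.read_csv can also read the .csv.gz file directly.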

Evaluate Data Quality

In the mapping summary table, I provide dozens of metrics (as many as I can) for every cell. Many of these metrics are highly specific or redundant with one another. To evaluate cell quality (for filtering before analysis), you only need a few of the key mapping metrics. The other metrics are mainly useful for troubleshooting when the library has quality issues.
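Filtering on a few key metrics can be sketched as a range check per metric. The metric names and cutoffs below are placeholders invented for illustration; they are not the real column names of MappingSummary.csv.gz, and suitable thresholds depend on your library.

```python
def passes_filters(row, thresholds):
    """Return True if every metric in `thresholds` lies within [low, high]."""
    for metric, (low, high) in thresholds.items():
        value = float(row[metric])  # CSV values arrive as strings
        if not (low <= value <= high):
            return False
    return True


# Hypothetical metric names and cutoffs: replace them with real column
# names from your summary table and thresholds suited to your data.
thresholds = {
    "example_mapping_rate": (0.5, 1.0),
    "example_final_reads": (500_000, 10_000_000),
}

# Toy rows standing in for the parsed summary table.
rows = [
    {"cell": "cell_1", "example_mapping_rate": "0.72", "example_final_reads": "1500000"},
    {"cell": "cell_2", "example_mapping_rate": "0.31", "example_final_reads": "2000000"},
    {"cell": "cell_3", "example_mapping_rate": "0.80", "example_final_reads": "120000"},
]

passing = [row["cell"] for row in rows if passes_filters(row, thresholds)]
print(passing)
```

Keeping the thresholds in a single dict makes it easy to record which cutoffs were used for a given library.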

Output

  1. The mapping summary table, generated at {output_dir}/stats/MappingSummary.csv.gz

  2. A Jupyter notebook and an HTML file with visualizations of the mapping metrics, located in the same directory. The plate view uses each cell's original 384-well plate position as coordinates and color-codes the mapping metrics.

  3. See detailed annotation of mapping metrics on the next page.
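To illustrate the idea behind the plate view, the sketch below places per-cell values on a 16 x 24 grid, the standard layout of a 384-well plate (rows A to P, columns 1 to 24). It assumes each cell's position is available as a well label such as "B12"; how yap actually stores the plate position is not described here, and the function names are hypothetical.

```python
import string

PLATE_ROWS = string.ascii_uppercase[:16]  # "A".."P": 16 rows
PLATE_COLS = 24                           # columns 1..24


def well_to_coords(well):
    """Convert a well label such as "B12" to zero-based (row, col) indices."""
    row_letter, col_number = well[0].upper(), int(well[1:])
    if row_letter not in PLATE_ROWS or not 1 <= col_number <= PLATE_COLS:
        raise ValueError(f"not a valid 384-well position: {well!r}")
    return PLATE_ROWS.index(row_letter), col_number - 1


def plate_grid(values_by_well):
    """Arrange {well: value} into a 16 x 24 grid; empty wells stay None."""
    grid = [[None] * PLATE_COLS for _ in range(len(PLATE_ROWS))]
    for well, value in values_by_well.items():
        row, col = well_to_coords(well)
        grid[row][col] = value
    return grid
```

A grid like this can be passed to any heatmap function to color-code one metric per well, which makes plate-position effects (for example, a failed row or column) easy to spot.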
