Key Mapping Metrics
Important mapping metrics for evaluating cell quality
snmC-seq 2/3
Below are the minimum mapping metrics I use to evaluate cell and library quality before any computational analysis is done. The cutoffs depend on how the library is sequenced; the numbers given here assume 16 plates (3072 wells) loaded in one MiSeq run or one NovaSeq run using an S4 flowcell. If your library is loaded differently (e.g., 32 plates in a NovaSeq run using an S4 flowcell), you need to change the cutoffs accordingly.
FinalmCReads: the final number of reads used in methylation calling, which therefore represents the real genome coverage.
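As a rough illustration of adjusting a read-count cutoff for a different loading, here is a minimal sketch. It assumes that reads per cell scale inversely with the number of plates loaded on the same flowcell; that linear scaling is my assumption, not a rule stated here, and the function name `scale_read_cutoff` is hypothetical. It only makes sense for read-count cutoffs such as FinalmCReads, not for rate-based metrics.

```python
def scale_read_cutoff(reference_cutoff, plates_loaded, reference_plates=16):
    """Scale a read-count cutoff for a different number of plates.

    Assumption (not from the text): the sequencing output is shared evenly,
    so reads per cell are inversely proportional to the plates loaded.
    """
    if plates_loaded <= 0:
        raise ValueError("plates_loaded must be a positive number")
    return reference_cutoff * reference_plates / plates_loaded


# Example: the NovaSeq cutoff of 500,000 reads, with 32 plates instead of 16.
novaseq_cutoff_32_plates = scale_read_cutoff(500_000, plates_loaded=32)
```

Under this assumption, doubling the number of plates halves the cutoff. Check the scaled value against the observed read distribution of your own library before relying on it.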
MiSeq cutoff: FinalmCReads > 100
NovaSeq cutoff: FinalmCReads > 500,000
The library average is ~1.6 M reads/cell.
mCCCFrac: the upper bound of the non-conversion rate. The mCCC fraction is usually close to the non-conversion rate measured by lambda DNA spike-in, but it is positively correlated with the cell's mCH fraction, so it can be a bit higher in cells with high mCH (e.g., some inhibitory neurons). I therefore recommend using different thresholds for neurons and other tissues:
For neuron-related samples, use mCCC fraction < 0.03
For other tissues known to have a low mCH fraction, use mCCC fraction < 0.01
R1MappingRate: this metric is species-specific; the library average is usually between 65% and 75%. I use R1MappingRate > 50% as the cutoff. A low mapping rate indicates potential contamination.
R2MappingRate: this metric is lower than R1MappingRate because the R2 base quality is not as good as R1. The average is usually about 10% lower than R1, but the two are highly correlated.
R1(R2)DuplicationRate: the library average is usually between 25% and 35%. I do not filter cells based on this metric.
Overall success rate: after filtering by FinalmCReads, mCCCFrac, and R1MappingRate, ~80% of wells (or cells) usually remain. The success rates of MiSeq and NovaSeq should be very close. If the MiSeq success rate is below 65% (< 2000 successful wells out of a total of 3072), I do not recommend proceeding to NovaSeq. There must be a quality issue, either during FACS or in the library preparation.
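The three filters and the success-rate check can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: each cell is represented as a dictionary keyed by the metric names used above, R1MappingRate is assumed to be stored as a percentage (0-100), and the function names and the `platform` / `sample_type` labels are hypothetical. The cutoff values themselves are the ones listed above for the 16-plate loading.

```python
# Cutoffs from the text above (16 plates / 3072 wells per run).
READ_CUTOFF = {"miseq": 100, "novaseq": 500_000}
MCCC_CUTOFF = {"neuronal": 0.03, "low_mch": 0.01}
R1_MAPPING_CUTOFF = 50.0  # percent; assumes the rate is stored as 0-100


def passes_qc(cell, platform="novaseq", sample_type="neuronal"):
    """Return True if one cell passes all three filters."""
    return (
        cell["FinalmCReads"] > READ_CUTOFF[platform]
        and cell["mCCCFrac"] < MCCC_CUTOFF[sample_type]
        and cell["R1MappingRate"] > R1_MAPPING_CUTOFF
    )


def success_rate(cells, total_wells=3072, **qc_options):
    """Fraction of all loaded wells whose cells pass QC."""
    n_pass = sum(passes_qc(cell, **qc_options) for cell in cells)
    return n_pass / total_wells


def proceed_to_novaseq(miseq_success_rate, min_rate=0.65):
    """Apply the 65% MiSeq success-rate rule of thumb."""
    return miseq_success_rate >= min_rate
```

For example, `success_rate(cells, platform="miseq", sample_type="neuronal")` gives the MiSeq success rate for a neuronal library, and passing that value to `proceed_to_novaseq` applies the 65% rule. Dividing by the total number of wells, rather than by the number of cells with data, keeps empty wells counted as failures.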
🚧 snmCT-seq
🚧 snm3C-seq