4. Normalization
Sparse data and technical noise ("batch effects") can mask the signal of interest, so counts need to be normalized before samples are compared.
Causes:
RNA content (total amount and species) varies
Spike-ins for mRNAs:
92 ERCC molecules
8 mRNAs
whole transcriptome HeLa RNAs
Spike-ins for sRNAs:
52(?) sRNA sequences
Caveat:
Typically, only half of the spike-ins are detected.
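As a rough illustration of how spike-ins can be used for normalization, the sketch below scales each sample by its total spike-in count. It assumes the same amount of spike-in RNA was added to every sample, so that differences in spike-in totals reflect technical effects only. The function names and the toy counts are hypothetical and not taken from any particular package.

```python
def spike_in_size_factors(spike_counts):
    """Return one size factor per sample from spike-in counts.

    spike_counts[s][k] is the count of spike-in k in sample s.
    Assumption: an equal amount of spike-in RNA was added to every sample.
    """
    totals = [sum(sample) for sample in spike_counts]
    if any(total == 0 for total in totals):
        # A sample with no detected spike-ins cannot be scaled this way.
        raise ValueError("every sample needs at least one detected spike-in")
    mean_total = sum(totals) / len(totals)
    # Dividing by the mean keeps the size factors centred on 1.
    return [total / mean_total for total in totals]


def normalize(gene_counts, size_factors):
    """Divide each sample's gene counts by that sample's size factor."""
    return [[count / sf for count in sample]
            for sample, sf in zip(gene_counts, size_factors)]


# Toy data: 3 samples x 4 spike-ins; zeros mimic undetected spike-ins.
spike_counts = [
    [40, 60, 0, 0],
    [80, 100, 20, 0],
    [120, 150, 30, 0],
]
gene_counts = [
    [5, 10],
    [10, 20],
    [15, 30],
]

size_factors = spike_in_size_factors(spike_counts)
normalized = normalize(gene_counts, size_factors)
print(size_factors)
print(normalized)
```

Because only the spike-ins that are actually detected contribute to the totals, a low detection rate makes these factors noisier, which is one reason the caveat above matters.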
For single-cell RNA-seq (and exRNA-seq):
scran:
pools multiple cells (samples) to estimate cell-specific size factors in the presence of zero inflation and unbalanced differential expression of genes across groups of cells;
preclusters the cells (using e.g. rank-based clustering) into smaller, more homogeneous sets, so that pooling is done among similar cells.
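The pooling idea can be sketched in a few lines. This is a simplified illustration, not scran's implementation: it skips preclustering and any library-size adjustment, builds pools as sliding windows over the cells arranged in a ring, estimates one factor per pool as the median ratio to an average pseudo-cell, and then solves the resulting linear system by least squares to recover per-cell factors. The function names, the pool sizes, and the toy counts are all assumptions made for the example.

```python
from statistics import median


def solve_linear(a, b):
    """Solve the square system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("pool design is rank deficient")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        known = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - known) / m[r][r]
    return x


def pooled_size_factors(counts, pool_sizes=(2, 3)):
    """counts[g][c] is the count of gene g in cell c; returns a factor per cell."""
    n_cells = len(counts[0])
    # Reference pseudo-cell: average count of each gene across all cells.
    reference = [sum(row) / n_cells for row in counts]
    design, pool_factors = [], []
    for size in pool_sizes:
        for start in range(n_cells):
            # Sliding window of `size` neighbouring cells on a ring.
            members = [(start + k) % n_cells for k in range(size)]
            # Summing counts over the pool reduces the number of zeros.
            ratios = [sum(row[c] for c in members) / ref
                      for row, ref in zip(counts, reference) if ref > 0]
            design.append([1.0 if c in members else 0.0 for c in range(n_cells)])
            # The median ratio is robust to a minority of DE genes.
            pool_factors.append(median(ratios))
    # Each pool factor is modelled as the sum of its cells' factors;
    # solve the normal equations (A^T A) x = A^T b for the per-cell factors.
    ata = [[sum(row[i] * row[j] for row in design) for j in range(n_cells)]
           for i in range(n_cells)]
    atb = [sum(row[i] * f for row, f in zip(design, pool_factors))
           for i in range(n_cells)]
    return solve_linear(ata, atb)


# Toy data: 4 genes x 5 cells; each column is a scaled copy of one profile.
counts = [
    [0, 0, 0, 0, 0],
    [2, 3, 4, 5, 6],
    [4, 6, 8, 10, 12],
    [10, 15, 20, 25, 30],
]
factors = pooled_size_factors(counts)
print([round(f, 3) for f in factors])
```

In this noise-free toy example the recovered factors are proportional to the scaling applied to each cell. With real data the pool sums contain far fewer zeros than individual cells, which is what makes the median ratio usable in the first place.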
SCnorm
Census
If considering spike-ins:
SAMstrt
GRM
For more about normalization, imputation, and confounders (e.g. batch effects), see
Additional Tutorial: 4. QC and Normalization; 5. Imputation and Confounders