Single-Cell Normalization

Published

May 7, 2026

Purpose

Normalization adjusts raw count data so that expression values are comparable across cells with different total counts.

In scRNA-seq, cells differ in total UMI counts because of capture efficiency and sequencing depth (technical factors) as well as cell size and RNA content (biological factors). Normalization reduces the effect of these differences in total counts before downstream steps.

This page only covers normalization. Variable feature selection, scaling, PCA, and integration are separate steps.

What Normalization Does

The usual Seurat default is log-normalization:

raw counts -> divide by total counts per cell -> multiply by scale factor -> log1p transform

Conceptually:

normalized value = log1p(count / total_counts_per_cell * scale_factor)

NormalizeData() stores the normalized values in the data slot of the RNA assay (the data layer in Seurat v5) and leaves the counts slot unchanged.
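The formula above can be traced by hand on a toy matrix. This is a minimal base R sketch, not Seurat's code; the matrix values and the names `counts`, `scale_factor`, `proportions`, and `normalized` are made up for illustration.

```r
# Toy count matrix: genes in rows, cells in columns (values are made up).
# cell2 has the same composition as cell1 but twice the sequencing depth.
counts <- matrix(
  c(10, 0, 30,
    20, 0, 60),
  nrow = 3,
  dimnames = list(c("geneA", "geneB", "geneC"), c("cell1", "cell2"))
)

scale_factor <- 10000

# Step 1: divide each column (cell) by its total count.
total_counts <- colSums(counts)
proportions <- sweep(counts, 2, total_counts, "/")

# Step 2: multiply by the scale factor, then apply log1p.
normalized <- log1p(proportions * scale_factor)

print(normalized)
```

Because the two cells differ only in depth, their normalized columns come out identical, which is the effect normalization is meant to have.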

Normalization Methods

NormalizeData() supports several normalization methods.

Method	Common Use	Meaning
`LogNormalize`	default for RNA	normalize by total counts, multiply by scale factor, then `log1p`
`CLR`	often used for ADT in CITE-seq	centered log-ratio normalization
`RC`	relative counts, such as CPM with `scale.factor = 1e6`	normalize by total counts and multiply by scale factor, without log transformation

For basic scRNA-seq RNA analysis, LogNormalize is the usual starting point.

For CITE-seq ADT data, CLR is commonly used:

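seu <- Seurat::NormalizeData(
  object = seu,                 # Seurat object
  assay = "ADT",                # protein / antibody-derived tag assay
  normalization.method = "CLR"  # centered log-ratio
  # margin defaults to 1 (across features); Seurat's CITE-seq vignette sets margin = 2 (across cells)
)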
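To give a feel for what a centered log-ratio does, here is a base R sketch of the textbook form with a pseudocount of 1. The pseudocount, the function name `clr_sketch`, and the antibody counts are assumptions for illustration. Seurat's internal CLR uses a log1p-based variant and will not match these numbers exactly.

```r
# Toy ADT counts for one cell across four antibodies (values are made up).
adt <- c(CD3 = 120, CD4 = 80, CD8 = 0, CD19 = 5)

# Textbook CLR with a pseudocount of 1 (an assumption of this sketch):
# the log of each value relative to the geometric mean of all values.
clr_sketch <- function(x) {
  log_values <- log1p(x)           # log(x + 1) keeps zeros finite
  log_values - mean(log_values)    # subtracting the mean log equals dividing by the geometric mean
}

clr_values <- clr_sketch(adt)
print(round(clr_values, 3))
```

The centered values sum to zero, so each antibody is expressed relative to the others in the same cell rather than relative to the cell's total count.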

LogNormalize

Standard Seurat normalization:

seu <- Seurat::NormalizeData(
  object = seu,                         # Seurat object
  normalization.method = "LogNormalize", # default normalization method
  scale.factor = 10000                  # counts are scaled to 10,000 per cell
)

Common parameters:

Parameter	Meaning
`object`	Seurat object
`normalization.method`	method used to normalize counts
`scale.factor`	value that each cell's proportions are multiplied by before `log1p` (default 10000)

For most basic scRNA-seq analyses, LogNormalize with scale.factor = 10000 (the Seurat default) is a reasonable starting point.
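The effect of scale.factor can be seen without Seurat. The sketch below uses a hypothetical helper, `relative_counts`, and made-up counts for a single cell. It only illustrates the arithmetic and is not Seurat's implementation.

```r
# Made-up counts for one cell; the total is 100.
cell_counts <- c(geneA = 5, geneB = 15, geneC = 80)

# Hypothetical helper: proportion of the cell's total, times a scale factor.
relative_counts <- function(x, scale_factor) {
  x / sum(x) * scale_factor
}

per_10k <- relative_counts(cell_counts, 1e4)  # counts per 10,000
cpm     <- relative_counts(cell_counts, 1e6)  # counts per million (CPM)

# Log-normalized values under each scale factor.
log_10k <- log1p(per_10k)
log_cpm <- log1p(cpm)

print(rbind(per_10k, cpm))
print(rbind(log_10k, log_cpm))
```

The relative counts scale linearly with the scale factor, so the choice mainly sets the unit of the values. The log1p values are larger under the bigger scale factor but are not a simple multiple of each other.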

Check Normalized Data

After normalization, raw counts should still be available, and normalized data should be stored separately.

Extract raw counts:

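counts <- Seurat::GetAssayData(
  object = seu,      # Seurat object after NormalizeData()
  assay = "RNA",     # assay to read from
  slot = "counts"    # raw counts; Seurat v5 prefers layer = "counts" (slot is deprecated there)
)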

Extract normalized data:

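data <- Seurat::GetAssayData(
  object = seu,      # Seurat object after NormalizeData()
  assay = "RNA",     # assay to read from
  slot = "data"      # normalized values; Seurat v5 prefers layer = "data" (slot is deprecated there)
)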

Check dimensions:

dim(counts)
dim(data)

Both matrices should usually have the same dimensions, with features in rows and cells in columns.
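Beyond dimensions, one further check is possible for LogNormalize output: undoing the log1p should give per-cell sums equal to the scale factor. The helper below, `check_lognormalized`, is a hypothetical base R sketch demonstrated on toy matrices. It assumes every cell has a nonzero total count and converts to a dense matrix, which is only practical for small objects.

```r
# Hypothetical helper: compares shapes and checks that back-transformed
# normalized values sum to the scale factor in every cell (column).
check_lognormalized <- function(counts, data, scale_factor = 10000, tol = 1e-6) {
  same_shape <- identical(dim(counts), dim(data))
  back_sums <- colSums(expm1(as.matrix(data)))   # undo log1p, then sum per cell
  sums_ok <- all(abs(back_sums - scale_factor) < tol * scale_factor)
  list(same_shape = same_shape, sums_ok = sums_ok)
}

# Toy matrices standing in for the extracted counts and data (values are made up).
toy_counts <- matrix(
  c(10, 0, 30, 20, 0, 60),
  nrow = 3,
  dimnames = list(c("geneA", "geneB", "geneC"), c("cell1", "cell2"))
)
toy_data <- log1p(sweep(toy_counts, 2, colSums(toy_counts), "/") * 10000)

result <- check_lognormalized(toy_counts, toy_data)
print(result)
```

With a real object, the same call would take the `counts` and `data` matrices extracted above, provided the data were produced with LogNormalize and the same scale factor.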

Counts Versus Data

Keep the distinction clear:

Slot	Meaning	Use
`counts`	raw count matrix	QC, count-based modeling, pseudobulk
`data`	normalized expression	visualization, clustering workflow, marker exploration

Do not overwrite raw counts with normalized values.
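A quick heuristic for spotting an overwritten counts matrix is to test whether it still holds non-negative whole numbers. The function name `looks_like_raw_counts` and the toy matrices are assumptions for illustration. The check is a heuristic only, since some upstream tools produce non-integer count estimates.

```r
# Hypothetical heuristic: raw UMI counts are non-negative whole numbers,
# while log-normalized values generally are not.
looks_like_raw_counts <- function(m) {
  values <- as.matrix(m)   # dense copy; fine for small matrices
  all(values >= 0) && all(values == round(values))
}

# Toy matrices (values are made up).
toy_counts <- matrix(
  c(10, 0, 30, 20, 0, 60),
  nrow = 3,
  dimnames = list(c("geneA", "geneB", "geneC"), c("cell1", "cell2"))
)
toy_data <- log1p(sweep(toy_counts, 2, colSums(toy_counts), "/") * 10000)

print(looks_like_raw_counts(toy_counts))
print(looks_like_raw_counts(toy_data))
```

If the check fails on the matrix that is supposed to hold raw counts, it is worth confirming how that matrix was produced before using it for count-based modeling.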

Notes

Normalization does not remove all unwanted variation. Batch effects, cell cycle effects, sample differences, and other covariates can remain in the normalized data.

Normalization also does not choose variable genes. That is the next preprocessing step.