Cell Composition
Purpose
After cell annotation, a natural next step is to examine cell type composition.
Typical questions:
- how many cells are assigned to each cell type?
- what fraction of each sample is each cell type?
- does cell type composition differ between conditions?
Cell composition summaries are descriptive. Claims about differences between conditions require a proper statistical model.
Required Metadata
The Seurat object should contain cell type labels:
table(seu$cell_type)
It should also contain sample or condition metadata:
table(seu$sample)
table(seu$condition)
Use sample-level metadata when comparing groups. Do not rely only on pooled cell counts across all samples.
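To illustrate why pooled counts can mislead, here is a small sketch with made-up numbers for two samples of very different size. The data frame `toy` and its values are hypothetical and do not come from any dataset. The pooled fraction is dominated by the larger sample, while the mean of the per-sample fractions weights each sample equally.

```r
# Toy per-sample counts (made-up numbers, for illustration only).
toy <- data.frame(
  sample   = c("s1", "s2"),
  n_target = c(900, 10),    # cells of the cell type of interest
  n_total  = c(9000, 1000)  # all cells in the sample
)

# Pooled fraction: all cells lumped together, dominated by the large sample.
pooled_fraction <- sum(toy$n_target) / sum(toy$n_total)

# Per-sample fractions, then their mean: each sample counts equally.
sample_fractions <- toy$n_target / toy$n_total
mean_sample_fraction <- mean(sample_fractions)

pooled_fraction       # 0.091
mean_sample_fraction  # 0.055
```

The two summaries answer different questions, which is why the sample, not the pooled cell population, is the safer unit when comparing groups.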
Cell Type Counts
Count cells by cell type:
cell_type_counts <- as.data.frame(
  table(seu$cell_type)
)
colnames(cell_type_counts) <- c("cell_type", "n_cells")
cell_type_counts
Count cells by sample and cell type:
sample_cell_type_counts <- as.data.frame(
  table(seu$sample, seu$cell_type)
)
colnames(sample_cell_type_counts) <- c(
  "sample",
  "cell_type",
  "n_cells"
)
sample_cell_type_counts
Cell Type Proportions
Calculate cell type proportions within each sample:
sample_cell_type_props <- sample_cell_type_counts |>
  dplyr::group_by(sample) |>
  dplyr::mutate(
    prop = n_cells / sum(n_cells)
  ) |>
  dplyr::ungroup()
sample_cell_type_props
Add condition metadata if each sample has one condition:
sample_metadata <- seu@meta.data |>
  dplyr::select(sample, condition) |>
  dplyr::distinct()
sample_cell_type_props <- sample_cell_type_props |>
  dplyr::left_join(
    sample_metadata,
    by = "sample"
  )
Plot Composition
Stacked bar plot by sample:
ggplot2::ggplot(
  data = sample_cell_type_props,
  mapping = ggplot2::aes(
    x = sample,
    y = prop,
    fill = cell_type
  )
) +
  ggplot2::geom_col(width = 0.8) +
  ggplot2::scale_y_continuous(labels = scales::percent) +
  ggplot2::labs(
    x = "Sample",
    y = "Cell type proportion",
    fill = "Cell type"
  ) +
  ggplot2::theme_classic()
Stacked bar plot grouped by condition:
ggplot2::ggplot(
  data = sample_cell_type_props,
  mapping = ggplot2::aes(
    x = condition,
    y = prop,
    fill = cell_type
  )
) +
  # position = "fill" rescales each condition's stack to 1, so the total
  # height of each colour is the mean of the per-sample proportions.
  ggplot2::geom_col(
    position = "fill",
    width = 0.8
  ) +
  ggplot2::scale_y_continuous(labels = scales::percent) +
  ggplot2::labs(
    x = "Condition",
    y = "Cell type proportion",
    fill = "Cell type"
  ) +
  ggplot2::theme_classic()
Compare One Cell Type
Plot sample-level proportions for one cell type:
target_cell_type <- "T cell"
one_cell_type <- sample_cell_type_props |>
  dplyr::filter(cell_type == target_cell_type)
ggplot2::ggplot(
  data = one_cell_type,
  mapping = ggplot2::aes(
    x = condition,
    y = prop,
    color = condition
  )
) +
  ggplot2::geom_boxplot(outlier.shape = NA) +
  ggplot2::geom_jitter(width = 0.15, height = 0) +
  ggplot2::scale_y_continuous(labels = scales::percent) +
  ggplot2::labs(
    x = "Condition",
    y = paste(target_cell_type, "proportion")
  ) +
  ggplot2::theme_classic()
Each point should represent a sample, not an individual cell.
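One way to confirm that the plotted table has one row per sample is sketched below. `one_cell_type_toy` is a hypothetical stand-in for `one_cell_type` with made-up values; the same two checks could be run on the real table, assuming it has `sample` and `condition` columns as above.

```r
# Toy stand-in for one_cell_type (made-up values): one row per sample.
one_cell_type_toy <- data.frame(
  sample    = c("s1", "s2", "s3", "s4"),
  condition = c("control", "control", "treated", "treated"),
  prop      = c(0.12, 0.15, 0.22, 0.25)
)

# TRUE when no sample appears more than once, i.e. one point per sample.
one_row_per_sample <- anyDuplicated(one_cell_type_toy$sample) == 0
one_row_per_sample

# Number of samples (points) that each box in the plot would summarize.
samples_per_condition <- table(one_cell_type_toy$condition)
samples_per_condition
```

If a sample appears more than once, or a condition has only one or two samples, the boxplot is summarizing something other than a set of biological replicates and should be read with caution.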
Note
Cell composition plots are useful for exploration.
For formal differential abundance analysis, use methods designed for sample-level or replicate-aware testing. Do not treat cells from the same sample as independent biological replicates.
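As a minimal sketch of what sample-level testing can look like, the example below fits a quasibinomial GLM from base R to one row per sample. This is one simple, replicate-aware option chosen for illustration; it is not a substitute for dedicated differential abundance methods. The data frame `toy_counts`, its column names, and all numbers are made up.

```r
# Toy sample-level counts (made-up numbers, for illustration only).
toy_counts <- data.frame(
  sample    = paste0("s", 1:6),
  condition = factor(rep(c("control", "treated"), each = 3)),
  n_target  = c(120, 150, 135, 210, 260, 240),      # cells of the target type
  n_total   = c(1000, 1100, 950, 1050, 1200, 1000)  # all cells in the sample
)

# One row per sample: the sample, not the cell, is the unit of replication.
# The quasibinomial family allows sample-to-sample variation beyond
# what a plain binomial model would assume.
fit <- glm(
  cbind(n_target, n_total - n_target) ~ condition,
  family = quasibinomial(),
  data = toy_counts
)

# Coefficient table; the "conditiontreated" row is the estimated
# difference in log-odds between treated and control samples.
coef_table <- summary(fit)$coefficients
coef_table
```

With only a few samples per group, estimates like these are unstable, so the result is best read as a check on the exploratory plots.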