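Fits one or more logistic regression models for each exposure variable and returns a tidy result table suitable for downstream forest plots. By default (base = TRUE), two standard adjustment models are included, Unadjusted and Age and sex adjusted; a Fully adjusted model is added when covariates is supplied.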
Usage
assoc_logistic(
data,
outcome_col,
exposure_col,
covariates = NULL,
base = TRUE,
test = c("wald", "lrt"),
ci_method = c("wald", "profile"),
conf_level = 0.95
)
assoc_logit(
data,
outcome_col,
exposure_col,
covariates = NULL,
base = TRUE,
test = c("wald", "lrt"),
ci_method = c("wald", "profile"),
conf_level = 0.95
)
Arguments
- data
(data.frame or data.table) Analysis dataset.
- outcome_col
(character) Binary outcome column (0/1 or TRUE/FALSE).
- exposure_col
(character) One or more exposure variable names.
- covariates
(character or NULL) Covariate column names for the Fully adjusted model. Default: NULL.
- base
(logical) Include Unadjusted and Age and sex adjusted models. Default: TRUE.
- test
(character) P-value method for logistic models: "wald" (default) or "lrt".
- ci_method
(character) CI calculation method: "wald" (default) or "profile".
- conf_level
(numeric) Confidence level. Default: 0.95.
Value
A data.table with one row per exposure \(\times\) term \(\times\) model combination, and columns:
- exposure
Exposure variable name.
- term
Coefficient name (e.g. "bmi_categoryObese").
- model
Ordered factor: Unadjusted < Age and sex adjusted < Fully adjusted.
- n
Participants in model (after NA removal).
- n_cases
Number of cases (outcome = 1) in model.
- OR
Odds ratio (point estimate).
- CI_lower
Lower confidence bound.
- CI_upper
Upper confidence bound.
- p_value
P-value from the method selected by test.
- OR_label
Formatted string, e.g. "1.23 (1.05-1.44)".
Details
The three models are:
- Unadjusted: no covariates (crude).
- Age and sex adjusted: age and sex are auto-detected from standard UKB names (p21022 / p31) or decoded names (age_at_recruitment / sex). Errors if either column cannot be found.
- Fully adjusted: the covariates supplied via the covariates argument. Only run when covariates is non-NULL.
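The model definitions above can be illustrated with plain glm() calls. This is a minimal sketch in base R on simulated data, not the package's internal code; the data frame toy and its columns (age, sex, bmi, smoker, case) are invented for illustration, and the Fully adjusted model here uses exactly the covariates listed, as the description states.

```r
# Simulated data; all column names are hypothetical stand-ins.
set.seed(1)
n <- 400
toy <- data.frame(
  age    = round(runif(n, 40, 70)),
  sex    = factor(sample(c("Female", "Male"), n, replace = TRUE)),
  bmi    = rnorm(n, mean = 27, sd = 4),
  smoker = factor(sample(c("Never", "Previous", "Current"), n, replace = TRUE),
                  levels = c("Never", "Previous", "Current"))
)
toy$case <- rbinom(n, 1, plogis(-4 + 0.04 * toy$age + 0.05 * toy$bmi +
                                 0.3 * (toy$smoker == "Current")))

exposure   <- "smoker"
covariates <- c("age", "sex", "bmi")  # plays the role of the covariates argument

# One formula per model, in the order of the ordered factor in the result.
formulas <- list(
  "Unadjusted"           = reformulate(exposure, response = "case"),
  "Age and sex adjusted" = reformulate(c(exposure, "age", "sex"), response = "case"),
  "Fully adjusted"       = reformulate(c(exposure, covariates), response = "case")
)

fits <- lapply(formulas, function(f) glm(f, family = binomial(), data = toy))

# Keep only the exposure coefficients and express them as odds ratios.
or_table <- do.call(rbind, lapply(names(fits), function(m) {
  est  <- coef(fits[[m]])
  keep <- startsWith(names(est), exposure)
  data.frame(model = m, term = names(est)[keep], OR = unname(exp(est[keep])),
             n = nobs(fits[[m]]))
}))
or_table$model <- factor(or_table$model, levels = names(formulas), ordered = TRUE)
or_table
```

Each model contributes one row per non-reference level of the exposure, which matches the exposure \(\times\) term \(\times\) model layout of the returned table.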
Outcome coding: outcome_col may be logical
(TRUE/FALSE) or integer/numeric (0/1).
Logical values are converted to integer internally.
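The conversion described above amounts to the following. This is a small sketch under the assumption that any other coding should be rejected; the helper name as_binary_outcome is hypothetical and not part of the package.

```r
# Hypothetical helper: coerce a binary outcome to 0/1 integer, keeping NA.
as_binary_outcome <- function(x) {
  if (is.logical(x)) {
    return(as.integer(x))            # TRUE -> 1L, FALSE -> 0L
  }
  if (is.numeric(x) && all(x[!is.na(x)] %in% c(0, 1))) {
    return(as.integer(x))
  }
  stop("outcome must be logical or coded 0/1")
}

as_binary_outcome(c(TRUE, FALSE, NA))
as_binary_outcome(c(0, 1, 1))
```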
CI methods:
- "wald" (default): fast, appropriate for large UKB samples.
- "profile": profile likelihood CI via confint.glm(); slower but more accurate for small or sparse data.
P-value method: test = "wald" returns coefficient-level Wald
p-values from summary.glm(). test = "lrt" returns the
exposure-level likelihood-ratio p-value from single-term deletion
(drop1(..., test = "Chisq")); for factor exposures, the same overall
exposure p-value is repeated across the non-reference level rows.
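The CI and p-value options described above correspond to standard base R calls. The following sketch on simulated data shows one way to obtain each quantity; it is not the package's implementation, and the data frame toy and the object names are invented for illustration.

```r
# Simulated data with a three-level factor exposure (hypothetical names).
set.seed(3)
n <- 300
toy <- data.frame(
  age    = round(runif(n, 40, 70)),
  smoker = factor(sample(c("Never", "Previous", "Current"), n, replace = TRUE),
                  levels = c("Never", "Previous", "Current"))
)
toy$case <- rbinom(n, 1, plogis(-1 + 0.02 * (toy$age - 55) +
                                 0.5 * (toy$smoker == "Current")))

fit       <- glm(case ~ smoker + age, family = binomial(), data = toy)
exp_terms <- c("smokerPrevious", "smokerCurrent")

# Wald CI: estimate +/- z * SE on the log-odds scale, then exponentiate.
wald_ci <- exp(confint.default(fit, parm = exp_terms, level = 0.95))

# Profile likelihood CI: confint() dispatches to the glm method.
profile_ci <- exp(suppressMessages(confint(fit, parm = exp_terms, level = 0.95)))

# Wald p-values: one per coefficient, from the summary table.
wald_p <- coef(summary(fit))[exp_terms, "Pr(>|z|)"]

# LRT p-value: one per exposure, from single-term deletion.
lrt_p <- drop1(fit, scope = ~ smoker, test = "Chisq")["smoker", "Pr(>Chi)"]

res <- data.frame(
  term     = exp_terms,
  OR       = unname(exp(coef(fit)[exp_terms])),
  CI_lower = wald_ci[, 1],
  CI_upper = wald_ci[, 2],
  p_wald   = unname(wald_p),
  p_lrt    = rep(lrt_p, length(exp_terms)),  # same value on every level row
  row.names = NULL
)
res$OR_label <- sprintf("%.2f (%.2f-%.2f)", res$OR, res$CI_lower, res$CI_upper)
res
```

The p_lrt column makes the repetition explicit: a factor exposure has a single likelihood-ratio test, so every non-reference level row carries the same value, whereas p_wald differs by level.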
Examples
dt <- ops_toy(scenario = "association", n = 500)
#> ✔ ops_toy: 500 participants | 33 columns | scenario = "association" | seed = 42
res <- assoc_logistic(
data = dt,
outcome_col = "dm_status",
exposure_col = "p20116_i0",
covariates = c("bmi_cat", "tdi_cat"),
base = FALSE
)
#> ℹ outcome_col dm_status: logical detected, converting TRUE/FALSE -> 1/0
#>
#> ── assoc_logistic ──────────────────────────────────────────────────────────────
#> ℹ 1 exposure x 1 model = 1 logistic regression
#> ℹ Input cohort: 500 participants | test: wald | CI method: wald (n/n_cases reflect each model's actual analysis set)
#>
#> ── p20116_i0 ──
#>
#> ✔ Fully adjusted | p20116_i0Previous: OR 0.73 (0.39-1.35), p = 0.313
#> ✔ Fully adjusted | p20116_i0Current: OR 0.70 (0.30-1.64), p = 0.406
#> ✔ Done: 2 result rows across 1 exposure and 1 model.
