Authentication

Connect to the UK Biobank Research Analysis Platform (RAP) and manage project selection.

auth_list_projects()
List available DNAnexus projects
auth_login()
Log in to DNAnexus with a token
auth_logout()
Log out of DNAnexus
auth_select_project()
Select a DNAnexus project
auth_status()
Check current DNAnexus authentication status

Fetch — RAP File System

Explore and retrieve files from RAP project storage.

fetch_ls()
List files and folders at a remote RAP path
fetch_tree()
Print a remote RAP directory tree
fetch_url()
Get pre-authenticated download URL(s) for a remote RAP file or folder
fetch_file()
Download a file from RAP project storage
fetch_metadata()
Download the Showcase metadata folder
fetch_field()
Download the UKB field dictionary file

Extract — Phenotype Data

Extract UKB fields from the RAP dataset into R.

extract_ls()
List all approved fields in the UKB dataset
extract_pheno()
Extract phenotype data from a UKB dataset
extract_batch()
Submit a large-scale phenotype extraction job via table-exporter

Decode — Column Names and Values

Convert raw UKB column names and coded values to human-readable labels.

decode_names()
Rename UKB field ID columns to human-readable snake_case names
decode_values()
Decode UKB categorical column values using Showcase metadata
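
The renaming idea behind decode_names() can be illustrated in base R. This is a minimal sketch, not the package's implementation: the `dict` lookup table, the `to_snake()` helper, and the `p<field_id>` column pattern are assumptions made for illustration only.

```r
# Hypothetical two-row field dictionary (field ID -> Showcase title)
dict <- data.frame(
  field_id = c("31", "21022"),
  title    = c("Sex", "Age at recruitment"),
  stringsAsFactors = FALSE
)

# Illustrative helper: lower-case, replace non-alphanumerics with "_",
# then trim leading/trailing underscores
to_snake <- function(x) {
  x <- tolower(x)
  x <- gsub("[^a-z0-9]+", "_", x)
  gsub("^_+|_+$", "", x)
}

# Toy extract with raw field-ID column names (assumed "p<field_id>" pattern)
raw <- data.frame(eid = 1:2, p31 = c(0, 1), p21022 = c(55, 61))

# Strip the "p" prefix, look each ID up in the dictionary,
# and rename only the columns that have a match
ids <- sub("^p", "", names(raw))
hit <- match(ids, dict$field_id)
names(raw)[!is.na(hit)] <- to_snake(dict$title[hit[!is.na(hit)]])

names(raw)  # "eid" "sex" "age_at_recruitment"
```

Columns without a dictionary match (here `eid`) are left untouched, which keeps the renaming step safe to rerun.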

Derive — Disease Phenotypes

Build case definitions from HES, cancer registry, self-report, and First Occurrence data sources.

derive_selfreport()
Define a self-reported phenotype from UKB touchscreen data
derive_hes()
Derive a binary disease flag from UKB HES inpatient diagnoses
derive_cancer_registry()
Derive a binary disease flag from the UKB cancer registry
derive_death_registry()
Derive a binary disease flag from the UKB death registry
derive_first_occurrence()
Derive a binary disease flag from UKB First Occurrence fields
derive_icd10()
Derive a unified ICD-10 disease flag across multiple UKB data sources
derive_case()
Combine self-report and ICD-10 sources into a unified case definition
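
As a rough illustration of what a unified case definition involves, the sketch below flags participants from ICD-10 prefixes in toy diagnosis records and then combines that flag with a self-report indicator. It uses base R only; the data frames, column names, and the "either source counts" rule are assumptions, not the behaviour of derive_icd10() or derive_case().

```r
# Toy long-format diagnosis records (one row per participant-diagnosis)
diag <- data.frame(
  eid   = c(1, 1, 2, 3),
  icd10 = c("I210", "E119", "I259", "J45"),
  stringsAsFactors = FALSE
)

# Toy cohort with a hypothetical self-report indicator
cohort <- data.frame(eid = 1:4, selfreport = c(FALSE, FALSE, FALSE, TRUE))

# ICD-10 codes I20-I25 (ischaemic heart diseases), matched by prefix
pattern <- "^I2[0-5]"
hit_ids <- unique(diag$eid[grepl(pattern, diag$icd10)])

# Binary flag from coded records
cohort$icd10_flag <- cohort$eid %in% hit_ids

# Assumed combination rule: a case if either source is positive
cohort$case <- cohort$selfreport | cohort$icd10_flag

cohort
```

Keeping the per-source flags alongside the combined `case` column makes it easy to report how many cases each source contributes.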

Derive — Covariates and Timing

Derive continuous covariates, categorical cuts, follow-up time, and event timing variables.

derive_covariate()
Prepare UKB covariates for analysis
derive_cut()
Cut a continuous UKB variable into quantile-based or custom groups
derive_missing()
Handle informative missing labels in UKB decoded data
derive_age()
Compute age at event for one or more UKB outcomes
derive_followup()
Compute follow-up end date and follow-up time for survival analysis
derive_timing()
Classify disease timing relative to UKB baseline assessment
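
Follow-up time for survival analysis is typically measured from baseline to the earliest of the event, death, or an administrative censoring date. The base R sketch below shows that calculation on toy data; the column names, the censoring date, and the 365.25-day year are assumptions and may differ from what derive_followup() does.

```r
# Toy cohort: baseline date plus optional event and death dates
cohort <- data.frame(
  eid      = 1:3,
  baseline = as.Date(c("2008-03-01", "2009-06-15", "2010-01-10")),
  event    = as.Date(c("2015-07-20", NA, NA)),
  death    = as.Date(c(NA, "2018-02-01", NA))
)

# Hypothetical administrative censoring date
admin_censor <- as.Date("2022-10-31")

# Follow-up ends at the earliest available date; missing dates are ignored
cohort$fu_end <- pmin(cohort$event, cohort$death, admin_censor, na.rm = TRUE)

# Event indicator: 1 if the event occurred on or before the end of follow-up
cohort$status <- as.integer(!is.na(cohort$event) & cohort$event <= cohort$fu_end)

# Follow-up time in years
cohort$fu_years <- as.numeric(cohort$fu_end - cohort$baseline) / 365.25

cohort
```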

Jobs — Monitoring and Retrieval

Submit, monitor, and retrieve results from RAP extraction jobs.

job_ls()
List recent DNAnexus jobs in the current project
job_path()
Get the RAP file path of a completed DNAnexus job output
job_result()
Load the result of a completed DNAnexus job into R
job_status()
Check the current state of a DNAnexus job
job_wait()
Wait for a DNAnexus job to finish

Association Analysis

Fit regression models for UKB outcomes with an automatic three-model adjustment framework.

assoc_coxph() assoc_cox()
Cox proportional hazards association analysis
assoc_logistic() assoc_logit()
Logistic regression association analysis
assoc_linear() assoc_lm()
Linear regression association analysis
assoc_coxph_zph() assoc_zph()
Proportional hazards assumption test for Cox regression
assoc_subgroup() assoc_sub()
Subgroup association analysis with optional interaction test
assoc_trend() assoc_tr()
Dose-response trend analysis
assoc_competing() assoc_fg()
Fine-Gray competing risks association analysis
assoc_lag()
Cox regression lag sensitivity analysis
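
A three-model adjustment framework usually means fitting the same exposure-outcome model with progressively larger covariate sets. The sketch below does this with `glm()` on simulated data. The specific sets used here (unadjusted; age and sex; age, sex, and BMI) are an assumption for illustration, and the code does not call or reproduce assoc_logistic().

```r
set.seed(1)
n <- 500

# Simulated data; variable names are hypothetical
d <- data.frame(
  exposure = rnorm(n),
  age      = rnorm(n, 57, 8),
  sex      = rbinom(n, 1, 0.5),
  bmi      = rnorm(n, 27, 4)
)
d$outcome <- rbinom(n, 1, plogis(-1 + 0.3 * d$exposure))

# Three nested adjustment sets (assumed, not taken from the package)
formulas <- list(
  unadjusted = outcome ~ exposure,
  age_sex    = outcome ~ exposure + age + sex,
  full       = outcome ~ exposure + age + sex + bmi
)

# Fit each model and collect the exposure odds ratio with a Wald 95% CI
or_table <- do.call(rbind, lapply(names(formulas), function(nm) {
  fit <- glm(formulas[[nm]], data = d, family = binomial())
  est <- coef(summary(fit))["exposure", ]
  data.frame(
    model = nm,
    or    = exp(est[["Estimate"]]),
    lower = exp(est[["Estimate"]] - 1.96 * est[["Std. Error"]]),
    upper = exp(est[["Estimate"]] + 1.96 * est[["Std. Error"]])
  )
}))

or_table
```

Returning one row per model keeps the output in a shape that can be passed directly to a forest plot or results table.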

GRS — Genetic Risk Scores

End-to-end RAP-native pipeline for computing and validating polygenic risk scores with plink2.

grs_check()
Check and export a GRS weights file
grs_bgen2pgen()
Convert UKB imputed BGEN files to PGEN on RAP
grs_score()
Calculate genetic risk scores from PGEN files on RAP
grs_standardize() grs_zscore()
Standardise GRS columns by Z-score transformation
grs_validate()
Validate GRS predictive performance
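
The Z-score transformation named in grs_standardize() subtracts the mean and divides by the standard deviation, so that effect estimates are expressed per 1 SD of the score. A minimal base R sketch follows; the column name `grs_a`, the `_z` suffix, and the `zscore()` helper are assumptions, not the package's interface.

```r
# Toy scores; values are made up for illustration
scores <- data.frame(eid = 1:5, grs_a = c(0.2, 0.5, 0.1, 0.9, 0.3))

# Z-score: centre on the mean, scale by the sample standard deviation
zscore <- function(x) (x - mean(x, na.rm = TRUE)) / sd(x, na.rm = TRUE)

scores$grs_a_z <- zscore(scores$grs_a)

# The standardised column has mean 0 and SD 1 (up to floating-point error)
mean(scores$grs_a_z)
sd(scores$grs_a_z)
```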

Utilities & Diagnostics

Environment checks, synthetic data generation, missing-value summaries, pipeline snapshots, and cohort management.

ops_setup()
Check the ukbflow operating environment
ops_toy()
Generate toy UKB-like data for testing and development
ops_na()
Summarise missing values by column
ops_snapshot()
Record and review dataset pipeline snapshots
ops_snapshot_cols()
Retrieve column names recorded at a snapshot
ops_snapshot_diff()
Compare column names between two snapshots
ops_snapshot_remove()
Remove raw source columns recorded at a snapshot
ops_set_safe_cols()
Register additional safe columns protected from snapshot-based drops
ops_withdraw()
Exclude withdrawn participants from a dataset

Visualisation

Publication-quality forest plots and Table 1 for manuscripts.

plot_forest()
Publication-ready forest plot
plot_tableone()
Publication-ready Table 1 (Baseline Characteristics)