Scope
4-variable Kidney Failure Risk Equation (KFRE) for 2- and 5-year ESKD risk. Includes NA policies, high-missingness warnings, and verbose summaries. Units: age (years), sex coded 1=male/2=female, eGFR (mL/min/1.73 m^2), UACR (mg/g); no unit conversion is performed.
Clinical context
The KFRE (Tangri et al., 2011, JAMA; validated in 3-continent cohort) estimates 2- and 5-year probability of kidney failure (ESKD) from four routine variables. It is now embedded in the KDIGO 2022 CKD guidelines for risk stratification and nephrology referral decisions.
| CKD Stage | eGFR (mL/min/1.73 m²) | 5-yr KFRE (typical range) |
|---|---|---|
| G3a | 45–59 | < 1% |
| G3b | 30–44 | 1–5% |
| G4 | 15–29 | 5–40% |
| G5 | < 15 | > 40% (often > 70%) |
Outputs KFRE_2yr and KFRE_5yr are
probabilities (0–1); multiply by 100 for percentages. The equation was
developed in adults with CKD G3–G5; interpret cautiously outside this
range.
Load packages and demo data
library(HealthMarkers)
library(tibble)
# Ten participants spanning CKD G3a → G5; mixed sex; UACR A1 → A3
df <- tibble::tibble(
age = c(45, 52, 60, 68, 72, 55, 63, 75, 80, 48),
sex = c(1, 2, 1, 2, 1, 2, 1, 2, 1, 2 ),
eGFR = c(58, 50, 42, 35, 28, 22, 18, 14, 10, 8 ),
UACR = c(25, 80, 200, 500, 800, 1200, 2500, 4500, 8000, 6000)
)Column map (required)
col_map <- list(age = "age", sex = "sex", eGFR = "eGFR", UACR = "UACR")Core calculation
Default na_action = "keep": missing inputs propagate to
NA risks; no extreme checks.
kfre_default <- kidney_failure_risk(
data = df,
col_map = col_map,
na_action = "keep",
verbose = FALSE
)
kfre_default
#> # A tibble: 10 × 2
#> KFRE_2yr KFRE_5yr
#> <dbl> <dbl>
#> 1 0.0989 0.176
#> 2 0.125 0.219
#> 3 0.288 0.467
#> 4 0.326 0.519
#> 5 0.563 0.784
#> 6 0.515 0.739
#> 7 0.820 0.959
#> 8 0.836 0.965
#> 9 0.986 1.000
#> 10 0.922 0.991Extreme values
Extreme values will produce extreme risk estimates. Pre-filter implausible inputs before calling.
# Pre-filter example
df_filtered <- df
df_filtered$eGFR[df_filtered$eGFR > 200] <- NA
df_filtered$UACR[df_filtered$UACR > 10000] <- NA
kfre_filtered <- kidney_failure_risk(
data = df_filtered,
col_map = col_map,
na_action = "warn",
verbose = FALSE
)
kfre_filtered
#> # A tibble: 10 × 2
#> KFRE_2yr KFRE_5yr
#> <dbl> <dbl>
#> 1 0.0989 0.176
#> 2 0.125 0.219
#> 3 0.288 0.467
#> 4 0.326 0.519
#> 5 0.563 0.784
#> 6 0.515 0.739
#> 7 0.820 0.959
#> 8 0.836 0.965
#> 9 0.986 1.000
#> 10 0.922 0.991Missing data handling
keep propagates NA risks; omit drops rows
with missing required inputs; error stops on
missingness.
df_na <- df
df_na$eGFR[2] <- NA
keep_out <- kidney_failure_risk(df_na, col_map, na_action = "keep")
omit_out <- kidney_failure_risk(df_na, col_map, na_action = "omit")
list(keep_rows = nrow(keep_out), omit_rows = nrow(omit_out))
#> $keep_rows
#> [1] 10
#>
#> $omit_rows
#> [1] 9Expectations
- All four predictors must be mapped and present;
sexmust be 1/2. - Inputs must be numeric; non-finite become
NAand may propagate depending onna_action. - Output: tibble with
KFRE_2yr,KFRE_5yr; row count followsna_actionand any row filtering you apply.
Verbose diagnostics
old_opt <- options(healthmarkers.verbose = "inform")
kidney_failure_risk(
data = df,
col_map = col_map,
verbose = TRUE
)
#> kidney_failure_risk(): reading input 'df' — 10 rows × 4 variables
#> kidney_failure_risk(): col_map: age -> 'age', sex -> 'sex', eGFR -> 'eGFR', UACR -> 'UACR'
#> kidney_failure_risk(): computing markers:
#> KFRE_2yr [age, sex, eGFR, UACR -- 2-year kidney failure risk]
#> KFRE_5yr [age, sex, eGFR, UACR -- 5-year kidney failure risk]
#> kidney_failure_risk(): results: KFRE_2yr 10/10, KFRE_5yr 10/10
#> # A tibble: 10 × 2
#> KFRE_2yr KFRE_5yr
#> <dbl> <dbl>
#> 1 0.0989 0.176
#> 2 0.125 0.219
#> 3 0.288 0.467
#> 4 0.326 0.519
#> 5 0.563 0.784
#> 6 0.515 0.739
#> 7 0.820 0.959
#> 8 0.836 0.965
#> 9 0.986 1.000
#> 10 0.922 0.991
options(old_opt)Using real cohort data
The bundled simulated dataset includes eGFR and
UACR columns. Age and sex are also present, so the KFRE can
run directly on sim_small:
library(dplyr)
sim_path <- system.file("extdata", "simulated_hm_data.rds", package = "HealthMarkers")
sim_small <- dplyr::slice_head(readRDS(sim_path), n = 50)
kfre_sim <- kidney_failure_risk(
data = sim_small,
col_map = list(age = "age", sex = "sex", eGFR = "eGFR", UACR = "UACR"),
na_action = "keep"
)
dplyr::bind_cols(dplyr::select(sim_small, age, sex, eGFR, UACR), kfre_sim) |> head()
#> age sex eGFR UACR id KFRE_2yr KFRE_5yr
#> 1 62.6 1 1 5.0 P001 0.4044650 0.6177743
#> 2 38.0 1 1 4.8 P002 0.3661347 0.5708716
#> 3 38.0 1 1 9.5 P003 0.4622163 0.6836865
#> 4 48.3 1 1 26.6 P004 0.6466739 0.8549230
#> 5 81.5 2 1 16.8 P005 0.4736213 0.6960211
#> 6 82.8 2 1 13.2 P006 0.4387596 0.6576086Column recognition tip: For biobank extracts where eGFR or UACR are named differently (e.g.,
eGFR_CKD_EPI,ACR_mg_g), runhm_col_report(your_data)to check what is auto-detected before building yourcol_map.