
Compute four binary risk flags from routine clinical measures:

  • dyslipidemia

  • insulin_resistance

  • hyperglycemia (prediabetes-range glycemia)

  • hypertension (BP at or above roughly the 95th percentile, flagged as z > 1.64)

Usage

metabolic_risk_features(
  data,
  col_map = NULL,
  na_action = c("keep", "omit", "error", "ignore", "warn"),
  na_warn_prop = 0.2,
  check_extreme = FALSE,
  extreme_action = c("warn", "cap", "error", "ignore", "NA"),
  extreme_rules = NULL,
  verbose = FALSE
)

Arguments

data

A data.frame or tibble containing at least these numeric columns:

  • chol_total, chol_ldl, chol_hdl, triglycerides (mmol/L)

  • age_year (years)

  • z_HOMA (standardized HOMA-IR)

  • glucose (mmol/L)

  • HbA1c (mmol/mol; IFCC units)

  • bp_sys_z, bp_dia_z (BP z-scores)
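Because no unit conversion is performed, inputs recorded in conventional units need converting before the call. The following is a minimal sketch of that preparation. The raw column names (glucose_mgdl, insulin_uUml, hba1c_pct, chol_mgdl) are hypothetical. The z_HOMA step assumes a plain within-sample z-score of untransformed HOMA-IR, which may differ from how your z_HOMA was derived.

```r
# Hypothetical raw data in conventional units; these column names are
# illustrative only and are not part of the package.
raw <- data.frame(
  glucose_mgdl = c(90, 110, 126),
  insulin_uUml = c(5, 12, 25),
  hba1c_pct    = c(5.2, 5.7, 6.4),
  chol_mgdl    = c(170, 200, 240)
)

prepared <- data.frame(
  glucose    = raw$glucose_mgdl / 18.016,          # mg/dL -> mmol/L
  HbA1c      = (raw$hba1c_pct - 2.152) / 0.09148,  # NGSP % -> IFCC mmol/mol
  chol_total = raw$chol_mgdl / 38.67               # mg/dL -> mmol/L
)

# HOMA-IR = fasting glucose (mmol/L) * fasting insulin (uU/mL) / 22.5
homa_ir <- prepared$glucose * raw$insulin_uUml / 22.5

# Assumption: within-sample z-score on the raw HOMA-IR scale.
prepared$z_HOMA <- (homa_ir - mean(homa_ir)) / sd(homa_ir)
```

With an external reference population, the mean and SD would come from that reference instead of the sample.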

col_map

Optional named list to map required keys to column names in data. Keys: c("chol_total","chol_ldl","chol_hdl","triglycerides","age_year", "z_HOMA","glucose","HbA1c","bp_sys_z","bp_dia_z"). Default NULL (use same names).
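As an illustration of what such a mapping expresses, the sketch below renames user columns to the required keys in base R. The column names fasting_glc and hba1c_ifcc are hypothetical, and the renaming shown is not necessarily how the function resolves col_map internally.

```r
# Keys are the required names; values are the column names in the user's data.
col_map <- list(glucose = "fasting_glc", HbA1c = "hba1c_ifcc")

raw <- data.frame(fasting_glc = c(5.1, 6.0), hba1c_ifcc = c(36, 41))

# Rename mapped columns to their required keys.
std <- raw
idx <- match(unlist(col_map), names(std))
names(std)[idx] <- names(col_map)

names(std)
```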

na_action

One of c("keep","omit","error","ignore","warn") controlling missing-data policy.

  • "keep": keep NA; outputs become NA where inputs are NA.

  • "omit": drop rows with NA in any required input.

  • "error": abort if any required input contains NA.

  • "ignore"/"warn": aliases of "keep"; "warn" also emits missingness diagnostics.
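The three distinct policies can be sketched in base R as follows. apply_na_policy is a hypothetical helper written for illustration, not a function exported by the package.

```r
apply_na_policy <- function(d, cols, na_action = c("keep", "omit", "error")) {
  na_action <- match.arg(na_action)
  incomplete <- !stats::complete.cases(d[, cols, drop = FALSE])
  if (na_action == "error" && any(incomplete)) {
    stop("missing values in required inputs")
  }
  if (na_action == "omit") {
    d <- d[!incomplete, , drop = FALSE]
  }
  d  # "keep": rows are retained, so NA flows into later comparisons
}

d <- data.frame(glucose = c(5.4, NA, 6.1), HbA1c = c(36, 40, 42))
cols <- c("glucose", "HbA1c")

kept    <- apply_na_policy(d, cols, "keep")   # 3 rows, NA retained
omitted <- apply_na_policy(d, cols, "omit")   # 2 rows
failed  <- inherits(try(apply_na_policy(d, cols, "error"), silent = TRUE),
                    "try-error")
```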

na_warn_prop

Numeric in [0, 1]; per-variable threshold for high-missingness warnings. Default 0.2.
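A per-variable missingness check of this kind might look like the sketch below. It assumes a strict comparison (proportion > threshold); the function's exact comparison is not documented here.

```r
d <- data.frame(
  glucose = c(5.4, NA, 6.1, NA, 5.0),
  HbA1c   = c(36, 40, 42, 38, 41)
)
na_warn_prop <- 0.2

# Proportion of missing values per column.
miss <- colMeans(is.na(d))

# Columns whose missingness exceeds the threshold (assumed strict ">").
high <- names(miss)[miss > na_warn_prop]
if (length(high) > 0) {
  message("High missingness in: ", paste(high, collapse = ", "))
}
```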

check_extreme

Logical; if TRUE, scan inputs for out-of-range values (see extreme_rules). Default FALSE.

extreme_action

One of c("warn","cap","error","ignore","NA") used when extremes are detected (only when check_extreme = TRUE).

  • "warn": warn only (default).

  • "cap": truncate to the allowed range and warn.

  • "error": abort.

  • "ignore": do nothing.

  • "NA": set flagged inputs to NA.

extreme_rules

Optional named list of c(min,max) ranges for required keys. If NULL, broad defaults are used.
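The sketch below shows the shape of such a rule list and what the "cap" and "NA" actions amount to. The bounds are illustrative values, not the package defaults, and handle_extremes is a hypothetical helper.

```r
# Illustrative bounds only; the package's broad defaults are not shown here.
extreme_rules <- list(glucose = c(2, 30), HbA1c = c(15, 200))

d <- data.frame(glucose = c(0.5, 5.4, 45), HbA1c = c(36, 300, 42))

handle_extremes <- function(d, rules, action = c("cap", "NA")) {
  action <- match.arg(action)
  for (key in names(rules)) {
    lo <- rules[[key]][1]
    hi <- rules[[key]][2]
    x <- d[[key]]
    out_of_range <- !is.na(x) & (x < lo | x > hi)
    if (action == "cap") {
      x <- pmin(pmax(x, lo), hi)  # truncate to [lo, hi]
    } else {
      x[out_of_range] <- NA       # blank out flagged values
    }
    d[[key]] <- x
  }
  d
}

capped  <- handle_extremes(d, extreme_rules, "cap")
blanked <- handle_extremes(d, extreme_rules, "NA")
```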

verbose

Logical; if TRUE, prints stepwise messages and a final summary. Default FALSE.

Value

A tibble with four factor columns (levels c("0","1")):

  • dyslipidemia

  • insulin_resistance

  • hyperglycemia

  • hypertension
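Because the flags are factors, converting them to numbers needs care: as.integer() on a factor returns the internal level codes (1 and 2), not 0 and 1. The sketch below rebuilds a result of the documented shape by hand and counts flags per row. The n_flags summary is an illustrative addition, not something the function returns.

```r
# Hand-built stand-in for the returned flags (values from the Examples output).
lv <- c("0", "1")
flags <- data.frame(
  dyslipidemia       = factor(c("0", "1"), levels = lv),
  insulin_resistance = factor(c("0", "0"), levels = lv),
  hyperglycemia      = factor(c("0", "1"), levels = lv),
  hypertension       = factor(c("0", "0"), levels = lv)
)

# Pitfall: level codes, not the 0/1 values.
codes <- as.integer(flags$dyslipidemia)

# Convert via character to recover 0/1, then count flags per row.
num <- as.data.frame(lapply(flags, function(f) as.integer(as.character(f))))
n_flags <- unname(rowSums(num))
```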

Details

By default, behavior matches the prior implementation: required columns are validated, NA values are kept (and propagate to the outputs), no extreme-value checks or capping are applied, and a tibble of 0/1 factor flags is returned.

Units and criteria (no automatic unit conversion):

  • Lipids (mmol/L): total cholesterol > 5.2 OR LDL-C > 3.4 OR HDL-C < 1.0 OR triglycerides > 1.1 (age 0-9) OR > 1.5 (age 10-19) => dyslipidemia = 1. No triglyceride cutoff is stated for ages 20 and above.

  • Insulin resistance: z_HOMA > 1.28 (approximately the 90th percentile) => insulin_resistance = 1. z_HOMA is a within-sample or external z-score of HOMA-IR.

  • Hyperglycemia: fasting glucose in (5.6, 6.9) mmol/L OR HbA1c in (39, 47) mmol/mol => hyperglycemia = 1.

  • Hypertension: systolic or diastolic BP z-score > 1.64 (approximately the 95th percentile) => hypertension = 1.
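The criteria above can be sketched in base R as follows. This is an illustrative re-implementation, not the package's code, and sketch_flags is a hypothetical name. It assumes complete inputs, treats the glucose and HbA1c ranges as inclusive of their endpoints, and applies no triglyceride rule at age 20 or above (an assumption that is consistent with the output shown in Examples).

```r
sketch_flags <- function(d) {
  # Age-dependent triglyceride cutoff; Inf disables the rule for ages >= 20
  # (assumption, see lead-in).
  tg_cut <- ifelse(d$age_year < 10, 1.1,
                   ifelse(d$age_year < 20, 1.5, Inf))

  dyslipidemia <- d$chol_total > 5.2 | d$chol_ldl > 3.4 |
    d$chol_hdl < 1.0 | d$triglycerides > tg_cut
  insulin_resistance <- d$z_HOMA > 1.28
  # Assumption: closed intervals [5.6, 6.9] and [39, 47].
  hyperglycemia <- (d$glucose >= 5.6 & d$glucose <= 6.9) |
    (d$HbA1c >= 39 & d$HbA1c <= 47)
  hypertension <- d$bp_sys_z > 1.64 | d$bp_dia_z > 1.64

  to_flag <- function(x) factor(as.integer(x), levels = c(0, 1))
  data.frame(
    dyslipidemia       = to_flag(dyslipidemia),
    insulin_resistance = to_flag(insulin_resistance),
    hyperglycemia      = to_flag(hyperglycemia),
    hypertension       = to_flag(hypertension)
  )
}

# Same input as the Examples section.
df <- data.frame(
  chol_total = c(5.2, 6.4), chol_ldl = c(3.2, 4.1), chol_hdl = c(1.3, 1.0),
  triglycerides = c(1.8, 2.5), age_year = c(45, 60), z_HOMA = c(0.5, 1.2),
  glucose = c(5.5, 6.8), HbA1c = c(38, 46), bp_sys_z = c(0.2, 1.1),
  bp_dia_z = c(0.1, 0.9)
)
out <- sketch_flags(df)
```

Note that the comparisons are strict for the lipid and z-score rules, so a total cholesterol of exactly 5.2 or an HDL-C of exactly 1.0 does not trigger a flag.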

Examples

# Two adult rows; the triglyceride cutoffs in Details are stated only for ages 0-19.
df <- data.frame(
  chol_total = c(5.2, 6.4), chol_ldl = c(3.2, 4.1), chol_hdl = c(1.3, 1.0),
  triglycerides = c(1.8, 2.5), age_year = c(45, 60), z_HOMA = c(0.5, 1.2),
  glucose = c(5.5, 6.8), HbA1c = c(38, 46), bp_sys_z = c(0.2, 1.1),
  bp_dia_z = c(0.1, 0.9)
)
metabolic_risk_features(df)
#> # A tibble: 2 × 4
#>   dyslipidemia insulin_resistance hyperglycemia hypertension
#>   <fct>        <fct>              <fct>         <fct>       
#> 1 0            0                  0             0           
#> 2 1            0                  1             0