Tutorial of QualityMeasure functions

Kenneth Nieser

This vignette shows how to use the functions in QualityMeasure to analyze quality measure performance and reliability. The first half demonstrates the functions for measures without risk-adjustment; the second half shows how to incorporate risk-adjustment into the analyses.

library(devtools)
#> Loading required package: usethis
load_all()
#> ℹ Loading QualityMeasure
#> Loading required package: ggplot2
#> Loading required package: doParallel
#> Loading required package: foreach
#> Loading required package: iterators
#> Loading required package: parallel
#> Loading required package: lme4
#> Loading required package: Matrix
#> Loading required package: psych
#> 
#> Attaching package: 'psych'
#> The following objects are masked from 'package:ggplot2':
#> 
#>     %+%, alpha
#> Package 'QualityMeasure' version 1.0.0
#> 
#> This package is a work-in-progress. If you have issues or feedback, please email me at nieser@stanford.edu, so I can make this package better!
#> 
#> For more information about this package, see https://github.com/knieser/quality_measure_reliability.
#> 
#> Type 'citation('QualityMeasure')' for citing this R package in publications.
library(QualityMeasure)

Example #1: Measures without risk-adjustment

Simulate data

We’ll simulate some data to use for the example analyses in this section.

set.seed(123)

# number of accountable entities
n.entity = 100  

# average number of patients/cases per accountable entity
n.pts = 50

# parameters of the Beta distribution
alpha = 1
beta = 9

# sample the number of patients/cases per accountable entity
n = rpois(n.entity, n.pts) 
total.n = sum(n)

# sample the true performance for each entity
p = rbeta(n.entity, alpha, beta) 

# make sample data
entity = rep(1:n.entity, times = n)
y = rbinom(total.n, 1, rep(p, times = n))
df1 = data.frame(entity = entity, y = y)
Simulated data 1

 entity   y
      1   0
      1   0
      1   0
      1   0
      1   0
      1   0
      1   0
      1   0
      1   0
      1   0



Provider profiling analyses

The profiling_analysis() function can be used to obtain some basic provider profiling results.

This analysis generates confidence intervals for entity-level performance using a parametric bootstrapping method. The number of bootstraps and the number of cores to use for parallel processing can be adjusted with the controlPerf() function within profiling_analysis() as shown below.

# adjust number of bootstraps and cores for parallel processing.
n.boots = 1000
n.cores = 5

# run profiling analysis for the first dataset without risk-adjustment
profiling.results <- profiling_analysis(df = df1, ctrPerf = controlPerf(n.boots = n.boots, n.cores = n.cores))
perf.results <- profiling.results$perf.results

The estimated marginal probability of the outcome is 0.09. Recall that the true marginal probability according to our data-generation process is 0.1.
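
As a quick sanity check, the raw event rate in the simulated data should be close to this estimate (the package's estimate may be model-based, so the two numbers need not match exactly):

# observed marginal event rate in the simulated data
mean(df1$y)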

Performance summary statistics across entities

Method                 Min.    1st Qu.   Median   Mean    3rd Qu.   Max.
Unadjusted             0       0.021     0.053    0.088   0.129     0.381
OE standardized        0       0.029     0.073    0.121   0.178     0.527
PE standardized        0.023   0.044     0.076    0.121   0.163     0.475
Direct standardized    0.087   0.087     0.087    0.087   0.087     0.087

Number of observations per entity

plotN(perf.results$n)

Unadjusted rates across entities

# Unadjusted performance
plotPerformance(df = perf.results)

Entities are categorized according to whether their rates are lower than, higher than, or no different from the overall, unadjusted average. These categories can be found in the perf.results output from the profiling_analysis() function.

Categorization of entities based on their outcome rates

Category                     Number of entities
Higher than average                          15
Lower than average                           17
No different than average                    68
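
If you want these counts directly, they can be tabulated from the profiling output. This is a minimal sketch that assumes the categories are stored in a column named category; check names(perf.results) for the actual column name.

# tabulate entity categories ('category' is an assumed column name)
table(perf.results$category)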

Plot of ORs comparing each entity with the “average” entity

Another way to assess whether entities have measure performance that differs from an “average” entity is to examine the predicted random intercept values.

Entities highlighted in red have estimated ORs that are statistically different from 1.0 at alpha = 0.05.

plotPerformance(df = perf.results, plot.type = 'OR')

There are 24 entities whose odds of the outcome differ significantly from those of the average entity.
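
For intuition, here is a minimal sketch of how such ORs can be computed directly with lme4; the package's internal computation (e.g., how statistical significance is assessed) may differ.

# fit a random-intercept logistic model
fit <- lme4::glmer(y ~ (1 | entity), data = df1, family = binomial)
# exponentiate the predicted random intercepts to get each entity's
# OR relative to the "average" entity
or <- exp(lme4::ranef(fit)$entity[, 1])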



Beta-binomial-based reliability estimates

You can calculate Beta-Binomial reliability estimates using the calcBetaBin() function. Note that these estimates do not account for any risk-adjustment.

BB.results <- calcBetaBin(df = df1)
#>  Currently, Beta-Binomial reliability estimates do not account for risk-adjustment (even if you specified a model). Updates to this function to account for risk-adjustment are in progress.

The estimated alpha is 0.885 and beta is 9.253. Recall that the true values from the simulation are alpha = 1 and beta = 9.

Reliability summary statistics:

#>    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
#>  0.7849  0.8194  0.8286  0.8298  0.8394  0.8751
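
These summary statistics line up with the standard Beta-Binomial reliability formula, r = n/(n + alpha + beta), for an entity with n cases. A quick check at the average sample size:

# reliability implied by the estimated parameters for an entity with 50 cases
50 / (50 + 0.885 + 9.253)   # approximately 0.831, close to the median above

With the true parameters (alpha = 1, beta = 9), the implied reliability for an entity with 50 cases is 50/60, approximately 0.833.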

If you have data that is already aggregated by entity, you can calculate Beta-Binomial estimates as follows:

# Aggregated data
df.agg <- data.frame(n = aggregate(y ~ entity, data = df1, length)$y,
                     x = aggregate(y ~ entity, data = df1, sum)$y)


BB.agg.results <- calcBetaBin(df = df.agg, df.aggregate = TRUE, n = 'n', x = 'x')
#>  Currently, Beta-Binomial reliability estimates do not account for risk-adjustment (even if you specified a model). Updates to this function to account for risk-adjustment are in progress.
#> Note that aggregated data are being used, so Beta-Binomial reliability estimates with random effects predictions cannot be calculated.

The estimated alpha is 0.885 and beta is 9.253. These estimates match our results from analyzing the data before aggregation.



Calculate reliability from all methods

If you would like to calculate reliability estimates from all methods, you can use the calcReliability() function.

The number of resamples used to calculate the permutation split-sample reliability estimate can be adjusted using the controlRel() function within calcReliability() as shown below.

# number of resamples to use for the permutation split-sample reliability estimate
n.resamples = 200

rel.out <- calcReliability(df = df1, ctrPerf = controlPerf(n.cores = n.cores), ctrRel = controlRel(n.resamples = n.resamples))
#> calculating reliability based on split-sample method...
#> ...done
#> calculating reliability based on hierarchical logistic regression model...
#> ...done
#> calculating reliability based on Beta-Binomial method...
#>  Currently, Beta-Binomial reliability estimates do not account for risk-adjustment (even if you specified a model). Updates to this function to account for risk-adjustment are in progress.
#> ...done
Reliability estimates

Method                          Reliability   Min Reliability   Max Reliability
Permutation split-sample              0.816                NA                NA
Hierarchical logit regression         0.792             0.742             0.847
Beta-Binomial                         0.829             0.785             0.875

plotReliability(rel.out)
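
For intuition, here is a minimal sketch of a single split-sample replicate: randomly split each entity's cases in half, correlate the two sets of half-sample rates across entities, and apply the Spearman-Brown correction back to the full sample size. The package averages over n.resamples such splits, and its implementation details may differ.

# randomly assign half of each entity's cases to one split
half <- unlist(tapply(seq_len(nrow(df1)), df1$entity,
                      function(i) sample(i, floor(length(i) / 2))))
# half-sample rates per entity
r1 <- tapply(df1$y[half], df1$entity[half], mean)
r2 <- tapply(df1$y[-half], df1$entity[-half], mean)
# Spearman-Brown correction of the split-half correlation
rho <- cor(r1, r2)
2 * rho / (1 + rho)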






Example #2: Measures with risk-adjustment

In this next part of the tutorial, we will work with an example measure where risk-adjustment is required.

Simulate data

We can use the built-in function simulateData() to simulate data from a hierarchical logistic regression model with covariates for risk-adjustment. The simulated data will include a continuous covariate x1, which is sampled from a standard Normal distribution.

# number of accountable entities
n.entity = 100  

# average number of patients/cases per accountable entity
avg.n = 50

# marginal probability of the outcome
marg.p = .1 

# reliability for entity with an average number of patients
r = .6

# implied between-entity variance on the log-odds scale, obtained by
# solving r = var.btwn / (var.btwn + 1/(avg.n * marg.p * (1 - marg.p)))
var.btwn = r/(1 - r) * (1/(avg.n*marg.p*(1 - marg.p)))

mu = log(marg.p / (1 - marg.p))
tau = c(mu, sqrt(var.btwn))

# parameter for the risk-adjustment model (log odds ratio for x1)
theta = log(1.5)

df2 <- simulateData(n.entity = n.entity, avg.n = avg.n, tau = tau, theta = theta)
Simulated data with a covariate

 entity          z          x1         lp          p   y
      1  -2.425155  -2.0466725  -3.255009  0.0371473   0
      1  -2.425155   1.0439689  -2.001862  0.1190076   0
      1  -2.425155  -0.6431986  -2.685949  0.0638076   0
      1  -2.425155  -0.1887073  -2.501669  0.0757413   0
      1  -2.425155   0.3288045  -2.291836  0.0918014   0
      1  -2.425155   0.0502142  -2.404795  0.0828078   0
      1  -2.425155   0.4775648  -2.231519  0.0969556   0
      1  -2.425155  -0.2668592  -2.533357  0.0735526   0
      1  -2.425155  -1.9507917  -3.216133  0.0385631   0
      1  -2.425155   0.4797582  -2.230630  0.0970335   0



Fit risk-adjustment model to the data

To incorporate risk-adjustment, we first specify the model to use:

model = 'y ~ x1 + (1 | entity)'

This lme4-style formula adjusts for x1 and includes a random intercept for each entity.

Estimates of model parameters

model.perf <- model_performance(df = df2, model = model)
plotEstimates(model.perf$model.results)

The estimated value of the regression coefficient, after exponentiating, is 1.47, with a 95% confidence interval from 1.33 to 1.62. Recall that the true value from the simulation is 1.5.

The estimated between-entity variance is 0.396. Recall that the true value from the simulation is 0.333 (from the var.btwn formula above: 0.6/0.4 × 1/(50 × 0.1 × 0.9)).

Discrimination

The estimated c-statistic is 0.73. Below is a plot of the distributions of predicted probabilities, shown separately for observations with and without the outcome.

plotPredictedDistribution(df = model.perf$df)
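
The c-statistic is the probability that a randomly chosen observation with the outcome receives a higher predicted probability than a randomly chosen observation without it. A minimal sketch of the rank-based estimate, assuming the predicted probabilities are stored in a column here called p.hat (an assumed name; check names(model.perf$df)):

d <- model.perf$df
p1 <- d$p.hat[d$y == 1]   # predicted probabilities for events
p0 <- d$p.hat[d$y == 0]   # predicted probabilities for non-events
mean(outer(p1, p0, ">")) + 0.5 * mean(outer(p1, p0, "=="))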

Calibration

Below is a plot of the observed rate of the outcome for quantiles of the predicted probabilities from the model. The number of quantiles can be specified through the quantiles argument of the plotCalibration() function.

plotCalibration(df = model.perf$df, quantiles = 10)
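
Conceptually, each plotted point is the observed event rate within one quantile bin of predicted risk. A minimal sketch with deciles, under the same assumed column name as above:

d <- model.perf$df
bins <- cut(d$p.hat, quantile(d$p.hat, probs = 0:10 / 10), include.lowest = TRUE)
tapply(d$y, bins, mean)   # observed rate within each decile of predicted risk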



Provider profiling analyses

# run profiling analysis for the second dataset with risk-adjustment
profiling.results2 <- profiling_analysis(df = df2, model = model, ctrPerf = controlPerf(n.boots = n.boots, n.cores = n.cores))
perf.results2 <- profiling.results2$perf.results

The estimated marginal probability of the outcome is 0.1. Recall that the true value of the marginal probability from the simulation is 0.1.

Performance summary statistics across entities

Method                 Min.    1st Qu.   Median   Mean    3rd Qu.   Max.
Unadjusted             0       0.053     0.088    0.106   0.147     0.442
OE standardized        0       0.058     0.095    0.114   0.152     0.477
PE standardized        0.046   0.076     0.098    0.114   0.135     0.389
Direct standardized    0.105   0.105     0.105    0.105   0.105     0.105
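
As a reminder of what OE standardization does: each entity's observed event count is divided by its model-expected count, and the resulting ratio is scaled by the overall rate. A minimal sketch under the same assumed p.hat column name; whether the expected counts use predictions with or without each entity's random effect is a package implementation detail that may differ from this sketch.

d <- model.perf$df
obs <- tapply(d$y, d$entity, sum)        # observed events per entity
expd <- tapply(d$p.hat, d$entity, sum)   # expected events under the risk model
oe.rate <- obs / expd * mean(d$y)        # OE-standardized rates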

Number of observations per entity

plotN(perf.results2$n)

Unadjusted rates across entities

plotPerformance(df = perf.results2)

OE-risk-standardized rates across entities

plotPerformance(df = perf.results2, plot.y = 'oe')

Entities are also categorized according to whether their risk-standardized rates are lower than, higher than, or no different from the overall, unadjusted average; these categories can be found in the perf.results output from the profiling_analysis() function.

Categorization of entities based on OE-risk-standardized rates

Category                     Number of entities
Higher than average                           4
No different than average                    96

PE-risk-standardized rates across entities

plotPerformance(df = perf.results2, plot.y = 'pe')

Categorization of entities based on PE-risk-standardized rates

Category                     Number of entities
Higher than average                           4
No different than average                    96

Plot of ORs comparing each entity with the “average” entity

Another way to assess whether entities have measure performance that differs from an “average” entity is to examine the predicted random intercept values.

Entities highlighted in red have estimated ORs that are statistically different from 1.0 at alpha = 0.05.

plotPerformance(df = perf.results2, plot.type = 'OR')

There are 7 entities whose adjusted odds of the outcome differ significantly from those of the average entity.



Calculate reliability from all methods

Again, to calculate reliability estimates from all methods, we use the calcReliability() function. This time we specify the risk-adjustment model.

rel.out2 <- calcReliability(df = df2, model = model, ctrPerf = controlPerf(n.cores = n.cores), ctrRel = controlRel(n.resamples = n.resamples))
#> calculating reliability based on split-sample method...
#> ...done
#> calculating reliability based on hierarchical logistic regression model...
#> ...done
#> calculating reliability based on Beta-Binomial method...
#>  Currently, Beta-Binomial reliability estimates do not account for risk-adjustment (even if you specified a model). Updates to this function to account for risk-adjustment are in progress.
#> ...done
Reliability estimates

Method                          Reliability   Min Reliability   Max Reliability
Permutation split-sample              0.680                NA                NA
Hierarchical logit regression         0.630             0.532             0.709
Beta-Binomial                         0.644             0.556             0.724

plotReliability(rel.out2)