Heterogeneity & Demographic Analysis
2024-02-09
Source: vignettes/g_heterogeneity.Rmd
Introduction
Heterogeneity analysis is a way to explore how the results of a model can vary depending on the characteristics of individuals in a population, and demographic analysis estimates the average values of a model over an entire population.
In practice these two analyses naturally complement each other: heterogeneity analysis runs the model on multiple sets of parameters (reflecting different characteristics found in the target population), and demographic analysis combines the results.
For this example we will use the result from the assessment of a new total hip replacement previously described in vignette("d-non-homogeneous", "heemod").
Population characteristics
The characteristics of the population are input from a table, with one column per parameter and one row per individual. These may be, for example, the characteristics of the individuals included in the original trial data.
For this example we will use the characteristics of 100 individuals, with varying sex and age, specified in the data frame tab_indiv:
tab_indiv
## # A tibble: 100 × 2
## age sex
## <dbl> <int>
## 1 56 1
## 2 61 0
## 3 47 1
## 4 52 0
## 5 42 0
## 6 71 1
## 7 61 1
## 8 70 1
## 9 60 1
## 10 50 1
## # ℹ 90 more rows
library(ggplot2)
ggplot(tab_indiv, aes(x = age)) +
  geom_histogram(binwidth = 2)
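The code that builds tab_indiv is not shown in this vignette. As a minimal sketch, assuming simulated data, a table of the same shape (one row per individual, one column per parameter) could be created as follows. The name tab_indiv_sketch, the seed and the age distribution are illustrative assumptions, not the values used above.

```r
# Hypothetical stand-in for tab_indiv: 100 simulated individuals.
set.seed(1)
n_indiv <- 100

tab_indiv_sketch <- data.frame(
  # Ages drawn from an assumed normal distribution, rounded to whole years.
  age = round(rnorm(n_indiv, mean = 60, sd = 10)),
  # Sex coded as 0/1, as in the table printed above.
  sex = sample(0:1, n_indiv, replace = TRUE)
)

head(tab_indiv_sketch)
```

Each column name is assumed to match the name of a parameter defined in the model.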
Running the analysis
res_mod, the result we obtained from run_model() in the Time-varying Markov models vignette, can be passed to update() to update the model with the new data and perform the heterogeneity analysis.
res_h <- update(res_mod, newdata = tab_indiv)
## No weights specified in update, using equal weights.
## Updating strategy 'standard'...
## Updating strategy 'np1'...
Interpreting results
The summary() method reports summary statistics for cost, effect and ICER, as well as the result from the combined model.
summary(res_h)
## An analysis re-run on 100 parameter sets.
##
## * Unweighted analysis.
##
## * Values distribution:
##
## Min. 1st Qu. Median Mean
## standard - Cost 543.46225608 605.0062810 626.3537753 683.4291651
## standard - Effect 10.06345874 23.3226486 27.7806580 25.8050372
## standard - Cost Diff. - - - -
## standard - Effect Diff. - - - -
## standard - Icer - - - -
## np1 - Cost 618.86571941 635.5509751 641.3547975 657.9045025
## np1 - Effect 10.13073146 23.4706053 27.9754765 26.0569701
## np1 - Cost Diff. -165.40882382 -99.5031416 15.0010223 -25.5246626
## np1 - Effect Diff. 0.06727271 0.1756522 0.2162479 0.2519328
## np1 - Icer -354.56585682 -304.0330575 65.6679900 7.0276560
## 3rd Qu. Max.
## standard - Cost 786.6690449 878.7813785
## standard - Effect 29.0596426 31.5292548
## standard - Cost Diff. - -
## standard - Effect Diff. - -
## standard - Icer - -
## np1 - Cost 687.1659033 713.3725547
## np1 - Effect 29.2683350 31.7651919
## np1 - Cost Diff. 30.5446941 75.4034633
## np1 - Effect Diff. 0.3272774 0.4665109
## np1 - Icer 156.7853582 956.9156706
##
## * Combined result:
##
## 2 strategies run for 60 cycles.
##
## Initial state counts:
##
## PrimaryTHR = 1000L
## SuccessP = 0L
## RevisionTHR = 0L
## SuccessR = 0L
## Death = 0L
##
## Counting method: 'beginning'.
##
## Values:
##
## utility cost
## standard 25805.04 683429.2
## np1 26056.97 657904.5
##
## Efficiency frontier:
##
## np1
##
## Differences:
##
## Cost Diff. Effect Diff. ICER Ref.
## np1 -25.52466 0.2519328 -101.3153 standard
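The relationship between the two parts of this output can be checked by hand. The sketch below uses only numbers copied from the summary; the variable names are illustrative. It suggests that the combined ICER is the ratio of the mean cost difference to the mean effect difference, rather than the mean of the individual ICERs (about 7.03 in the values distribution), and that the combined values are the per-individual means scaled to the 1000 initial individuals.

```r
# Means copied from the "Values distribution" table above.
mean_cost_standard <- 683.4291651
mean_cost_diff     <- -25.5246626  # np1 vs. standard
mean_effect_diff   <- 0.2519328    # np1 vs. standard
n_initial          <- 1000         # PrimaryTHR initial state count

# Ratio of the mean differences: compare with the ICER under "Differences".
icer_combined <- mean_cost_diff / mean_effect_diff
icer_combined

# Mean cost scaled to the initial population: compare with the cost of
# 'standard' under "Values".
total_cost_standard <- mean_cost_standard * n_initial
total_cost_standard
```

Because a ratio of means and a mean of ratios generally differ, the mean ICER across individuals should not be read as the population ICER.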
The variation of cost or effect can then be plotted.
plot(res_h, result = "effect", binwidth = 5)
plot(res_h, result = "cost", binwidth = 50)
plot(res_h, result = "icer", type = "difference",
     binwidth = 500)
plot(res_h, result = "effect", type = "difference",
     binwidth = .1)
plot(res_h, result = "cost", type = "difference",
     binwidth = 30)
The results from the combined model can be plotted similarly to the results from run_model().
plot(res_h, type = "counts")
Weighted results
Weights can be used in the analysis by including an optional column .weights in the new data to specify the respective weight of each stratum in the target population.
tab_indiv_w
## # A tibble: 100 × 3
## age sex .weights
## <dbl> <int> <dbl>
## 1 45 0 0.0590
## 2 59 1 0.228
## 3 61 1 0.850
## 4 46 0 0.844
## 5 67 0 0.952
## 6 66 1 0.480
## 7 53 0 0.245
## 8 76 1 0.659
## 9 66 0 0.0165
## 10 55 1 0.165
## # ℹ 90 more rows
res_w <- update(res_mod, newdata = tab_indiv_w)
## Updating strategy 'standard'...
## Updating strategy 'np1'...
res_w
## An analysis re-run on 100 parameter sets.
##
## * Weights distribution:
##
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.00389 0.22690 0.45408 0.46025 0.67205 0.99822
##
## Total weight: 46.02452
##
## * Values distribution:
##
## Min. 1st Qu. Median Mean
## standard - Cost 438.70535048 613.9316623 629.415647 691.1999825
## standard - Effect 6.12465030 24.4991251 27.780658 26.0184616
## standard - Cost Diff. - - - -
## standard - Effect Diff. - - - -
## standard - Icer - - - -
## np1 - Cost 590.76054210 637.9767000 642.187795 660.1213847
## np1 - Effect 6.13624942 24.8264025 27.975477 26.2797727
## np1 - Cost Diff. -165.40882382 -122.7948420 12.772148 -31.0785979
## np1 - Effect Diff. 0.01159912 0.1959412 0.220806 0.2613111
## np1 - Icer -354.56585682 -327.6476693 54.727146 212.7581144
## 3rd Qu. Max.
## standard - Cost 819.1977737 8.787814e+02
## standard - Effect 29.1382106 3.152925e+01
## standard - Cost Diff. - -
## standard - Effect Diff. - -
## standard - Icer - -
## np1 - Cost 696.4029317 7.133726e+02
## np1 - Effect 29.3758145 3.176519e+01
## np1 - Cost Diff. 24.0450377 1.520552e+02
## np1 - Effect Diff. 0.3747771 4.665109e-01
## np1 - Icer 115.2176112 1.310920e+04
##
## * Combined result:
##
## 2 strategies run for 60 cycles.
##
## Initial state counts:
##
## PrimaryTHR = 1000L
## SuccessP = 0L
## RevisionTHR = 0L
## SuccessR = 0L
## Death = 0L
##
## Counting method: 'beginning'.
##
## Values:
##
## utility cost
## standard 26018.46 691200.0
## np1 26279.77 660121.4
##
## Efficiency frontier:
##
## np1
##
## Differences:
##
## Cost Diff. Effect Diff. ICER Ref.
## np1 -31.0786 0.2613111 -118.9333 standard
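To illustrate how weights enter such a table, the sketch below adds a .weights column to a small hypothetical table of strata and compares a plain mean with a weighted mean on made-up costs. The table tab_strata, its values and toy_cost are assumptions for illustration. Only the column name .weights comes from the text above, and the exact way heemod combines weighted results is not shown here.

```r
# Hypothetical strata: one row per stratum, one column per parameter.
tab_strata <- data.frame(
  age = c(50, 60, 70),
  sex = c(0L, 1L, 1L)
)

# Assumed share of each stratum in the target population (made-up values).
tab_strata$.weights <- c(0.5, 0.3, 0.2)

# Made-up per-stratum costs, used only to show the effect of weighting.
toy_cost <- c(600, 650, 800)

unweighted_cost <- mean(toy_cost)
weighted_cost   <- weighted.mean(toy_cost, w = tab_strata$.weights)

unweighted_cost
weighted_cost
```

In this toy case the weighted mean is lower than the plain mean because the cheapest stratum carries the largest weight.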
Parallel computing
Updating can be significantly sped up by using parallel computing. This can be done in the following way:
- Define a cluster with the use_cluster() function (e.g. use_cluster(4) to use 4 cores).
- Run the analysis as usual.
- To stop using parallel computing use the close_cluster() function.
Results may vary depending on the machine, but we found speed gains to be quite limited beyond 4 cores.
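The steps above can be sketched as follows. This is a minimal example that assumes heemod is installed and that res_mod and tab_indiv from the earlier sections exist in the workspace. The names n_cores, can_run and res_h_par are illustrative, and the guard only skips the run when those assumptions do not hold.

```r
# Number of cores to use (4, as in the example above).
n_cores <- 4

# Only run if heemod and the objects from the earlier sections are available.
can_run <- requireNamespace("heemod", quietly = TRUE) &&
  exists("res_mod") && exists("tab_indiv")

if (can_run) {
  # 1. Define the cluster.
  heemod::use_cluster(n_cores)

  # 2. Run the analysis as usual.
  res_h_par <- update(res_mod, newdata = tab_indiv)

  # 3. Stop using parallel computing.
  heemod::close_cluster()
}
```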