Title: | Matching and Weighting Multiply Imputed Datasets |
---|---|
Description: | Provides essential tools for the pre-processing techniques of matching and weighting multiply imputed datasets. The package includes functions for matching within and across multiply imputed datasets using various methods, estimating weights for units in the imputed datasets using multiple weighting methods, calculating causal effect estimates in each matched or weighted dataset using parametric or non-parametric statistical models, and pooling the resulting estimates according to Rubin's rules (please see <https://journal.r-project.org/archive/2021/RJ-2021-073/> for more details). |
Authors: | Farhad Pishgar [aut, cre], Noah Greifer [aut], Clémence Leyrat [ctb], Elizabeth Stuart [ctb] |
Maintainer: | Farhad Pishgar <[email protected]> |
License: | GPL (>= 2) |
Version: | 1.2.2 |
Built: | 2024-10-24 17:06:20 UTC |
Source: | https://github.com/farhadpishgar/matchthem |
mimids object
Creates a mimids object from a list of matchit objects and an imputed dataset.
as.mimids(x, ...)
## Default S3 method:
as.mimids(x, datasets, ...)
x | A list of |
... | Ignored. |
datasets | This argument specifies the datasets containing the exposure and the potential confounders called in the |
The matched datasets are stored as though matchthem() was called with approach = "within".
A mimids object.
matchthem(), mimids, MatchIt::matchit()
#Loading the dataset
data(osteoarthritis)
#Multiply imputing the missing values
imputed.datasets <- mice::mice(osteoarthritis, m = 5, printFlag = FALSE)
#Matching the multiply imputed datasets manually
match.list <- lapply(1:5, function(i) {
  MatchIt::matchit(OSP ~ AGE + SEX + BMI + RAC + SMK,
                   mice::complete(imputed.datasets, i),
                   method = 'nearest')
})
#Creating mimids object
matched.datasets <- as.mimids(match.list, imputed.datasets)
wimids object
Creates a wimids object from a list of weightit objects and an imputed dataset.
as.wimids(x, ...)
## Default S3 method:
as.wimids(x, datasets, ...)
x | A list of |
... | Ignored. |
datasets | The datasets containing the exposure and covariates mentioned in the |
The weighted datasets are stored as though weightthem() was called with approach = "within".
A wimids object.
weightthem(), wimids, WeightIt::weightit()
#Loading the dataset
data(osteoarthritis)
#Multiply imputing the missing values
imputed.datasets <- mice::mice(osteoarthritis, m = 5, printFlag = FALSE)
#Weighting the multiply imputed datasets manually
weight.list <- lapply(1:5, function(i) {
  WeightIt::weightit(OSP ~ AGE + SEX + BMI + RAC + SMK,
                     mice::complete(imputed.datasets, i),
                     method = 'glm',
                     estimand = 'ATT')
})
#Creating wimids object
weighted.datasets <- as.wimids(weight.list, imputed.datasets)
mimids and wimids Objects by Columns
This function combines a mimids or wimids object columnwise with additional datasets or variables. Typically these would be variables not included in the original multiple imputation and therefore absent in the mimids or wimids object. with() can then be used on the output to run models with the added variables.
cbind(..., deparse.level = 1)
## S3 method for class 'mimids'
cbind(..., deparse.level = 1)
## S3 method for class 'wimids'
cbind(..., deparse.level = 1)
... | Objects to combine columnwise. The first argument should be a |
deparse.level | Ignored. |
An object with the same class as the first input object with the additional datasets or variables added to the components.
Farhad Pishgar and Noah Greifer
#Loading libraries
library(survey)
#Loading the dataset
data(osteoarthritis)
#Multiply imputing the missing values
imputed.datasets <- mice::mice(osteoarthritis, m = 5)
#Weighting the multiply imputed datasets
weighted.datasets <- weightthem(OSP ~ AGE + SEX + BMI + RAC + SMK,
                                imputed.datasets,
                                approach = 'within')
#Adding additional variables
weighted.datasets <- cbind(weighted.datasets,
                           logAGE = log(osteoarthritis$AGE))
#Using the additional variables in an analysis
models <- with(weighted.datasets,
               svyglm(KOA ~ OSP + logAGE, family = quasibinomial))
#Pooling results obtained from analyzing the datasets
results <- pool(models)
summary(results)
complete() extracts data from an object of the mimids or wimids class.
## S3 method for class 'mimids'
complete(data, action = 1, include = FALSE, mild = FALSE, all = TRUE, ...)
## S3 method for class 'wimids'
complete(data, action = 1, include = FALSE, mild = FALSE, all = TRUE, ...)
data | A |
action | The number of the imputed dataset to extract, or an action keyword. The input must be a positive integer or a keyword. The keywords include |
include | Whether the original data with the missing values should be included. The input must be a logical value. The default is |
mild | Whether the return value should be an object of |
all | Whether to include observations with a zero estimated weight. The default is |
... | Ignored. |
complete() works by running mice::complete() on the mids object stored within the mimids or wimids object and appending the outputs of the matching or weighting procedure. For mimids objects, the appended outputs include the matching weights, the propensity score (if included), pair membership (if included), and whether each unit was discarded. For wimids objects, the appended output is the estimated weights.
This function returns the imputed dataset within the supplied mimids or wimids objects.
Stef van Buuren and Karin Groothuis-Oudshoorn (2011). mice: Multivariate Imputation by Chained Equations in R. Journal of Statistical Software, 45(3): 1-67. doi:10.18637/jss.v045.i03
#Loading the dataset
data(osteoarthritis)
#Multiply imputing the missing values
imputed.datasets <- mice::mice(osteoarthritis, m = 5)
#Matching the multiply imputed datasets
matched.datasets <- matchthem(OSP ~ AGE + SEX + BMI + RAC + SMK,
                              imputed.datasets,
                              approach = 'within',
                              method = 'nearest')
#Extracting the first imputed dataset
matched.dataset.1 <- complete(matched.datasets, action = 1)
mimids Class
is.mimids() checks whether an object is of the mimids class.
is.mimids(object)
object | This argument specifies the object that should be checked to see if it is of the |
The class of the object is checked to be mimids.
This function returns a logical value indicating whether object is of the mimids class.
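As a rough illustration of what such a class check amounts to for S3 objects, the sketch below uses base R's inherits(). The helper name is_mimids_sketch and the toy object are hypothetical and are not part of MatchThem; the package's own is.mimids() may be implemented differently.

```r
# Hypothetical helper (not MatchThem's code): TRUE if 'object' carries the
# S3 class "mimids", FALSE otherwise.
is_mimids_sketch <- function(object) {
  inherits(object, "mimids")
}

# A toy list tagged with the class, standing in for real matchthem() output
toy <- structure(list(models = list(), object = NULL), class = "mimids")

# Check the toy object and an unrelated object
toy_is_mimids <- is_mimids_sketch(toy)
df_is_mimids <- is_mimids_sketch(data.frame())
```

The same pattern would apply to the mimipo, mimira, and wimids classes by changing the class name passed to inherits().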
Farhad Pishgar
#Loading the dataset
data(osteoarthritis)
#Multiply imputing the missing values
imputed.datasets <- mice::mice(osteoarthritis, m = 5)
#Matching the multiply imputed datasets
matched.datasets <- matchthem(OSP ~ AGE + SEX + BMI + RAC + SMK,
                              imputed.datasets,
                              approach = 'within',
                              method = 'nearest')
#Checking the 'matched.datasets' object
is.mimids(matched.datasets)
mimipo Class
is.mimipo() checks whether an object is of the mimipo class.
is.mimipo(object)
object | This argument specifies the object that should be checked to see if it is of the |
The class of the object is checked to be mimipo.
This function returns a logical value indicating whether object is of the mimipo class.
Farhad Pishgar
#Loading libraries
library(survey)
#Loading the dataset
data(osteoarthritis)
#Multiply imputing the missing values
imputed.datasets <- mice::mice(osteoarthritis, m = 5)
#Estimating weights of observations in the multiply imputed datasets
weighted.datasets <- weightthem(OSP ~ AGE + SEX + BMI + RAC + SMK,
                                imputed.datasets,
                                approach = 'within',
                                method = 'glm',
                                estimand = "ATT")
#Analyzing the weighted datasets
models <- with(data = weighted.datasets,
               expr = svyglm(KOA ~ OSP, family = binomial))
#Pooling results obtained from analyzing the datasets
results <- pool(models)
#Checking the 'results' object
is.mimipo(results)
mimira Class
is.mimira() checks whether an object is of the mimira class.
is.mimira(object)
object | This argument specifies the object that should be checked to see if it is of the |
The class of the object is checked to be mimira.
This function returns a logical value indicating whether object is of the mimira class.
Farhad Pishgar
#Loading libraries
library(survey)
#Loading the dataset
data(osteoarthritis)
#Multiply imputing the missing values
imputed.datasets <- mice::mice(osteoarthritis, m = 5)
#Estimating weights of observations in the multiply imputed datasets
weighted.datasets <- weightthem(OSP ~ AGE + SEX + BMI + RAC + SMK,
                                imputed.datasets,
                                approach = 'within',
                                method = 'glm',
                                estimand = "ATT")
#Analyzing the weighted datasets
models <- with(weighted.datasets,
               svyglm(KOA ~ OSP, family = binomial))
#Checking the 'models' object
is.mimira(models)
wimids Class
is.wimids() checks whether an object is of the wimids class.
is.wimids(object)
object | This argument specifies the object that should be checked to see if it is of the |
The class of the object is checked to be wimids.
This function returns a logical value indicating whether object is of the wimids class.
Farhad Pishgar
#Loading the dataset
data(osteoarthritis)
#Multiply imputing the missing values
imputed.datasets <- mice::mice(osteoarthritis, m = 5)
#Estimating weights of observations in the multiply imputed datasets
weighted.datasets <- weightthem(OSP ~ AGE + SEX + BMI + RAC + SMK,
                                imputed.datasets,
                                approach = 'within',
                                method = 'glm',
                                estimand = "ATT")
#Checking the 'weighted.datasets' object
is.wimids(weighted.datasets)
matchthem() performs matching in the supplied multiply imputed datasets, given as mids or amelia objects, by running MatchIt::matchit() on each of the multiply imputed datasets with the supplied arguments.
matchthem(
  formula,
  datasets,
  approach = "within",
  method = "nearest",
  distance = "glm",
  link = "logit",
  distance.options = list(),
  discard = "none",
  reestimate = FALSE,
  ...
)
formula | A |
datasets | This argument specifies the datasets containing the exposure and the potential confounders called in the |
approach | The approach that should be used to combine information in multiply imputed datasets. Currently, |
method | This argument specifies a matching method. Currently, |
distance | The method used to estimate the distance measure (e.g., propensity scores) used in matching, if any. Only options that specify a method of estimating propensity scores (i.e., not |
link, distance.options, discard, reestimate | Arguments passed to |
... | Additional arguments passed to |
If an amelia object is supplied to datasets, it will be transformed into a mids object for further use. matchthem() works by calling mice::complete() on the mids object to extract each complete dataset, and then calls MatchIt::matchit() on each one, storing the output of each matchit() call and the mids object in the output. All arguments supplied to matchthem() except datasets and approach are passed directly to matchit(). With the "across" approach, the estimated propensity scores are averaged across multiply imputed datasets and re-supplied to another set of calls to matchit().
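The averaging step of the "across" approach can be sketched in base R as follows. This is only an illustration under stated assumptions: the toy data, the random-draw fill-in that stands in for real multiple imputation, and all object names are hypothetical, and the code is not how matchthem() is implemented internally.

```r
set.seed(1)
n <- 200  # number of units
m <- 5    # number of completed datasets

# Toy data: binary exposure z and a covariate x with some missing values
x <- rnorm(n)
z <- rbinom(n, 1, plogis(0.5 * x))
x[sample(n, 30)] <- NA

# Stand-in for multiple imputation (an assumption for illustration only):
# fill each missing x with a random draw from the observed values
completed <- lapply(seq_len(m), function(i) {
  xi <- x
  miss <- is.na(xi)
  xi[miss] <- sample(x[!miss], sum(miss), replace = TRUE)
  data.frame(z = z, x = xi)
})

# One propensity score vector per completed dataset (n x m matrix)
ps_within <- sapply(completed, function(d) {
  fitted(glm(z ~ x, data = d, family = binomial))
})

# Average each unit's propensity score over the m completed datasets
ps_across <- rowMeans(ps_within)
```

In the "across" approach described above, averaged scores like ps_across would then be re-supplied to the matching step in every dataset, whereas the "within" approach would use each column of ps_within separately.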
An object of the mimids (matched multiply imputed datasets) class, which includes the supplied mids object (or an amelia object transformed into a mids object if supplied) and the output of the calls to matchit() on each multiply imputed dataset.
Farhad Pishgar and Noah Greifer
Daniel Ho, Kosuke Imai, Gary King, and Elizabeth Stuart (2007). Matching as Nonparametric Preprocessing for Reducing Model Dependence in Parametric Causal Inference. Political Analysis, 15(3): 199-236. https://gking.harvard.edu/files/abs/matchp-abs.shtml
Stef van Buuren and Karin Groothuis-Oudshoorn (2011). mice: Multivariate Imputation by Chained Equations in R. Journal of Statistical Software, 45(3): 1-67. doi:10.18637/jss.v045.i03
Gary King, James Honaker, Anne Joseph, and Kenneth Scheve (2001). Analyzing Incomplete Political Science Data: An Alternative Algorithm for Multiple Imputation. American Political Science Review, 95: 49–69. https://gking.harvard.edu/files/abs/evil-abs.shtml
#1
#Loading the dataset
data(osteoarthritis)
#Multiply imputing the missing values
imputed.datasets <- mice::mice(osteoarthritis, m = 5)
#Matching the multiply imputed datasets
matched.datasets <- matchthem(OSP ~ AGE + SEX + BMI + RAC + SMK,
                              imputed.datasets,
                              approach = 'within',
                              method = 'nearest')
#2
#Loading the dataset
data(osteoarthritis)
#Multiply imputing the missing values
imputed.datasets <- Amelia::amelia(osteoarthritis, m = 5,
                                   noms = c("SEX", "RAC", "SMK", "OSP", "KOA"))
#Matching the multiply imputed datasets
matched.datasets <- matchthem(OSP ~ AGE + SEX + BMI + RAC + SMK,
                              imputed.datasets,
                              approach = 'across',
                              method = 'nearest')
The mimids object contains data of matched multiply imputed datasets. mimids objects are generated by calls to matchthem().
mimids objects have methods for print(), summary(), plot(), and cbind().
The MatchThem package does not use the S4 class definitions and instead relies on the S3 list equivalents.
Farhad Pishgar
Stef van Buuren and Karin Groothuis-Oudshoorn (2011). mice: Multivariate Imputation by Chained Equations in R. Journal of Statistical Software, 45(3): 1-67. doi:10.18637/jss.v045.i03
The mimipo object contains data of the multiply imputed pooled outcome. mimipo objects are generated by calls to pool().
mimipo objects have methods for the print() and summary() functions (please see the mice package reference manual for details).
The MatchThem package does not use the S4 class definitions and instead relies on the S3 list equivalents.
Farhad Pishgar
Stef van Buuren and Karin Groothuis-Oudshoorn (2011). mice: Multivariate Imputation by Chained Equations in R. Journal of Statistical Software, 45(3): 1-67. doi:10.18637/jss.v045.i03
The mimira object contains data of multiply imputed repeated analyses. mimira objects are generated by calls to with().
mimira objects have methods for the print() and summary() functions (please see the mice package reference manual for details).
The MatchThem package does not use the S4 class definitions and instead relies on the S3 list equivalents.
Farhad Pishgar
Stef van Buuren and Karin Groothuis-Oudshoorn (2011). mice: Multivariate Imputation by Chained Equations in R. Journal of Statistical Software, 45(3): 1-67. doi:10.18637/jss.v045.i03
osteoarthritis includes demographic data of 2,585 units (individuals) with or at risk of knee osteoarthritis. The recorded data have missing values in body mass index (BMI, a quantitative variable), race (RAC, a categorical qualitative variable), smoking status (SMK, a binary qualitative variable), and knee osteoarthritis status at follow-up (KOA, a binary qualitative variable).
osteoarthritis
This dataset contains 2,585 rows and 7 columns. Each row presents data of a unit (individual) and each column presents data of a characteristic of that unit. The columns are:
AGE | Age of each unit (individual); |
SEX | Gender of each unit (individual), coded as 0 (female) and 1 (male); |
BMI | Estimated body mass index of each unit (individual); |
RAC | Race of each unit (individual), coded as 0 (other), 1 (Caucasian), 2 (African American), and 3 (Asian); |
SMK | The smoking status of each unit (individual), coded as 0 (non-smoker) and 1 (smoker); |
OSP | Osteoporosis status of each unit (individual) at baseline, coded as 0 (negative) and 1 (positive); and |
KOA | Knee osteoarthritis status of each unit (individual) in the follow-up, coded as 0 (at risk) and 1 (diagnosed). |
The information presented in the osteoarthritis dataset is based on the publicly available data of the Osteoarthritis Initiative (OAI) project (see https://nda.nih.gov/oai/ for details), with changes.
pool() pools estimates from the analyses done within each multiply imputed dataset. The typical sequence of steps to do a matching or weighting procedure on multiply imputed datasets is:
1. Multiply impute the missing values using the mice() function (from the mice package) or the amelia() function (from the Amelia package), resulting in a multiply imputed dataset (an object of the mids or amelia class);
2. Match or weight each multiply imputed dataset using matchthem() or weightthem(), resulting in an object of the mimids or wimids class;
3. Check the extent of balance of covariates in the datasets (using functions from the cobalt package);
4. Fit the statistical model of interest on each dataset by the with() function, resulting in an object of the mimira class; and
5. Pool the estimates from each model into a single set of estimates and standard errors, resulting in an object of the mimipo class.
pool(object, dfcom = NULL)
object | An object of the |
dfcom | A positive number representing the degrees of freedom in the data analysis. The default is |
The pool() function averages the estimates of the model and computes the total variance over the repeated analyses by Rubin’s rules. It calls mice::pool() after computing the model degrees of freedom.
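For intuition, Rubin's rules for a single coefficient can be sketched in base R as below. The helper rubin_pool and the input numbers are hypothetical and made up for illustration; the sketch covers only the pooled estimate and total variance, not the degrees-of-freedom calculation that pool() and mice::pool() also perform.

```r
# Hypothetical helper implementing Rubin's rules for a scalar estimate.
# 'est' holds the m point estimates, 'se' their standard errors.
rubin_pool <- function(est, se) {
  m <- length(est)
  qbar <- mean(est)   # pooled point estimate
  ubar <- mean(se^2)  # within-imputation variance
  b <- var(est)       # between-imputation variance
  total <- ubar + (1 + 1 / m) * b  # total variance
  c(estimate = qbar, se = sqrt(total))
}

# Made-up estimates from m = 5 analyses, for illustration only
est <- c(0.52, 0.47, 0.55, 0.50, 0.49)
se <- c(0.10, 0.11, 0.10, 0.12, 0.10)
pooled <- rubin_pool(est, se)
```

The pooled standard error is never smaller than the square root of the average within-imputation variance, because the between-imputation term adds the uncertainty due to the missing data.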
This function returns an object from the mimipo class. Methods for mimipo objects (e.g., print(), summary(), etc.) are imported from the mice package.
Stef van Buuren and Karin Groothuis-Oudshoorn (2011). mice: Multivariate Imputation by Chained Equations in R. Journal of Statistical Software, 45(3): 1-67. doi:10.18637/jss.v045.i03
#Loading the dataset
data(osteoarthritis)
#Multiply imputing the missing values
imputed.datasets <- mice::mice(osteoarthritis, m = 5)
#Weighting the multiply imputed datasets
weighted.datasets <- weightthem(OSP ~ AGE + SEX + BMI + RAC + SMK,
                                imputed.datasets,
                                approach = 'within',
                                method = 'glm')
#Analyzing the weighted datasets
models <- with(weighted.datasets,
               WeightIt::glm_weightit(KOA ~ OSP, family = binomial))
#Pooling results obtained from analyzing the datasets
results <- pool(models)
summary(results)
Trims (i.e., truncates) large weights by setting all weights higher than that at a given quantile to the weight at the quantile. This can be useful in controlling extreme weights, which can reduce effective sample size by enlarging the variability of the weights.
## S3 method for class 'wimids'
trim(x, at = 0, lower = FALSE, ...)
x | A |
at | |
lower | |
... | Ignored. |
trim.wimids() works by calling WeightIt::trim() on each weightit object stored in the models component of the wimids object. Because trim() itself is not exported from MatchThem, it must be called using WeightIt::trim() or by attaching WeightIt (i.e., running library(WeightIt)) before use.
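The idea of trimming at a quantile can be sketched in base R as follows. The helper names trim_sketch and ess, and the example weights, are hypothetical; this is not WeightIt's implementation, which supports more options than shown here. The effective sample size uses the common formula (sum of weights)^2 / (sum of squared weights).

```r
# Hypothetical helper (not WeightIt's implementation): cap weights above the
# 'at' quantile at the value of that quantile.
trim_sketch <- function(w, at = 0.9) {
  cap <- quantile(w, probs = at, names = FALSE)
  pmin(w, cap)
}

# Effective sample size of a set of weights
ess <- function(w) sum(w)^2 / sum(w^2)

# Made-up weights with two extreme values, for illustration only
w <- c(0.8, 1.1, 1.3, 0.9, 1.0, 1.2, 0.7, 1.4, 6.5, 12.0)
w_trimmed <- trim_sketch(w, at = 0.9)

# Compare the effective sample size before and after trimming
ess_before <- ess(w)
ess_after <- ess(w_trimmed)
```

With these made-up weights, capping the largest weight raises the effective sample size, which is the motivation given above for trimming extreme weights.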
An object from the wimids class, identical to the original object except with trim() applied to each of the weightit objects in the models component.
Noah Greifer
#Loading the dataset
data(osteoarthritis)
#Multiply imputing the missing values
imputed.datasets <- mice::mice(osteoarthritis, m = 5)
#Estimating weights of observations in the multiply imputed datasets
weighted.datasets <- weightthem(OSP ~ AGE + SEX + BMI + RAC + SMK,
                                imputed.datasets,
                                approach = 'within',
                                method = 'glm',
                                estimand = "ATE")
#Trimming the top 10% of weights in each dataset
#to the 90th percentile
#(trim() is not exported from MatchThem, so it is called from WeightIt)
trimmed.datasets <- WeightIt::trim(weighted.datasets, at = 0.9)
weightthem() performs weighting in the supplied multiply imputed datasets, given as mids or amelia objects, by running WeightIt::weightit() on each of the multiply imputed datasets with the supplied arguments.
weightthem(formula, datasets, approach = "within", method = "glm", ...)
formula | A |
datasets | The datasets containing the exposure and covariates mentioned in the |
approach | The approach used to combine information in multiply imputed datasets. Currently, |
method | The method used to estimate weights. See |
... | Additional arguments to be passed to |
If an amelia object is supplied to datasets, it will be transformed into a mids object for further use. weightthem() works by calling mice::complete() on the mids object to extract each complete dataset, and then calls WeightIt::weightit() on each dataset, storing the output of each weightit() call and the mids object in the output. All arguments supplied to weightthem() except datasets and approach are passed directly to weightit(). With the "across" approach, the estimated propensity scores are averaged across imputations and re-supplied to another set of calls to weightit().
An object of the wimids (weighted multiply imputed datasets) class, which includes the supplied mids object (or an amelia object transformed into a mids object if supplied) and the output of the calls to weightit() on each multiply imputed dataset.
Farhad Pishgar and Noah Greifer
Stef van Buuren and Karin Groothuis-Oudshoorn (2011). mice: Multivariate Imputation by Chained Equations in R. Journal of Statistical Software, 45(3): 1-67. doi:10.18637/jss.v045.i03
#1
#Loading the dataset
data(osteoarthritis)
#Multiply imputing the missing values
imputed.datasets <- mice::mice(osteoarthritis, m = 5)
#Estimating weights of observations in the multiply imputed datasets
weighted.datasets <- weightthem(OSP ~ AGE + SEX + BMI + RAC + SMK,
                                imputed.datasets,
                                approach = 'within',
                                method = 'glm',
                                estimand = 'ATT')
#2
#Loading the dataset
data(osteoarthritis)
#Multiply imputing the missing values
imputed.datasets <- Amelia::amelia(osteoarthritis, m = 5,
                                   noms = c("SEX", "RAC", "SMK", "OSP", "KOA"))
#Estimating weights of observations in the multiply imputed datasets
weighted.datasets <- weightthem(OSP ~ AGE + SEX + BMI + RAC + SMK,
                                imputed.datasets,
                                approach = 'within',
                                method = 'glm',
                                estimand = 'ATT')
The wimids object contains data of weighted multiply imputed datasets. wimids objects are generated by calls to weightthem().
wimids objects have methods for print(), summary(), and cbind().
The MatchThem package does not use the S4 class definitions and instead relies on the S3 list equivalents.
Farhad Pishgar
Stef van Buuren and Karin Groothuis-Oudshoorn (2011). mice: Multivariate Imputation by Chained Equations in R. Journal of Statistical Software, 45(3): 1-67. doi:10.18637/jss.v045.i03
with() runs a model on the n multiply imputed datasets of the supplied mimids or wimids object. The typical sequence of steps to do a matching or weighting procedure on multiply imputed datasets is:
1. Multiply impute the missing values using the mice() function (from the mice package) or the amelia() function (from the Amelia package), resulting in a multiply imputed dataset (an object of the mids or amelia class);
2. Match or weight each multiply imputed dataset using matchthem() or weightthem(), resulting in an object of the mimids or wimids class;
3. Check the extent of balance of covariates in the datasets (using functions from the cobalt package);
4. Fit the statistical model of interest on each dataset by the with() function, resulting in an object of the mimira class; and
5. Pool the estimates from each model into a single set of estimates and standard errors, resulting in an object of the mimipo class.
## S3 method for class 'mimids'
with(data, expr, cluster, ...)
## S3 method for class 'wimids'
with(data, expr, ...)
data | A |
expr | An expression (usually a call to a modeling function like |
cluster | When a function from survey (e.g., |
... | Additional arguments to be passed to |
with() applies the supplied model in expr to the (matched or weighted) multiply imputed datasets, automatically incorporating the (matching) weights when possible. The argument to expr should be of the form glm(y ~ z, family = quasibinomial), for example, excluding the data and weights arguments, which are automatically supplied.
Functions from the survey package, such as svyglm(), are treated a bit differently. No svydesign object needs to be supplied because with() automatically constructs and supplies it with the imputed dataset and estimated weights. When cluster = TRUE (or with() detects that pairs should be clustered; see the cluster argument above), pair membership is supplied to the ids argument of svydesign().
After weighting using weightthem(), glm_weightit() should be used as the modeling function to fit generalized linear models. It correctly produces robust standard errors that account for estimation of the weights, if possible. See WeightIt::glm_weightit() for details. Otherwise, svyglm() should be used rather than glm() in order to correctly compute standard errors. For Cox models, coxph() will produce approximately correct standard errors when used with weighting, but svycoxph() will produce more accurate standard errors when matching is used.
An object from the mimira class containing the output of the analyses.
Farhad Pishgar and Noah Greifer
Stef van Buuren and Karin Groothuis-Oudshoorn (2011). mice: Multivariate Imputation by Chained Equations in R. Journal of Statistical Software, 45(3): 1-67. doi:10.18637/jss.v045.i03
#Loading libraries
library(survey)
#Loading the dataset
data(osteoarthritis)
#Multiply imputing the missing values
imputed.datasets <- mice::mice(osteoarthritis, m = 5)
#Matching in the multiply imputed datasets
matched.datasets <- matchthem(OSP ~ AGE + SEX + BMI + RAC + SMK,
                              imputed.datasets,
                              approach = 'within',
                              method = 'nearest')
#Analyzing the matched datasets
models <- with(matched.datasets,
               svyglm(KOA ~ OSP, family = binomial),
               cluster = TRUE)
#Weighting in the multiply imputed datasets
weighted.datasets <- weightthem(OSP ~ AGE + SEX + BMI + RAC + SMK,
                                imputed.datasets,
                                approach = 'within',
                                method = 'glm')
#Analyzing the weighted datasets
models <- with(weighted.datasets,
               WeightIt::glm_weightit(KOA ~ OSP, family = binomial))