SOLAS 4.0 New Features
Mahalanobis Distance Matching Method
The Mahalanobis distance is a metric that measures the dissimilarity between two vectors. Here, each vector is a case from the dataset, made up of that case's values for the covariates specified for the calculation.
Generation of Imputations
Consider that $x_i$ represents the vector for the case containing the missing value and $x_j$ is a complete case. The distance between these is calculated as follows:
$$d(x_i, x_j) = \sqrt{(x_i - x_j)^{T} S^{-1} (x_i - x_j)}$$
where $S$ is the covariance matrix.
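As an illustration of the distance calculation, the following is a minimal sketch in Python with NumPy. It is not SOLAS code: the function name, the variable names `x_i` and `x_j`, and the small data matrix are hypothetical, and the covariance matrix is assumed to be the sample covariance of the covariates.

```python
import numpy as np

def mahalanobis_distance(x_i, x_j, S):
    """Mahalanobis distance between two covariate vectors, given covariance S."""
    diff = np.asarray(x_i, dtype=float) - np.asarray(x_j, dtype=float)
    # Solve S z = diff instead of forming the explicit inverse of S.
    z = np.linalg.solve(S, diff)
    return float(np.sqrt(diff @ z))

# Hypothetical covariate data: rows are cases, columns are covariates.
X = np.array([[1.0, 2.0], [2.0, 1.0], [3.0, 5.0], [4.0, 3.0], [5.0, 6.0]])
S = np.cov(X, rowvar=False)  # assumed: sample covariance of the covariates
d = mahalanobis_distance(X[0], X[2], S)
```

When S is the identity matrix, the result reduces to the ordinary Euclidean distance, which is a convenient sanity check.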
Each missing value of the imputation variable y is imputed with a value drawn at random from its donor pool: the subset of cases with observed values of y that have the shortest Mahalanobis distance to the case being imputed.
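The imputation step might be sketched as follows in Python with NumPy. This is an illustrative reading of the description above, not the SOLAS implementation: the function name, the `pool_size` and `seed` parameters, and the use of a uniform random draw within the donor pool are all assumptions.

```python
import numpy as np

def impute_by_mahalanobis(X, y, pool_size=10, seed=0):
    """Fill NaNs in y by drawing from the pool_size nearest observed cases."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float).copy()
    rng = np.random.default_rng(seed)
    # Assumed: S is the sample covariance of the covariates over all cases.
    S_inv = np.linalg.inv(np.atleast_2d(np.cov(X, rowvar=False)))
    # Only originally observed cases can act as donors.
    observed = np.flatnonzero(~np.isnan(y))
    for i in np.flatnonzero(np.isnan(y)):
        diff = X[observed] - X[i]
        # Squared Mahalanobis distance from case i to every observed case;
        # the square root is omitted because it does not change the ranking.
        d2 = np.einsum("ij,jk,ik->i", diff, S_inv, diff)
        donors = observed[np.argsort(d2)[:pool_size]]
        # Assumed: each donor in the pool is equally likely to be drawn.
        y[i] = y[rng.choice(donors)]
    return y
```

Because the donor indices are fixed before the loop starts, values imputed earlier in the loop are never reused as donors for later cases.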
Defining Donor Pools Based on Mahalanobis Distances
The Donor Pool page gives the user control over the random draw step in the analysis. You can set the subset ranges and refine them further using another variable, the Refinement Variable, which is described below.
Two ways of defining the Donor Pool sub-classes are provided:
- You can use the subset of c cases that are closest with respect to Mahalanobis distance. This option lets you specify the number of cases to include in the sub-class. The default for c is 10, and c cannot be set to a value less than 2. If fewer than 2 cases are available, a value of 5 is used for c.
- You can use the subset of d% of the cases that are closest with respect to Mahalanobis distance. This is the percentage of “closest” cases in the data set to include in the sub-class. The default for d is 10.00, and d cannot be set to a value that would result in fewer than 2 cases being available. If fewer than 2 cases are available, a d value of 5 is used.
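The two options above can be sketched as a pair of small Python helpers that turn either setting into a donor pool size. The function names are hypothetical, and two details are assumptions rather than documented SOLAS behaviour: the percentage is rounded up to a whole number of cases, and c is capped at the number of observed cases. The fallback values described above are not modelled; invalid settings simply raise an error here.

```python
import math

def pool_size_from_count(c=10, n_observed=0):
    """Donor pool size when the user asks for the c closest cases."""
    if c < 2:
        raise ValueError("c cannot be set to a value less than 2")
    # Assumed: the pool cannot be larger than the number of observed cases.
    return min(c, n_observed)

def pool_size_from_percent(d=10.0, n_observed=0):
    """Donor pool size when the user asks for the closest d% of cases."""
    # Assumed: the percentage is rounded up to a whole number of cases.
    size = math.ceil(n_observed * d / 100.0)
    if size < 2:
        raise ValueError("d would leave fewer than 2 cases in the donor pool")
    return size
```

For example, with 200 observed cases the default d of 10.00 gives a pool of 20 cases, while the default c of 10 gives a pool of 10 cases regardless of the data set size.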