Master of Science (MS)
In the statistical analysis of a health research study, it is quite common to have some missing data after data collection. In a clinical trial, the treatment variable is typically recorded completely, but the associated covariates may not be. Multivariable analysis is often conducted by including all the medically relevant covariates, with the expectation that a valid estimate of the treatment effect can be obtained by properly adjusting for them. In this scenario, if the covariate data are Missing Not at Random (MNAR), the situation becomes complicated and the estimate of the treatment effect obtained will be invalid. The situation in which the data are Missing Completely at Random (MCAR) is interesting because a dilemma exists: if covariates with a high proportion of missing values are included, the analysis loses power, although the validity might be good; if they are excluded, the validity might be in question, but the precision is good. Although the literature suggests that validity is more important, there might be cases where omitting a covariate from the analysis improves precision substantially at the cost of a small sacrifice in validity. In this thesis, this dilemma is evaluated in the context of multivariable logistic regression, with the hope that some of the results will shed light on the situation. This work is significant in that it could potentially change the data collection process. For example, at the research design stage, if a covariate is expected to have a high rate of missingness, there might be little to gain by collecting it. If we decide that a covariate does not need to be collected, the relevant resources can be released and applied to other important aspects of the study.
Zhao, Kai, "Bias and Efficiency of Logistic Regression involving a Binary Covariate with Missing Observations" (2010). Open Access Dissertations and Theses. Paper 4139.
McMaster University Library