Advisor
Yu, Han
Committee Member
Shafie, Khalil
Committee Member
Merchant, William
Committee Member
Romulo, Chelsie
Department
College of Educational and Behavioral Sciences, Department of Applied Statistics and Research Methods
Institution
University of Northern Colorado
Type of Resources
Text
Place of Publication
Greeley, (Colo.)
Publisher
University of Northern Colorado
Date Created
5-2022
Extent
149 pages
Digital Origin
Born digital
Abstract
High-dimensional data are increasingly common in the physical and social sciences. This study proposed a new, computationally efficient sample-splitting method, Neighborhood-Based Cross Fitting (NBCF), for double machine learning in causal inference on high-dimensional data. Repeated data splitting is a common existing approach to the overfitting problem in high-dimensional statistics; however, it is computationally expensive. The proposed method handles the problem of post-selection bias in causal inference in the presence of high-dimensional confounders. It also performs equivalently, in terms of unbiased estimation, to repeated data splitting, which has been suggested as a way to expand the scope of the function class beyond the Donsker class. Simulation studies were conducted to demonstrate that the proposed NBCF approach is not only more computationally efficient than existing sample-splitting methods but also better at reducing bias than other existing methods. Under certain conditions, simulation results further showed that the proposed estimators are consistent, asymptotically unbiased, and normally distributed, which allows construction of valid confidence intervals. The practical application of NBCF was illustrated with a real dataset.
Degree type
PhD
Degree Name
Doctoral
Local Identifiers
Agboola_unco_0161D_11005.pdf
Rights Statement
Copyright is held by the author.