Advisor

Yu, Han

Committee Member

Shafie, Khalil

Committee Member

Merchant, William

Committee Member

Romulo, Chelsie

Department

College of Educational and Behavioral Sciences, Department of Applied Statistics and Research Methods

Institution

University of Northern Colorado

Type of Resources

Text

Place of Publication

Greeley (Colo.)

Publisher

University of Northern Colorado

Date Created

5-2022

Extent

149 pages

Digital Origin

Born digital

Abstract

High-dimensional data are increasingly common in the physical and social sciences. This study proposed a new, computationally efficient sample-splitting method, Neighborhood-Based Cross Fitting (NBCF), for double machine learning in causal inference with high-dimensional data. Repeated data splitting is a common existing approach to the overfitting problem in high-dimensional statistics, but it is computationally expensive. The proposed method handles post-selection bias in causal inference in the presence of high-dimensional confounders. It also matches the performance of repeated data splitting in unbiased estimation; repeated splitting has been suggested as a way to extend the admissible function classes beyond the Donsker class. Simulation studies demonstrated that NBCF is more computationally efficient than existing sample-splitting methods and also reduces bias more effectively. Under certain conditions, the simulation results further showed that the proposed estimators are consistent, asymptotically unbiased, and normally distributed, which allows construction of valid confidence intervals. The practical application of NBCF was illustrated with a real dataset.
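For orientation, the sketch below shows ordinary K-fold cross-fitting for double machine learning in a partially linear model. It is not NBCF, whose neighborhood construction is not described in this abstract; it only illustrates the kind of sample splitting that NBCF is compared against. The function names (`ridge_fit`, `cross_fit_plr`), the ridge nuisance learner, and the simulated data are all assumptions made for this illustration.

```python
import numpy as np


def ridge_fit(X, target, lam):
    """Ridge coefficients without an intercept (hypothetical nuisance learner)."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ target)


def cross_fit_plr(y, d, X, n_folds=5, lam=1.0, seed=0):
    """Cross-fitted partialling-out estimate of theta in y = theta*d + g(X) + e.

    Nuisance functions E[y|X] and E[d|X] are fitted on the training folds and
    evaluated only on the held-out fold, so no observation is residualized by
    a model that was trained on it.
    """
    n = len(y)
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n), n_folds)
    y_res = np.empty(n)
    d_res = np.empty(n)
    for test_idx in folds:
        train = np.ones(n, dtype=bool)
        train[test_idx] = False
        for target, res in ((y, y_res), (d, d_res)):
            beta = ridge_fit(X[train], target[train], lam)
            res[test_idx] = target[test_idx] - X[test_idx] @ beta
    # Regress outcome residuals on treatment residuals.
    theta = (d_res @ y_res) / (d_res @ d_res)
    # Plug-in standard error from the orthogonal score.
    psi = d_res * (y_res - theta * d_res)
    jac = np.mean(d_res ** 2)
    se = np.sqrt(np.mean(psi ** 2) / jac ** 2 / n)
    return theta, se


# Simulated data (assumed design): linear confounding, true effect 0.5.
rng = np.random.default_rng(42)
n, p, true_theta = 2000, 20, 0.5
X = rng.normal(size=(n, p))
gamma = rng.normal(size=p)
beta = rng.normal(size=p)
d = X @ gamma + rng.normal(size=n)
y = true_theta * d + X @ beta + rng.normal(size=n)

theta_hat, se_hat = cross_fit_plr(y, d, X)
ci = (theta_hat - 1.96 * se_hat, theta_hat + 1.96 * se_hat)
print(f"theta_hat={theta_hat:.3f}, se={se_hat:.3f}, 95% CI=({ci[0]:.3f}, {ci[1]:.3f})")
```

Repeated data splitting, as mentioned in the abstract, would rerun `cross_fit_plr` with several different `seed` values and aggregate the estimates, which is where the extra computational cost comes from.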

Degree type

PhD

Degree Name

Doctoral

Local Identifiers

Agboola_unco_0161D_11005.pdf

Rights Statement

Copyright is held by the author.
