Yu, Han

Committee Member

Shafie, Khalil

Committee Member

Khaledi, Bahaedin

Committee Member

Sung, Yoon Tae


College of Educational and Behavioral Sciences, Department of Applied Statistics and Research Methods


University of Northern Colorado

Type of Resources


Place of Publication

Greeley (Colo.)


University of Northern Colorado

Date Created



147 pages

Digital Origin

Born digital


Machine-learning algorithms have gained popularity and attracted the attention of many researchers in statistics and computer science in recent decades. Because of their computational capabilities with big data, many researchers have attempted to incorporate machine learning into prediction and inference problems. One recent method that has received considerable attention is the double machine learning (DML) method. It estimates the effect of a treatment variable in the presence of a high-dimensional nuisance function by incorporating machine-learning algorithms. Previous studies have shown that the DML method can reduce bias in estimating the target parameter when many covariates are present in the dataset. This dissertation proposes a method referred to as the double super learner (DSL) method. Because the many machine-learning algorithms in existence today differ in their search strategies, there is no way to know in advance which algorithm will perform best for a given dataset. The proposed DSL method was developed in parallel with the DML method and works by combining several machine-learning algorithms via the super learner function. Numerical simulations were performed across various data settings, varying the sample size, the number of associated covariates, and the type of treatment variable. Compared with the original DML method, the simulations showed that the proposed method reduced bias and provided valid confidence intervals in situations where the original method did not. A package called DoubleSL was then developed and made public for those who wish to use this method in their research. In addition, real-data examples are included in the package to demonstrate the use of the method.

Degree type


Degree Name


Local Identifiers


Rights Statement

Copyright is held by the author.