Hutchinson, Susan R.

Committee Member

Tsai, Chia-Lin

Committee Member

Yu, Han

Committee Member

Reardon, James


College of Education and Behavioral Sciences; Department of Applied Statistics and Research Methods


University of Northern Colorado

Type of Resources


Place of Publication

Greeley (Colo.)


University of Northern Colorado

Date Created



206 pages

Digital Origin

Born digital


There has been increasing interest in using Bayesian growth modeling in a machine learning environment to answer questions about patterns of change in social and human behavior in longitudinal data. It is well understood that machine learning works well with “big data,” because large samples give machines a better opportunity to “learn” the pattern or structure of the data from a training data set and to predict performance in an unseen testing data set. Unfortunately, not all researchers have access to large samples, and there is a lack of methodological research addressing the utility of machine learning with longitudinal data based on small samples. Additionally, there is limited methodological research on the moderating effect that priors have on other data conditions. Therefore, the purpose of the current study was to understand: (a) the interactive relationship between priors and sample size in longitudinal predictive modeling, (b) the interactive relationship between priors and the number of waves of data, and (c) the interactive relationship between priors and the proportion of cases in the two levels of a dichotomous time-invariant predictor for Bayesian growth modeling in a machine learning environment. Monte Carlo simulation was used to assess these relationships. Data were generated based on alumni donation data from a university in the mid-Atlantic region, with model parameters set to mimic “real life” data as closely as possible. Results show that although all main and interaction effects were statistically significant, only the main effects of sample size and number of waves of data, and the interaction between number of waves and sample size, had meaningful effect sizes. Additionally, under the prior conditions examined in the study, informative priors did not yield higher prediction accuracy than non-informative priors.
The similar performance of informative and non-informative priors is associated with model complexity and with competition between strongly informative and weakly informative priors. This study is one of the first known studies to examine Bayesian estimation in the context of machine learning. Results of the current study suggest that capitalizing on the advantages offered jointly by these two modeling approaches shows promise. Although much is still unknown and in need of investigation regarding the conditions under which a combination of Bayesian modeling and machine learning affects prediction accuracy, the current dissertation provides a first step in that direction.

Degree type


Degree Name


Local Identifiers


Rights Statement

Copyright is held by the author.