First Advisor

Shafie, Khalil

Document Type

Dissertation

Date Created

8-2017

Abstract

Functional Data Analysis (FDA) has attracted substantial attention for the last two decades. Within FDA, classifying curves into two or more categories is consistently of interest to scientists, but multi-class prediction within FDA is challenged in that most classification tools have been limited to binary response applications. The functional logistic regression (FLR) model was developed to forecast a binary response variable in the functional case. In this study, a functional nominal multinomial logit regression (F-NM-LR) model was developed that shifts the FLR model into a multiple logit model. However, the model generates inaccurate parameter function estimates due to multicollinearity in the design matrix. A generalized partial least squares (GPLS) approach with cubic B-spline basis expansions was developed to address the multicollinearity and high dimensionality problems that preclude accurate estimates and curve discrimination with the F-NM-LR model. The GPLS method extends partial least squares (PLS) and improves upon current methodology by introducing a component selection criterion that reconstructs the parameter function with fewer predictors. The GPLS regression estimates are derived via Iteratively ReWeighted Partial Least Squares (IRWPLS), defining a set of uncorrelated latent variables to use as predictors for the F-GPLS-NM-LR model. This methodology was compared to the classic alternative estimation method of principal component regression (PCR) in a simulation study. The performance of the proposed methodology was tested via simulations and applications on a spectrometric dataset. The results indicate that the GPLS method performs well in multi-class prediction with respect to the F-NM-LR model. The main difference between the two approaches was that PCR usually requires more components than GPLS to achieve similar accuracy of parameter function estimates of the F-GPLS-NM-LR model. The results of this research imply that the GPLS method is preferable to the F-NM-LR model, and it is a useful contribution to FDA techniques. This method may be particularly appropriate for practical situations where accurate prediction of a response variable with fewer components is a priority.

Keywords

Mathematical statistics; Multivariate analysis

Extent

230 pages

Local Identifiers

albaqshi_unco_0161D-10577

Rights Statement

Copyright is held by the author.

Share

COinS