ABSTRACT

The multivariate adaptive regression spline (MARS) is a nonparametric alternative to the Gaussian graphical model (GGM) for constructing undirected networks of complex biological systems. In GGM, the precision matrix is estimated via lasso regression from the system’s elements under their conditional independencies, and its binary form is assumed to represent the undirected interactions between these elements. Previous studies have shown that MARS can be a strong alternative to GGM if it is estimated by discarding all interaction and nonlinear terms, so that only the main effects are inferred. We call this reduced MARS model the lasso-based MARS model. Furthermore, in constructing networks via the optimal MARS model, it is assumed that the estimated regression coefficients, i.e., the important terms of the estimated model, indicate the pairwise links between the species of the system. In this study, we investigate whether networks inferred via the Pearson, Spearman, and Kendall’s tau correlation coefficients perform better than networks constructed from the important regression coefficients of the lasso-based MARS model. In the analyses, we use both simulated and real datasets and compare the estimated networks using various accuracy measures, including specificity, F-measure, and the Matthews correlation coefficient. We find that the suggested approaches significantly improve all the listed criteria without additional computational cost.
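
As a rough illustration of the comparison described above, the following minimal sketch builds an undirected network from pairwise Pearson correlations and scores it against a known network using specificity, F-measure, and the Matthews correlation coefficient. It is not the procedure used in the study: the function names, the fixed cutoff `threshold` used to binarize the correlations, and the toy data are all assumptions introduced here for illustration. A Spearman variant could be obtained by applying the same function to ranked data.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def correlation_network(columns, threshold):
    """Return the set of edges (i, j), i < j, whose |correlation| >= threshold.

    `columns` holds one list of observations per variable. The hard cutoff
    is an assumption of this sketch, not a rule taken from the study.
    """
    return {
        (i, j)
        for i, j in combinations(range(len(columns)), 2)
        if abs(pearson(columns[i], columns[j])) >= threshold
    }


def accuracy_measures(true_edges, est_edges, n_nodes):
    """Specificity, F-measure, and MCC of an estimated undirected network."""
    pairs = set(combinations(range(n_nodes), 2))
    tp = len(true_edges & est_edges)
    fp = len(est_edges - true_edges)
    fn = len(true_edges - est_edges)
    tn = len(pairs) - tp - fp - fn

    # Undefined ratios (zero denominators) are reported as 0.0 by convention.
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    f_measure = 2 * tp / (2 * tp + fp + fn) if (2 * tp + fp + fn) else 0.0
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"specificity": specificity, "f_measure": f_measure, "mcc": mcc}


# Toy data: variables 0 and 1 are perfectly correlated, variable 2 is not.
toy = [[1, 2, 3, 4, 5], [2, 4, 6, 8, 10], [3, 1, 4, 1, 5]]
estimated = correlation_network(toy, threshold=0.8)
scores = accuracy_measures({(0, 1)}, estimated, n_nodes=3)
```

In this sketch the edge sets are compared over all unordered node pairs, so each possible link counts once, which matches the undirected setting.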