New Pretreatment Methods for Visible–Near-Infrared Calibration Modeling of Air-Dry Density of Ulmus pumila Wood
Abstract
Multidimensional complexity and redundancy among wavelengths in the visible and near-infrared (Vis-NIR) region can reduce the speed and accuracy of data analysis. This study investigated the feasibility of simplifying high-dimensional data through transformation of the spectra and local correlation maximization (LCM). Both methods were applied to predict the air-dry density of Ulmus pumila wood. The reflectance spectra (Refl.) were converted to reciprocal (1/Refl.) and logarithmic forms to improve the spectral signal for prediction. LCM was developed to select the spectrally sensitive regions that were important for predicting density. A local correlation coefficient (r) criterion was developed: where r ≥ 0.75 (between wavelength and density), partial least squares and support vector machine (SVM) were employed as the prediction methods. In addition, 2D correlation spectroscopy plots were used to further reduce the data matrix by removing redundant wavelengths. The results showed that (1) although the density-sensitive region differed among the spectral transformations, the region of r ≥ 0.80 was mainly in the Vis and NIR spectral region. Additionally, models developed from the sensitive region performed better than models developed from the less sensitive region. (2) The SVM model was optimized by a genetic algorithm based on the log(1/Refl.) of the sensitive region. In conclusion, the spectral transformation gave better density estimation results ( = 0.909, root mean square error of calibration = 0.014) than when less sensitive wavelengths were included in the data matrix.
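The transformation and screening steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' LCM procedure: it is a simplified per-wavelength stand-in that keeps a wavelength when the absolute Pearson correlation between its transformed reflectance and density reaches the 0.75 criterion. The function names, the base-10 logarithm, the use of the absolute value of r, and the toy data are all assumptions made for illustration.

```python
import math

def transform_spectrum(refl, kind):
    """Return one of the four spectral forms for a reflectance spectrum.

    kind: "refl", "inv" (1/Refl.), "log" (log Refl.), "log_inv" (log 1/Refl.).
    Base-10 logarithms are an assumption; the abstract does not state the base.
    """
    if kind == "refl":
        return list(refl)
    if kind == "inv":
        return [1.0 / r for r in refl]
    if kind == "log":
        return [math.log10(r) for r in refl]
    if kind == "log_inv":
        return [math.log10(1.0 / r) for r in refl]
    raise ValueError(f"unknown transformation: {kind}")

def pearson_r(x, y):
    """Pearson correlation coefficient; returns 0.0 if either input is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)

def sensitive_wavelengths(spectra, density, threshold=0.75):
    """Indices of wavelength columns with |r| >= threshold against density.

    spectra: list of samples, each a list of values per wavelength.
    Using |r| rather than signed r is an assumption of this sketch.
    """
    selected = []
    for j in range(len(spectra[0])):
        column = [sample[j] for sample in spectra]
        if abs(pearson_r(column, density)) >= threshold:
            selected.append(j)
    return selected

# Toy data (hypothetical): 4 samples x 3 wavelengths of reflectance.
reflectance = [
    [0.8, 0.4, 0.5],
    [0.7, 0.6, 0.5],
    [0.6, 0.4, 0.5],
    [0.5, 0.6, 0.5],
]
density = [0.5, 0.6, 0.7, 0.8]  # hypothetical air-dry densities

log_inv = [transform_spectrum(sample, "log_inv") for sample in reflectance]
selected = sensitive_wavelengths(log_inv, density, threshold=0.75)
```

In this toy set only the first wavelength tracks density closely, so it is the only column retained; the reduced matrix would then be passed to whichever regression method is being calibrated.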

2D correlation coefficient plot between different spectra of transformations and corresponding wavelength variables. (A) Refl. spectral data; (B) 1/Refl. spectral data; (C) log(Refl.) spectral data; (D) log(1/Refl.) spectral data.

2D local correlation coefficient plot between different spectra of transformations and air-dry density.

Sensitive region of air-dry density for different spectra of transformations.

The iterative fitness trend of the genetic algorithm–optimized support vector machine for searching the optimization parameter.

The results of the genetic algorithm–optimized support vector machine model for the sensitive region (r ≥ 0.75). RMSEP is root mean square error of prediction; SEP is standard error of prediction; MAPE is mean absolute percentage error; RPD is residual predictive deviation.
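The four statistics named in this caption can be computed as in the sketch below. The formulas are commonly used chemometric definitions and are assumptions here, since the page does not give them: SEP is taken as the bias-corrected standard deviation of the residuals, and RPD as the sample standard deviation of the reference values divided by SEP. The function name and the example values are hypothetical.

```python
import math

def prediction_metrics(y_true, y_pred):
    """Return RMSEP, SEP, MAPE (%) and RPD for reference vs. predicted values."""
    n = len(y_true)
    residuals = [p - t for t, p in zip(y_true, y_pred)]
    bias = sum(residuals) / n

    # Root mean square error of prediction
    rmsep = math.sqrt(sum(e * e for e in residuals) / n)

    # Standard error of prediction (bias-corrected, n - 1 denominator)
    sep = math.sqrt(sum((e - bias) ** 2 for e in residuals) / (n - 1))

    # Mean absolute percentage error, in percent
    mape = 100.0 * sum(abs(e / t) for e, t in zip(residuals, y_true)) / n

    # Residual predictive deviation: SD of reference values over SEP
    mean_true = sum(y_true) / n
    sd_true = math.sqrt(sum((t - mean_true) ** 2 for t in y_true) / (n - 1))
    rpd = sd_true / sep if sep > 0 else float("inf")

    return {"RMSEP": rmsep, "SEP": sep, "MAPE": mape, "RPD": rpd}

# Hypothetical densities, not data from the study
measured = [0.50, 0.60, 0.70, 0.80]
predicted = [0.51, 0.59, 0.71, 0.79]
metrics = prediction_metrics(measured, predicted)
```

Some authors divide by RMSEP rather than SEP when computing RPD; the two agree closely when the prediction bias is small.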
Contributor Notes
The authors are, respectively, Graduate Researcher, College of Engineering and Technol., Northeast Forestry Univ., Harbin, China (yingli@nefu.edu.cn); Director and Research Fellow, Forest Products Development Center, SFWS, Auburn Univ., Auburn, Alabama (brianvia@auburn.edu, qzc0007@auburn.edu); and Graduate Researcher and Professor, College of Engineering and Technol., Northeast Forestry Univ., Harbin, China (1874670424@qq.com, yaoxiangli@nefu.edu.cn [corresponding author]). This paper was received for publication in January 2019. Article no. 19-00004.