Abstract
The Internet is a powerful tool that can be leveraged to explore user search behavior. Google Trends is a compelling database that tracks how frequently users search for any given term. There is thus an opportunity to see whether search histories obtained from Google Trends can be combined with data analytics to tease out underlying relationships with searches related to cross-laminated timber (CLT). In this study, multiple linear regression was used to predict the search strength of the term cross laminated timber from 60 candidate variables that may be directly or indirectly associated with CLT. The search term CLT was modeled (R2 = 0.76) using a reduced model of 20 variables. However, while prediction strength was strong, our primary interest was to statistically classify and rank the variables that might be important to CLT. To achieve this, the Mallows' Cp statistic was used to build the most robust model possible. To check agreement with the literature, we also compared our study with another Web-based study and found a significant linear relationship between the t statistic in our study and the frequency of the same or similar search term in their study (R2 = 0.76). This agreement between studies helps to support our hypothesis that multiple linear regression coupled with Google Trends is a new tool that may assist marketers in identifying emerging trends important to CLT.
Cross-laminated timber (CLT) gained traction in the early 1980s in Germany and is currently widely used across Europe (Udele et al. 2021). CLT is considered to be more carbon friendly than competing materials such as steel, concrete, and brick (Espinoza et al. 2016, Franzini et al. 2021). Nonresidential buildings as tall as 18 stories are under current consideration or construction as society begins to accept the effectiveness of CLT against earthquakes and fire along with value attributes such as a favorable strength:weight ratio and carbon sequestration (Espinoza et al. 2016, MIT 2020). Over the past 5 years, CLT has garnered more attention in the United States, particularly in the Pacific Northwest. Figure 1 demonstrates this attention by showing the frequency of searches for CLT by state. Oregon has the highest search strength, followed by California and Washington. Conversely, the search strength for CLT in the southeastern United States is much lower and could be reflective of a lag in manufacturing and construction in this region, although industrial interest seems to be picking up recently. One of the goals of this research is to uncover terms that correlate with CLT and to gauge citizen interest.
The use of big data obtained from the Internet to analyze citizen behavior is gaining traction. For example, Thomas et al. (2020) used a Web crawler to investigate potential variables most associated with the term CLT. More recently, it was demonstrated that the lumber-futures price index could be predicted on a daily basis from Google Trends data (He et al. 2022). In the automotive industry, it was shown that the “live” automotive index could be used to predict real-time sales (Carrière-Swallow and Labbé 2013). In short, being able to monitor real-time data, such as that provided by Google Trends, can help policy makers or industry cohorts better navigate the business platform under dynamic conditions. This practice is called nowcasting, which is the prediction of the present with currently emerging data (He et al. 2022). In this paper, we considered daily acquired historical data rather than real-time modeling. We built multiple linear regression models that predict the search strength of the term CLT from variables or terms thought to be associated with CLT, and we used these models to identify and rank the variables that are important in predicting CLT search strength.
Materials and Methods
The search strength of the term CLT was obtained from the Google Trends database from January 2015 to April 2020 for a total of 151 sample points. Time-matched data were also downloaded from Google Trends for the variables outlined in the Appendix. Note that Google Trends standardizes each variable to a scale of 0 to 100 to represent interest over time. Data from the entire United States were used for modeling.
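As a rough illustration of the 0 to 100 scaling, the sketch below rescales a series so that its peak equals 100. The function name and the input values are hypothetical, and the sketch assumes the inputs are already relative search shares for one term, region, and time window; it is not Google's implementation.

```python
def scale_to_trends_index(search_shares):
    """Rescale a series so that its largest value maps to 100."""
    peak = max(search_shares)
    if peak <= 0:
        # No search interest at all: every point is reported as 0.
        return [0 for _ in search_shares]
    return [round(100 * value / peak) for value in search_shares]


# Hypothetical relative search shares for four consecutive periods.
shares = [120, 300, 240, 60]
print(scale_to_trends_index(shares))  # -> [40, 100, 80, 20]
```

Because each series is scaled against its own peak, a value of 50 for one term and 50 for another do not imply equal absolute search volume when the terms are downloaded separately.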
SAS (version 9.4; SAS Institute Inc., Cary, North Carolina) statistical software was used for all modeling and significance testing. Multiple linear regression was run to see whether Google Trends data could be used to predict the search strength of CLT. The PROC REG procedure was run with the Mallows' Cp selection method to pick the best model. The variance inflation factor (VIF) was also computed to ensure the coefficients were not inflated as a result of excessive correlation between the independent variables. It was assumed the coefficients were inflated if the VIF was >10. A P value <0.10 was used as the threshold to determine statistical significance. Because the P values of the independent variables changed as other variables were added or removed, a threshold of 0.10 was found to give generally more stable rankings.
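The VIF check can be sketched outside SAS as follows. This is a minimal standard-library illustration, not the PROC REG implementation: each predictor is regressed on the remaining predictors, and VIF = 1 / (1 - R2) is reported. The helper names and all data values are hypothetical.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def r_squared(columns, y):
    """R2 of an ordinary least squares fit of y on the columns plus an intercept."""
    n = len(y)
    design = [[1.0] + [col[i] for col in columns] for i in range(n)]
    p = len(design[0])
    xtx = [[sum(row[i] * row[j] for row in design) for j in range(p)] for i in range(p)]
    xty = [sum(row[i] * yi for row, yi in zip(design, y)) for i in range(p)]
    beta = solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, row)) for row in design]
    mean_y = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


def vif(columns):
    """Variance inflation factor of each predictor against all the others."""
    factors = []
    for j, target in enumerate(columns):
        others = columns[:j] + columns[j + 1:]
        factors.append(1.0 / (1.0 - r_squared(others, target)))
    return factors


# Hypothetical search-strength series; x3 is nearly a copy of x1.
x1 = [1, 2, 3, 4, 5, 6]
x2 = [1, -1, 0, 0, -1, 1]
x3 = [1.2, 1.9, 3.1, 4.2, 4.8, 6.1]
for name, value in zip(["x1", "x2", "x3"], vif([x1, x2, x3])):
    flag = "inflated" if value > 10 else "ok"  # threshold of 10 used in this study
    print(name, round(value, 1), flag)
```

The sketch fails with a division by zero when one predictor is an exact linear combination of the others, which is itself a sign that a variable should be dropped.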
Results and Discussion
Long-term model
A full multivariate model (60 variables) was run and then down-selected to the most robust model using Mallows' Cp as the selection criterion. The final model was reduced from 60 to 20 variables based on a P value criterion of <0.10. Table 1 is presented with the most significant variable at the top (concrete) and the least significant at the bottom (LEED). Figure 2 demonstrates the actual versus predicted performance of the model.
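For readers unfamiliar with the statistic, Mallows' Cp for a candidate subset is commonly computed as Cp = SSE_p / s2 - n + 2p, where SSE_p is the residual sum of squares of the subset model, s2 is the mean squared error of the full model, n is the number of observations, and p is the number of fitted parameters including the intercept. The sketch below evaluates every subset of two predictors; it is an illustration with hypothetical data and helper names, not the SAS procedure used in this study, and exhaustive enumeration of this kind would not scale to 60 variables.

```python
from itertools import combinations


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def sse(columns, y):
    """Residual sum of squares of a least squares fit with an intercept."""
    n = len(y)
    design = [[1.0] + [col[i] for col in columns] for i in range(n)]
    p = len(design[0])
    xtx = [[sum(row[i] * row[j] for row in design) for j in range(p)] for i in range(p)]
    xty = [sum(row[i] * yi for row, yi in zip(design, y)) for i in range(p)]
    beta = solve(xtx, xty)
    return sum((yi - sum(b * v for b, v in zip(beta, row))) ** 2
               for row, yi in zip(design, y))


def mallows_cp(subset, full, y):
    """Mallows' Cp of the subset model, judged against the full model."""
    n = len(y)
    p = len(subset) + 1          # parameters in the subset model (with intercept)
    k = len(full) + 1            # parameters in the full model (with intercept)
    s2 = sse(full, y) / (n - k)  # full-model mean squared error
    return sse(subset, y) / s2 - n + 2 * p


# Hypothetical search-strength series for two predictor terms and for CLT.
predictors = {
    "concrete": [1, 2, 3, 4, 5, 6],
    "code": [1, -1, 0, 0, -1, 1],
}
clt = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2]

names = list(predictors)
full = [predictors[name] for name in names]
for size in range(len(names) + 1):
    for combo in combinations(names, size):
        cols = [predictors[name] for name in combo]
        print(combo, round(mallows_cp(cols, full, clt), 2))
```

Subsets whose Cp is small and close to p are usually preferred; by construction, the full model always has Cp equal to its own parameter count.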
The term concrete was most related to the search term CLT (Table 1). Concrete is known to be preferred by the tall building industry for its low cost of construction, low cost of maintenance, and fire-retardant nature; however, wood multistory buildings are gaining traction with civil servants and land use planners as they learn about the resistance of wood to fire through charring (Babrauskas 2005). It has been shown that the charring of wood offers protection, even at low moisture content and across a range of wood densities (Janssens 2004). Concrete is also weaker in tension than wood, and reinforcement rods, such as those made of steel, are often needed to protect against earthquakes. In contrast, wood has a plastic response to earthquake-type forces because the lignin in the S1–S3 layers of the cell wall helps to provide plasticity against these sudden forces (Via et al. 2009). As such, CLT has a seismic advantage over concrete.
Hybrid composites was the second most statistically significant key word associated with CLT (Table 1). In the CLT literature, a hybrid composite has been defined as a combination of CLT and another material, such as concrete, to combine the improved tensile strength of wood with the higher compression strength of concrete (Mai et al. 2018). Wang et al. (2018) define a hybrid composite as the incorporation of wood composites as a laminate in CLT. As an example, a CLT member in a steel frame was used to combine the ductility and strength of steel with the high strength-to-weight ratio of wood (Dickof et al. 2012). In this study, vibration (P = 0.0561) and acoustics (P = 0.003) were important in predicting the search frequency of CLT. Seismic vibration testing has been shown to be useful for the simulation of earthquakes to ensure that tall timber (P = 0.0001) buildings can survive the event and absorb these short-term forces. The International Building Code has minimum requirements for sound insulation, and wood has been shown to be useful in acoustic function (IBC 2021).
Architecture was another key word heavily associated with CLT (Table 1). Architecture is gaining momentum in the forest products industry in both existing and new structures (Franzini et al. 2021). CLT is an emerging material for architecture students as they begin to care about the environment and contribute to the circular bioeconomy.
New product development opportunities
Durability was an important key word in the prediction of CLT search strength (Table 1). Durability is the resiliency of wood against the interaction of water with biological agents, although ultraviolet (UV) light is gaining traction as an important durability topic. In our discussions with wood preservative treatment companies, the use of wood preservatives in CLT was identified as a strong need. This was confirmed by Udele et al. (2021), who cautioned that wood preservatives with significant volatile organic compound emissions could be harmful to human health.
Degradation concerns are especially prevalent in the southeastern United States because of favorable moisture and temperature. For example, when using CLT in tropical locations, there is a concern that organisms such as fungi and insects (termites) can degrade the wood under favorable moisture and temperature conditions (Oliveira et al. 2018). To date, durability has perhaps not been appropriately addressed because CLT buildings have been concentrated primarily in Europe, Canada, and the US Pacific Northwest.
According to a Pearson correlation analysis, UV was highly correlated with durability (P <0.0001). Thus, there may be significant interest in UV coatings for the protection of CLT from outdoor environments. In US construction, CLT can be exposed to UV and moisture from rain before the roof is erected (Schmidt et al. 2019). However, UV is more of a long-term weathering event in which photo-oxidation occurs at the wood surface because of UV radiation. They point out that in the southern United States, UV coatings such as water repellents, paints, stains, or varnishes are common treatments for wood substrates. In our study, polymer films were found to be important (P = 0.014) to CLT and may be a new avenue for UV coating research.
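As a reminder of what the Pearson analysis measures, the sketch below computes the correlation coefficient r between two series and the associated t statistic, t = r * sqrt((n - 2) / (1 - r^2)), which is compared against a t distribution with n - 2 degrees of freedom to obtain a P value. The function names and the two series are hypothetical; they are not the UV and durability data from this study.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def correlation_t_statistic(r, n):
    """t statistic for testing r = 0; undefined when |r| equals 1."""
    return r * sqrt((n - 2) / (1.0 - r * r))


# Hypothetical search-strength series for two terms.
term_a = [1, 2, 3, 4, 5]
term_b = [2, 1, 4, 3, 5]
r = pearson_r(term_a, term_b)
t = correlation_t_statistic(r, len(term_a))
print(round(r, 3), round(t, 3))
```

A strong correlation between two search terms indicates that their interest rises and falls together; it does not by itself show that interest in one drives interest in the other.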
Moving into the future, nanotechnology will be important for UV coatings, adhesives, and general durability (P = 0.0018) in materials such as CLT. The polymer films that were important to CLT in this study (P = 0.0140) could be combined with nanotechnology for new product development. Often the philosophy with nanotechnology is to break down the material to the nanoscale to concentrate and elevate a particular property of interest within the coating, polymer, or composite. Commonly, an application of 0.5 to 5 percent by weight is used to enhance substrate properties without adding too much cost (Via and Peresin 2020). In the case of using nanocellulose for improved strength, the cost of the wood composite can actually be lowered even though the cost of nanocellulose may be much higher than that of pulp (Via and Peresin 2020).
Connection systems may be another development opportunity for CLT (P = 0.0043). Common connection properties include withdrawal, lateral nail resistance, and dowel bearing strength (Sinha and Avila 2014). In the Southeast, it is likely that we will need to test for connection reliability on a fairly new resource: juvenile wood. Loblolly pine (Pinus taeda) is the key plantation species grown in the Southeast and is part of the southern yellow pine (SYP) group that companies use within our region (Hindman and Golden 2020). However, SYP is harvested at a much younger age, which results in a higher microfibril angle, lower density, higher lignin, and lower cellulose (Essien et al. 2017). The combination of lower quality fiber morphology and wood polymer chemistry results in a substantial reduction in tissue stiffness and can lead to more vibration in forest products. Increased vibration and deflection values could limit the ability to reach desired spans when using CLT for housing and offices (Baño et al. 2016), unless the thickness of the CLT timbers overcomes vibration issues. Hindman and Golden (2020) point out that SYP must be examined for its acoustical response in order to be accepted by building codes. A similar concern was echoed by Azambuja et al. (2022), who pointed out that the nondestructive vibration technique used to estimate stiffness did not statistically provide an equivalent yield when compared with visual grading of yellow poplar used to make CLT.
Adhesives are another type of connection that may be important to CLT (P = 0.0208). In SYP, prevention of adhesive delamination will be an important consideration because of the high longitudinal shrinkage potential along the grain of juvenile wood lumber (Ying et al. 1994). Abnormal shrinkage or swelling in this plane could result in additional forces at the glueline, thus resulting in delamination failure. Although not attributable to juvenile wood, delamination in CLT has already occurred in the United States for Douglas-fir (Pseudotsuga menziesii; Riggio et al. 2019). Delamination of the adhesive bondline can also occur if elevated fire temperatures reach the bondline before the wood is able to self-insulate with char (Zelinka et al. 2019). They demonstrate that the best adhesives for this scenario would be melamine formaldehyde and phenol–resorcinol formaldehyde, which maintained wood failures at temperatures as high as 260°C. In this study, the key word fire was of interest to those searching for CLT (P = 0.0052). In the literature, CLT has been shown to outperform steel at higher temperatures (Asdrubali et al. 2017). In general, the most common adhesives used in CLT are polyurethanes, melamine formaldehyde, and phenol–resorcinol formaldehyde (Zelinka et al. 2019). Emulsion–polymer–isocyanate (EPI) adhesives are also allowed; these have superior moisture resistance, are fast-curing at room temperature, and exhibit low creep during long-term loading (Grøstad and Pedersen 2010). Creep was found to possibly be important to CLT search strength at the 90 percent level in this study.
Modeling during COVID-19
Understanding the impact of COVID-19 was not the intent of this study; however, COVID-19 emerged just as we were testing model validity. We therefore include this analysis, which may be of interest to readers.
During the short COVID-19 period covered in this study, prediction of CLT search strength in real time accounted for nearly one-half of the variance when only four variables were used: hybrid composite, acoustic, code, and lumber strength (Table 2). Similar to the long-term model, the hybrid composite term exhibited the greatest significance to the search term CLT.
An investigation of heavily searched Web pages during that COVID-19 time frame may yield some insight as to why these four variables were important. For example, a highly searched Web site revealed a blog with the opinion that lightweight wood can be used in conjunction with steel and concrete to make hybrid composites and improve building sustainability (Valipour 2020). Likewise, Waugh Thistleton Architects designed a building using steel columns and a cellular beam frame with CLT floors to make hybrid composites (Grimes 2020). Urech et al. (2020) posted a story demonstrating a timber-composite system with a concrete slab on top of a wooden timber element. They stress that the wood allows for additional tensile strength while reducing the weight of the structure. They were also targeting a lower carbon footprint, which helps with the life-cycle assessment (LCA) of the system. LCA was deemed important to CLT search strength in this study (P = 0.0269). Asdrubali et al. (2017) discuss how LCA helps to ensure that environmental, resource, and energy sustainability is maintained beyond some breakeven point. Compared with steel and concrete, Asdrubali et al. (2017) showed that wood performs better in several LCA outputs, including smog generation, ozone, global warming potential, and petroleum consumption.
During the COVID-19 outbreak, the key word acoustics rose in significance in comparison with the long-term model, suggesting a possible increase in societal interest in acoustics (Table 2). A look at Google Scholar during this time frame yielded many Web sites from architecture and wood construction manufacturing in which CLT was promoted as a way to improve sound control. Likewise, Peters and Daniels (2020) suggested that CLT could help to scatter sound in buildings. They propose carving geometric patterns in wood to scatter sound, which helps with the acoustic performance of rooms. Wood is a light material with a lower density than other materials; therefore, its sound insulation is not very good, but densifying the wood helps to better reflect sound and is used in music halls (Asdrubali et al. 2017).
During the pandemic, there was also heightened interest in the key word code as associated with the search term CLT (P = 0.0004). The increased interest in code is probably attributable to the anticipated update of the US International Building Code (IBC). One well-searched Web site during this time frame pointed out that there were 14 proposed code changes for the 2021 IBC revision (Hunt 2020), which would remove hurdles to the use of CLT and other mass timber products for tall buildings.
Literature support for the long-term model
The study by Thomas et al. (2020) used a Web crawler to search Web sites with the term CLT and then backtracked the frequency of other terms used in the same Web page. The idea was to use the frequency of these terms as an indication of their importance to CLT. Figure 3 demonstrates the terms from their study that were also investigated in our study, using either exactly the same term or a synonym of our key words. There was a significant linear relationship between the t values of our model and the frequency of key words associated with CLT in Thomas et al. (2020). In the instance of a synonym, we listed both terms, with ours first and the Thomas et al. term after the comma. It should be noted that our data analysis partially overlapped the period of data collection of Thomas et al., resulting in similar rankings of variables and their importance to CLT. However, when we ran a new model between April 2020 and December 2022 (model not shown), we found that the rankings and statistical significance changed. This suggests that correlations between key words and CLT shift with time, and thus the rankings in this study may not reflect future societal behavior. For future studies, we recommend further validation of modeling with Google Trends using our method, in an effort to disprove the hypothesis that we were just “data fitting” random patterns between variables. We were concerned about this possibility and therefore encourage future studies to continue to test this hypothesis.
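A cross-study comparison of this kind can be sketched as a simple least squares line relating one study's key word frequencies to the other study's t values, with R2 summarizing the agreement. The function name and all numbers below are hypothetical placeholders; they are not the values plotted in Figure 3.

```python
def fit_line(x, y):
    """Least squares slope, intercept, and R2 for y regressed on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - mean_y) ** 2 for b in y)
    return slope, intercept, 1.0 - ss_res / ss_tot


# Hypothetical inputs: key word frequency from a Web crawl (x) and the
# absolute t value of the matching term in a regression model (y).
crawl_frequency = [1, 2, 3, 4, 5]
model_t_value = [2, 1, 4, 3, 5]
slope, intercept, r2 = fit_line(crawl_frequency, model_t_value)
print(round(slope, 2), round(intercept, 2), round(r2, 2))
```

Refitting the same line on key words drawn from a later time window would be one way to examine whether the agreement persists as the rankings shift.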
Conclusions
This work tested the hypothesis that Google Trends represents the thoughts of the community and those underlying relationships that may provide insight as to what variables are important to CLT. It was found that multiple linear regression could be used to relate 20 variables to the search frequency of the term CLT (R2 = 0.76). Of all the variables, concrete, hybrid composites, architecture or design, construction, UV, and tall timber ranked the highest among users who also searched for CLT. Future work should help to continue to validate, or find limitations to, this new technique of using big data from the Internet for predictive purposes.
Contributor Notes
The authors are, respectively, Director and Regions Bank Professor, Forest Products Development Center, College of Forestry, Wildlife and Environ., Auburn Univ., Auburn, Alabama (brianvia@auburn.edu [corresponding author]); Assistant Professor, Univ. of Arkansas, Fayetteville, Arkansas (davidk@uark.edu); and Assoc. Professor, Forest Products Development Center, College of Forestry, Wildlife and Environ., Auburn University, Auburn, Alabama (soledad.peresin@auburn.edu). This paper was received for publication in September 2022. Article no. 22-00057.