Abstract
Composite sampling is standard practice for evaluation of preservative retention levels in preservative-treated wood. Current protocols provide an average retention value but no estimate of uncertainty. Here we describe a statistical method for calculating uncertainty estimates using the standard sampling regime with minimal additional chemical analysis. This tool can be used by wood treaters to generate lower prediction limits that, with a specified level of confidence, give a lower bound on the preservative retention value from a future analysis (e.g., by a customer), indicating whether the treated wood charge is likely to yield a below-target retention value.
Standard practice in commercial wood treatment operations is to evaluate the quality of treated wood by extracting and analyzing a number of wood cores after the treatment process is complete. For example, AWPA T1 (American Wood Protection Association [AWPA] 2012a), a commonly used standard in the industry, specifies wood species, preservative types, commodity details, and three additional factors:
- assay zone, the analysis zone that extends from the surface into the wood 5 to 100 mm, depending on the type of wood product (commodity) and treatment type; the assay zone is cut from the cores after they are removed from the treated wood;
- penetration, the distance from the surface of the wood that the chemical is present in the assay zone; and
- retention, the amount of chemical that is present in the assay zone.
The usual number of cores specified per charge (treatment batch) is 20, although the number can be greater for some commodities or for wood treated with creosote. Each core is evaluated separately (visually, often with the aid of a colorimetric indicator) for preservative penetration; however, the cores are normally pooled for a single preservative retention analysis. The individual wood cores are combined and milled, and this “composite” sample is analyzed for preservative components using specified chemical analysis techniques, for example, X-ray fluorescence (XRF) for heavy metals such as copper (AWPA 2012b).
The compositing of the small-diameter wood cores is convenient because it results in an ample amount of wood for the chemical retention analysis and because it provides an average value that can be compared with the minimum retention specifications listed in the relevant standard. However, as a single measurement, no information is provided about the variability in preservative retention that exists from core to core, and thus no insight is gained on within-charge treatment variation or, for example, the likelihood that another sampling of the charge would yield a value lower or higher than the specification. So, for example, if a charge has a true mean retention near the specification level, but has large underlying variation, then a future composite sample from that charge has a greater chance of being below the specification than if the charge has smaller underlying variation.
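The effect of within-charge variation on a future composite result can be illustrated with a short calculation. The sketch below assumes independent, normally distributed core retentions; the specification, charge mean, and core standard deviations are hypothetical values chosen for illustration and are not taken from the study data.

```python
from math import sqrt
from statistics import NormalDist


def prob_composite_below(spec, true_mean, core_sd, n_cores=20):
    """Probability that a composite of n_cores cores falls below spec.

    Assumes independent, normally distributed core retentions, so the
    composite (mean) value has standard deviation core_sd / sqrt(n_cores).
    """
    composite_sd = core_sd / sqrt(n_cores)
    return NormalDist(mu=true_mean, sigma=composite_sd).cdf(spec)


# Hypothetical charge: true mean slightly above the specification.
spec = 0.060       # hypothetical target retention, lb/ft^3
true_mean = 0.062  # hypothetical true charge mean, lb/ft^3

p_small = prob_composite_below(spec, true_mean, core_sd=0.010)
p_large = prob_composite_below(spec, true_mean, core_sd=0.020)
print(f"P(below spec), small variation: {p_small:.3f}")
print(f"P(below spec), large variation: {p_large:.3f}")
```

With the same true mean, the charge with the larger core-to-core standard deviation has the higher probability of a below-specification composite, which is the situation described above.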
Compositing is a common practice in fields where chemical analysis may be expensive or complex. A composite sample may be more convenient and cost-effective, especially if properly taken and analyzed correctly. In the food industry, compositing can be useful for the determination of nutritional information for food product labeling. The Food and Drug Administration has issued a manual on nutritional labeling that provides guidance and recommendations on the details of the practice for obtaining reliable nutritional data (US Food and Drug Administration 1998). Compositing is widely used in environmental studies and assessments, and its statistical advantages and disadvantages have been discussed in detail (Edland and van Belle 1994, Patil et al. 2011). Compositing may be used in bulk sampling as part of an acceptance sampling program for assessing quality of continuous materials, as discussed in Schilling and Neubauer (2009).
Use and interpretation of a composite assay for determination of a retention value for a charge of lumber has been a part of the AWPA standards since the 1960s and appears to have been adopted after its use for poles and piles in an effort to convert quality control standards to “result-type specifications” (AWPA 1957, Baechler 1960, Sherman 1961). Baechler et al. (1962) and Baechler (1962) reported on a study of the feasibility of adopting similar specifications for treated lumber; they discussed many of the factors that can influence the gradient of treatment retention and the relationships between crosscut zone sample measurements and boring measurements for several types of preservative treatments of southern pine and Douglas-fir lumber. These studies and subsequent American Wood-Preservers' Association committee discussions led to the 1966 adoption of a composite assay sample to determine whether the retention value of a charge was sufficient to meet the standard specification. At that time, “samples to be taken from not less than 20 pieces in a lot” became the recommendation in the standard. Since that time the composite assay value for a particular charge of lumber has been used to determine quality compliance both internally by treating plants and externally by third-party testing agencies (AWPA 1963, 1964, 1965, 1966).
More recently, Kleinknecht (1999) discussed in detail the different types of variability that can influence treated wood penetration and retention values and their impact on quality testing. Lebow and Conklin (2012) further discussed the statistical aspects of compositing on interpreting wood retention assay values. The use and interpretation of a composite retention value for understanding charge quality within the AWPA standards has seen recent activity (AWPA 2015). Although additional effort would be required, creating multiple composite samples from a single charge would provide a measure of variability. Statistical analysis of these samples could then provide a greater level of confidence that the true retention of a charge meets the specification, which may in turn reduce the number of samples needed for subsequent inspections of the charge.
Statistical techniques exist that can estimate variability of a population based on the values of a few composite subsamples (Edland and van Belle 1994, Patil et al. 2011). The purpose of this study was to develop a statistical tool for wood-treating plant operators that would allow them to estimate treatment retention variability within a charge and, with minimal extra work, construct a lower bound prediction interval for a future observed composite mean from that charge. Specifically, by separately pooling, grinding, and analyzing a few groups from the normal 20 cores, they could determine a lower prediction limit for a future analysis of the charge in addition to the usual composite average sample value. Treatment standards do not specify an upper limit on retention, but overtreatment is generally avoided because it increases costs and does not necessarily result in a commensurate increase in durability.
Hahn and Meeker (1991) discuss different types of statistical intervals, underlying assumptions, calculation, and interpretation of statistical intervals for practical questions. They discuss how prediction intervals are constructed to predict future observations and statistics from a population (e.g., a mean) with a certain level of confidence. They further describe confidence intervals for describing population parameters and tolerance intervals that describe other population characteristics. The underlying distribution of a population is a key assumption that impacts the performance of the intervals. Gibbons (1994) discusses goodness-of-fit tests and gives further details on prediction intervals for noncomposited measurements, including nonnormal populations. The Shapiro-Wilk goodness-of-fit test for normality is a formal statistical test whose rejection indicates that the normal distribution may not be a suitable distribution in describing a sample, although it does tend to reject normality if sample size is too high. Kolmogorov-Smirnov is a general goodness-of-fit test that looks at the distance between an empirical distribution function based on a sample and a hypothetical distribution function. It is not as powerful as other tests, and larger sample sizes are recommended. The Anderson-Darling test is a refinement of the Kolmogorov-Smirnov test that gives more weight to the tails of the distribution and is considered more powerful than the Kolmogorov-Smirnov test. For detailed descriptions of goodness-of-fit tests, see National Institute of Standards and Technology (NIST 2012), sections 1.3.5.14, 1.3.5.16, and 7.2.1.3.
A further purpose of this study was to evaluate the proposed statistical procedure for assessing treatment quality using retention data from commercially treated lumber.
Materials and Methods
Analysis of commercially treated wood samples
Thirty pressure-treated, 8-foot (2.44-m)-long nominal 2 by 4s from the Southern Pine species group of the southeastern United States (primarily Pinus taeda, loblolly pine) were purchased from each of two different retailers in Knoxville, Tennessee, in the summer of 2012 (60 total boards). Each group was chosen from a single stack of lumber, with the assumption that the lumber came from a single treatment charge. Two different preservative treatments were sampled: soluble copper azole (listed by AWPA as CA-C; AWPA 2012c) and micronized copper azole, with specified retentions shown on the end tags of 0.06 and 0.06 lb/ft³ (0.96 and 0.96 kg/m³), respectively. Copper azole is a common wood preservative system that contains mostly copper, either in solution or as a suspension of small particles in the case of “micronized” copper azole.
A 3-cm-thick cross section was cut from the center of each piece of lumber. The cross section was sprayed with Chrome Azurol S solution, which reacts with copper to give a blue color and allows for a visual estimate of preservative penetration (AWPA 2012d).
A 25-mm-long (parallel to the grain or length of the board) by 25-mm-wide by 15-mm-deep section was cut from the narrow face (“the edge”) of each lumber sample for preservative retention analysis. This was adjacent to the cross section analyzed for preservative penetration. Each sample was separately milled to pass through a 30 mesh screen in a Wiley mill and analyzed for copper retention using XRF (AWPA 2012b). The individual sections provided enough sample to run the standard retention analysis on a single sample. This assumes that a single 1-in2 section represented retention in the same way as a single, smaller core does. It is recognized that this assumption may not be precisely true, but it was required because increment cores were too small to analyze individually, and it allowed the statistical evaluation of the worksheet tool.
Statistical analysis of wood samples
Descriptive statistics, including mean, median, standard deviation, and coefficient of variation, and exploratory distributional plots (box-and-whisker plots and normal probability plots) were calculated on the individual samples for each treatment group, and probability values (P values) of univariate goodness-of-fit tests for normality were evaluated (Shapiro-Wilk, Anderson-Darling, Kolmogorov-Smirnov; Table 1). The corrected Akaike information criterion (AICc) is a small-sample, information-based measure commonly used for model selection, including comparison of hypothesized probability distributions (Burnham and Anderson 2002). Among the hypothesized distributions, the one with the lowest AICc is taken as the best fit; AICc values for the candidate distributions are compared in Table 2. The wood sample data were then used for testing and evaluating the proposed statistical tool.
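As an illustration of how AICc can be used to compare candidate distributions, the following sketch fits normal and lognormal distributions by maximum likelihood and computes AICc = 2p − 2 ln L + 2p(p + 1)/(N − p − 1), where p is the number of fitted parameters and N is the number of observations. The retention values are hypothetical and the function names are illustrative; this is not the software used in the study.

```python
from math import log, pi


def aicc(log_likelihood, n_params, n_obs):
    """Corrected AIC: AIC plus a small-sample penalty term."""
    aic = 2 * n_params - 2 * log_likelihood
    return aic + (2 * n_params * (n_params + 1)) / (n_obs - n_params - 1)


def normal_loglik(data):
    """Maximized normal log-likelihood (MLE mean and variance)."""
    n = len(data)
    mu = sum(data) / n
    var = sum((x - mu) ** 2 for x in data) / n  # MLE uses divisor n
    return -0.5 * n * (log(2 * pi * var) + 1)


def lognormal_loglik(data):
    """Maximized lognormal log-likelihood (normal fit to log values)."""
    logs = [log(x) for x in data]
    # Jacobian term for the log transform: subtract the sum of log(x)
    return normal_loglik(logs) - sum(logs)


# Hypothetical retention values (lb/ft^3), including one high value.
retention = [0.052, 0.055, 0.058, 0.060, 0.061,
             0.063, 0.064, 0.067, 0.070, 0.095]

n = len(retention)
aicc_normal = aicc(normal_loglik(retention), n_params=2, n_obs=n)
aicc_lognormal = aicc(lognormal_loglik(retention), n_params=2, n_obs=n)
print(f"AICc normal:    {aicc_normal:.2f}")
print(f"AICc lognormal: {aicc_lognormal:.2f}")
```

The distribution with the lower AICc is preferred among the candidates considered, although a low AICc relative to the alternatives does not by itself show that the fit is adequate.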
Statistical tool
When sampling treated wood retention levels according to the AWPA standard (AWPA 2012a), 20 cores are composited (m = 20) to produce one retention value (Ŷm), which is often reported on charge reports. These same cores can be grouped into n separate groups each of k cores, such that m = n × k, with only a small amount of extra handling to ensure there is sufficient material in each group so as to not introduce machine measurement error (such as, four groups of five cores each, or five groups of four cores each). Each of the n groups would be milled and analyzed separately to give retention values, Y1, . . . , Yn. The average of Y1, . . . , Yn would be identical to Ŷm and could be reported as usual.
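However, additional information is provided by separately analyzing n groups of k samples. The standard deviation of the group values, s, can be calculated as the square root of the variance
$$ s^2 = \frac{1}{n-1} \sum_{i=1}^{n} \left( Y_i - \hat{Y}_m \right)^2 $$
From this, under the assumption of normally distributed retention values, a prediction lower limit for a future sampling of m cores can be calculated as
$$ P_x = \hat{Y}_m - t_{1-\alpha,\,n-1} \; s \sqrt{\frac{1}{n} + \frac{k}{m}} $$
where x = 100(1 − α) percent and $t_{1-\alpha,\,n-1}$ is the (1 − α) quantile of Student's t distribution with n − 1 degrees of freedom.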
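A minimal sketch of this calculation is given here. It assumes normally distributed retention values and uses a standard normal-theory form for the one-sided limit, namely the mean of the n composite values minus t × s × sqrt(1/n + k/m), where s is the standard deviation of the composite values and t is the one-sided Student's t quantile with n − 1 degrees of freedom. That form, the function name, and the retention values are assumptions made for illustration and are not taken from the authors' spreadsheet.

```python
from math import sqrt
from statistics import mean, stdev

# One-sided 95% Student's t quantiles t_{0.95, df} from standard tables.
T_95 = {2: 2.920, 3: 2.353, 4: 2.132, 5: 2.015, 9: 1.833}


def lower_prediction_limit(composites, k, m, t_crit):
    """Lower prediction limit for a future composite of m cores.

    composites: retention values of n composite groups, each of k cores
    t_crit:     one-sided t quantile with n - 1 degrees of freedom
    """
    n = len(composites)
    y_bar = mean(composites)
    s = stdev(composites)  # sample standard deviation (divisor n - 1)
    return y_bar - t_crit * s * sqrt(1.0 / n + k / m)


# Hypothetical charge: 20 cores split into four groups of five.
composites = [0.058, 0.064, 0.061, 0.067]  # lb/ft^3, illustrative only
limit = lower_prediction_limit(composites, k=5, m=20, t_crit=T_95[3])
print(f"Composite average:           {mean(composites):.4f}")
print(f"95% lower prediction limit:  {limit:.4f}")
```

A treater would compare the printed limit with the specified retention: a limit below the specification indicates that a future composite from the same charge could plausibly fall below target even though the current average passes.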
A lower confidence bound's coverage is determined as the probability that the bounded interval contains the true parameter of interest. So a nominal coverage of 0.95 means that 95 percent of lower confidence bounds will actually contain the true parameter. Simulations were used to illustrate and estimate coverage probabilities. Data from the treated wood samples were randomly sampled (with replacement) to provide n groups of k samples, such that m = n × k = 20, as described above. A 95 percent prediction lower limit was calculated, random samples of 20 values from the original data were then taken and averaged to provide Ŷ, and this process was repeated 10,000 times. This allowed the evaluation of the method in the context of distributions with the exact characteristics of our observed samples. The tool (method) was further evaluated based on simulations of samples from normal theoretical distributions with the estimated sample characteristics as the normal population parameters. Actual coverage probabilities from the simulations are given in Table 3. Example R-code is available from the authors.
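The normal-theory part of this simulation can be sketched as follows. The sketch assumes a hypothetical normal population (mean 0.060, standard deviation 0.015) rather than the observed retention data, assumes a lower limit of the form mean − t × s × sqrt(1/n + k/m), and uses the tabulated one-sided 95 percent t quantile for 3 degrees of freedom (2.353). It is an illustration only, not the authors' R code.

```python
import random
from math import sqrt
from statistics import fmean, stdev


def simulate_coverage(mu, sigma, n=4, k=5, m=20,
                      t_crit=2.353, reps=10_000, seed=1):
    """Fraction of simulated future composites at or above the lower limit."""
    rng = random.Random(seed)
    covered = 0
    for _ in range(reps):
        # Each composite value is the mean of k simulated core retentions.
        composites = [fmean([rng.gauss(mu, sigma) for _ in range(k)])
                      for _ in range(n)]
        limit = (fmean(composites)
                 - t_crit * stdev(composites) * sqrt(1.0 / n + k / m))
        # Future composite: mean of m newly drawn cores from the same charge.
        future = fmean([rng.gauss(mu, sigma) for _ in range(m)])
        covered += future >= limit
    return covered / reps


# Hypothetical population parameters (lb/ft^3), for illustration only.
coverage = simulate_coverage(mu=0.060, sigma=0.015)
print(f"Estimated coverage: {coverage:.3f}")
```

Under these assumptions the estimated coverage should be close to the nominal 0.95; resampling from skewed or outlier-prone data, as in the first part of the simulation, can move the actual coverage away from the nominal value.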
The spreadsheet tool does allow for some flexibility for situations where the testing may be on a different number of cores than is expected to be used in the future composite (i.e., k × n ≠ m). In this case, the formulas are the same. The spreadsheet is available from the authors.
Results and Discussion
Copper penetration in the treated lumber was generally good for the lumber treated with copper azole and micronized copper azole, except in areas where impermeable heartwood or knots were close to the surface (Fig. 1). AWPA Standard T1 (AWPA 2012a) requires penetration for southern pine 2 by 4 lumber to be a minimum of 63 mm (2.5 in.) or 85 percent of the sapwood depth in at least 80 percent of the pieces sampled. Penetration of heartwood is not required.
Copper retention was close to the specified value on average but variable (Fig. 2; Table 1). With the exception of micronized copper, these data suggest normal distributions are reasonable assumptions for preservative retention (Fig. 3; Tables 1 and 2). Table 2 provides further information on the ability of other statistical distributions to characterize the retentions as measured by AICc. None of the normal, lognormal, Weibull, or gamma distributions provided a good fit for the micronized copper group (Anderson-Darling P values < 0.01), although based on the lowest AICc, the lognormal distribution described this group best among the candidates when the outlier was included. Removal of an outlier in the micronized copper group indicated that normality would otherwise be a reasonable assumption for this group. The lognormal distribution did not provide a good fit for the other data. If an outlier is suspected and the cause for its difference from other observations cannot be determined, it is generally recommended to include the observation but to evaluate procedures both with and without the outlier to determine the implications of its inclusion or exclusion. See NIST (2012), section 1.3.5.17, “Detection of Outliers,” for a detailed discussion on outliers.
In theory, under the assumption of normality and because samples are not exhaustive, an average value that is equal to the target value means that half the samples (cores, in the case of treated wood sampling) would typically be below the target, but because the standard is based on the average value for the composite sample, this variability is not of concern directly to the treater; a composite sample value at or above the specification would “pass.” However, theory also suggests that a random resampling (e.g., by a customer) of the same charge will, by chance, produce a value below the target some of the time because of variation among the individual poles or lumber pieces within the charge and the limited sampling process. This suggests that looking at a lower bound of a prediction interval for a single future composite value, as given above by Px, which with some degree of confidence will contain the future composite value, would be of interest to a treater.
We developed a spreadsheet that automates the calculation of this lower bound of prediction when cores have been composited in groups by the treater (see Fig. 4 for an example screen shot). For example, a certain number of cores (20 in this case; cell C1) are composited into four separate groups (cell C2), which are then analyzed separately (e.g., if the cores were obtained randomly, then cores 1 to 5 go into one sample, cores 6 to 10 go into another sample, cores 11 to 15 go into another sample, and cores 16 to 20 go into the last sample). The user enters the retention values for each composited sample (cells C3 to C6). Background calculations then provide the average retention values and the calculated lower prediction bounds. The total number of samples (cores) taken and the number of composite groups used can be adjusted by the user as long as the appropriate values are entered in the spreadsheet.
Random sampling of 20 retention data values from our observed assay samples produced a composite average value below the lower prediction bound only 6, 2, and 4 percent of the time for copper azole, micronized copper azole, and micronized copper azole (without the outlier), respectively (Table 3). If normal distributions are assumed, then the intervals calculated by the statistical tool had the proper coverage of 95 percent for the treated wood sampled.
Conclusions
Statistical tools for composite sampling can be used to assist treated wood producers in estimating within-charge variability in preservative retention. This tool can help treaters to evaluate the risk that a customer's sampling of a treated wood batch would yield a retention value that is below the required level. The tool can be adapted to variations in sampling intensity, and the use of the tool requires minimal extra analytical work.
Contributor Notes
The authors are, respectively, Mathematical Statistician, USDA Forest Serv., Forest Products Lab., Madison, Wisconsin (plebow@fs.fed.us [corresponding author]); Associate Professor and Professor, Univ. of Tennessee, Knoxville (mtaylo29@utk.edu, tmyoung1@utk.edu). This paper was received for publication in September 2014. Article no. 14‐00092.