Joint Statistical Meetings - Section on Physical & Engineering Sciences (SPES)

MODELING PERFORMANCE OF ENGINEERING CONTROLS WHEN REDUCTIONS ARE LARGEST AT THE HIGHEST ENVIRONMENTAL CONCENTRATIONS OF THE HAZARDOUS CONTAMINANT

Stanley A. Shulman, Kenneth R. Mead, and R Leroy Mickelsen
National Inst. for Occupational Safety and Health, 4676 Columbia Parkway, MS-R3, Cincinnati, OH 45226

Key Words: Multiplicative Interaction, Regression on Column Sums, Lognormal Environmental Data

Abstract

Engineering controls are often compared outdoors using randomized pairs to determine if a new control design reduces worker exposure to airborne hazardous contaminants when compared to an uncontrolled work environment. A common occurrence is that the ratio of means of controlled to uncontrolled work environments depends upon concentration. When the uncontrolled environment is at its highest concentrations, the engineering control has greatest impact and the ratio is smallest (reduction is largest). Such interaction may arise outdoors because of wind and other natural conditions that vary contaminant concentrations. The following approaches to model this interaction are compared (all on natural log scale): 1) model estimates reduction separately for upper 25% of uncontrolled samples and for lower 75%; 2) model is based on Tukey's one degree of freedom model for interaction, equivalent to regression of control differences on pair means; 3) model regresses control differences on control-off values. Results based on lognormality indicate that for many situations model 3) may better describe the interaction than model 2). Since model 3) has greater power than model 1), model 3) may be preferable.

1) Introduction

The data considered are outdoor data, for which the comparison of interest is an uncontrolled to a controlled environment. A common result is that the reduction in contaminant is greatest when the uncontrolled environment (referred to as "control-off") is highest. This may make sense, in that the highest control-off measurements may occur when environmental factors such as wind have the least impact. The example data here are air measurements of organic compounds on a highway paving machine. The control ventilation system draws contaminant from the source at the auger, and discharges it out of the workers' breathing zones through a vertical stack above the paver and the paver driver.

[Photograph: Exhaust Stack for Control]

2) Example Data

Since the control could be switched on or off, changing between the two control settings was very simple. In these studies data were collected in randomized pairs of (control-off, control-on), each trial in each pair lasting at least one and a half minutes. The pairs were collected over five days of sampling. The data consisted of organic compound concentrations averaged over four second intervals. Medians were computed for each trial because medians are much less correlated than individual readings. Deletion of data collected near transitions in paving status also reduced correlation. (For example, half a minute of data was deleted in trials which were either preceded or succeeded by a period of no paving that lasted for at least 25 seconds.)

The measure of control effectiveness used is the fraction reduction of airborne organic compound concentrations, defined as:

$$\text{Fraction Reduction} = 1 - \frac{\text{control-on median}}{\text{control-off median}}. \qquad (1)$$

[Figure 1: Fraction Reduction in Airborne Organic Concentration. Vertical axis: fraction reduction. Horizontal axis: Ln(Control-Off), original concentration in parts per million (ppm). Pairs are labeled "Upper 25% Control-Off" and "Lower 75% Control-Off".]
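A minimal sketch of eq. (1), assuming hypothetical four-second readings for a single (control-off, control-on) pair. The function name and the numbers are illustrative only and are not taken from the study. The sketch also shows why the log scale is convenient: the difference of the log medians determines the fraction reduction.

```python
import math
from statistics import median


def fraction_reduction(control_on_readings, control_off_readings):
    """Eq. (1): 1 - (control-on median) / (control-off median) for one pair."""
    on_median = median(control_on_readings)
    off_median = median(control_off_readings)
    return 1.0 - on_median / off_median


# Hypothetical four-second-average concentrations (ppm) for one pair of trials.
off_trial = [30.0, 42.0, 36.0, 40.0, 38.0]  # control-off trial
on_trial = [14.0, 19.0, 17.0, 21.0, 16.0]   # control-on trial

fr = fraction_reduction(on_trial, off_trial)

# Log-scale difference of the trial medians, the response used by the models.
log_diff = math.log(median(on_trial)) - math.log(median(off_trial))

print(f"fraction reduction = {fr:.3f}")
print(f"Ln(on) - Ln(off)   = {log_diff:.3f}")
print(f"1 - exp(log diff)  = {1.0 - math.exp(log_diff):.3f}")
```

Because the fraction reduction equals one minus the exponential of the log difference, a more negative log difference corresponds to a larger reduction.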

Thus, the smaller the ratio of control-on median to control-off median, the closer the fraction reduction is to 1, the maximum value.

For the example data the estimated fraction reductions are plotted in Figure 1. It is clear from the figure that there is increasing reduction with increasing control-off values. For all models presented, the data are transformed to the natural log scale, since that scale is convenient for ratios, as are needed for estimating the fraction reduction.

3) Model 1: Upper 25% Model

The data are divided into two groups, based on the values of the control-off median in the pair. Those pairs with the control-off value in the upper 25% of all control-off values are in one group, and the remaining pairs in a second group. The two groups are labeled in Figure 1, where it can be seen that for the upper 25% pairs, the fraction reduction was about 0.68, compared to about 0.44 for the lower 75% group. Even though the overall average of about 0.50 is statistically significant at the 5% level, this difference in fraction reduction between the two groups is large. In other situations, when the overall average is not statistically significant, the reduction for the upper 25% pairs may be. A weakness of this approach is that the division into two groups is based on a somewhat arbitrary dividing point: 25%. The conclusions can differ if a different point is used.

4) Model 2: Regression on Pair Means

The trend seen in Figure 1 of increasing fraction reduction with increasing control-off values is a form of interaction. The model proposed by Tukey for multiplicative interaction in a two-way design allows for the dependence of method differences on the pair means. This can be shown as follows. In the model of Tukey (1), the control-pair interaction is a multiplier of a factor for the pair mean and a factor for the control type mean:

$$\mathrm{Ln}(y_{p,con}) = \mu + \beta_p + \alpha_{con} + G\,\beta_p\,\alpha_{con} + e_{p,con}, \quad p = 1, 2, \ldots, P, \qquad (2)$$

where p = pair, con = control setting ("c" = control-on, "nc" = control-off), $\beta_p$ = [(mean for pair p) $- \mu$], $\alpha_{con}$ = [(mean for control setting con) $- \mu$], $\alpha_c = -\alpha_{nc}$, $e_{p,con} \sim N(0, \sigma^2)$, and $\sum_p \beta_p = 0$. Estimates for the parameters are obtained by substituting the corresponding means from the data. The estimate for G is

$$G = \sum_{con,p} \alpha_{con}\,\beta_p\, y_{p,con} \Big/ \Big[\sum_{con} \alpha_{con}^2 \sum_p \beta_p^2\Big],$$

where the sample estimates replace the parameter values for $\alpha_{con}$ and $\beta_p$ (here y denotes the log-transformed measurement).

[Figure 2: Ln(Con-on) - Ln(Con-off) vs 0.5[Ln(Con-on) + Ln(Con-off)]; actual data and fitted line.]

The model given in eq. (2) is appropriate for any two-way design. For a design for which the control factor has just two levels, the estimate for G may be written in a simpler form:

$$G = \sum_p \beta_p\,(y_{p,c} - y_{p,nc}) \Big/ \Big[2\,\alpha_c \sum_p \beta_p^2\Big], \qquad (3)$$

with sample estimates in place of $\alpha_c$ and $\beta_p$.

Taking differences under eq. (2), as in references (2,3), we obtain:

$$\mathrm{Ln}(y_{p,c}) - \mathrm{Ln}(y_{p,nc}) = 2\alpha_c + 2\alpha_c\,G\,\beta_p + (e_{p,c} - e_{p,nc}). \qquad (4)$$

Thus, differences are linear functions of pair means. Since the estimate of G is produced by substitution of the sample values for the parameters in (3), and since $2\alpha_c G = \sum_p \beta_p (y_{p,c} - y_{p,nc}) / \sum_p \beta_p^2$, the estimate of the slope from the linear regression (using the sample values in place of the $\beta_p$ values) can be used to obtain the value of G. The regression slope can be tested to determine whether the value of G is zero. This test can be used, even though the estimate of G may be biased, since the estimates of the $\beta_p$ values differ from the true values because of measurement error (4). For the example data, the differences of the log-transformed data (control-on - control-off) are shown in Figure 2, in addition to the fitted line. The multiplicative interaction is statistically significant at the 5% level.

5) Model 3: Regression on Control-off

Figure 1 suggests the following modification of eq. (4). Since the fraction reduction appears to be dependent on the control-off values, it seems sensible to regress the log scale differences on the log of the true control-off values:

$$\mathrm{Ln}[y_{p,c}] - \mathrm{Ln}[y_{p,nc}] = \gamma + \delta\,\mu_{p,nc} + f_p, \qquad (5)$$

where $\mu_{p,nc}$ = mean of control-off for pair p, on log scale, $\gamma$ = line's intercept, $\delta$ = slope, and $f_p \sim N(0, \sigma_f^2)$.

In eq. (5) measurement error is assumed small compared to environmental variability, in which case the model may be fitted by linear regression by replacing $\mu_{p,nc}$ by $\mathrm{Ln}[y_{p,nc}]$ (4). If $\delta < 0$, $[\mathrm{Ln}(y_{p,c}) - \mathrm{Ln}(y_{p,nc})]$ decreases with increasing control-off. ($y_{p,c}/y_{p,nc}$ decreases.)

For the example data, shown with the fitted line in Figure 3, the estimated slope is significant at the 5% level.

[Figure 3: Ln(Con-on) - Ln(Con-off) vs Ln(Con-off); actual data and fitted line.]

6) Why Increasing Control Effectiveness with Increasing Control-off Values?

The previous sections have focused on recognition and statistical modeling of the phenomenon. Here we consider possible reasons for its occurrence. Two possibilities to consider are:

a) Environmental data are often lognormal (5). The consequences should be examined. Greater reductions at higher control-off values may relate to lognormality. Since lognormality is associated with environmental variability, the higher control-off values could correspond to less environmental control, and statistical results which show greater reduction at higher control-off values can be interpreted to mean that reduction due to control is greatest when environmental control is least.

b) What characteristics of the measurement process itself could lead to this phenomenon? One possibility is the presence of background levels, which are difficult to estimate. The idea is that when the control-off values are high, they are far above background, and the background would, therefore, have little effect on the denominator for the ratio of control-on to control-off.

Only a) will be examined here; b) will be investigated in a future study.

7) Lognormality as an Explanation of the Phenomenon

For the example data, since the pairs were collected over five days, and since there was substantial day to day variation, lognormality of the data was assessed by examination of the residuals from random effect models fitted separately to the two control types. Both control-on and control-off distributions appear to be lognormal.

Under bivariate lognormality, $[\mathrm{Ln}(y_{\text{con-on}}) - \mathrm{Ln}(y_{\text{con-off}})]$ can be written as a linear regression on $[a\,\mathrm{Ln}(y_{\text{con-on}}) + (1-a)\,\mathrm{Ln}(y_{\text{con-off}})]$ for the variable "a" in the interval [0, 1]. Interest here is when a = 0.5 (regression on pair means) or a = 0 (regression on control-off). In what follows, all means $\mu$ and standard deviations $\sigma$ are on the log scale, and $\rho$ = correlation$[\mathrm{Ln}(y_{\text{con-on}}), \mathrm{Ln}(y_{\text{con-off}})]$.

For the regression on pair means model the slope is proportional to $(\sigma^2_{\text{con-on}} - \sigma^2_{\text{con-off}})$. If $\sigma^2_{\text{con-on}} < \sigma^2_{\text{con-off}}$, then slope < 0. The exact form of the regression, obtained by transformation of variables, is:

$$\mathrm{Ln}(y_{\text{con-on}}) - \mathrm{Ln}(y_{\text{con-off}}) = (\mu_{\text{con-on}} - \mu_{\text{con-off}}) + \xi\,(0.5)\,[\mathrm{Ln}(y_{\text{con-on}}) + \mathrm{Ln}(y_{\text{con-off}}) - (\mu_{\text{con-on}} + \mu_{\text{con-off}})] + \pi\,h, \qquad (6)$$

where h is distributed as normal(mean=0, variance=1), and where

$$\xi = 2\,(\sigma^2_{\text{con-on}} - \sigma^2_{\text{con-off}}) / \sigma_t^2,$$
$$\pi = \{[1 - (\sigma^2_{\text{con-on}} - \sigma^2_{\text{con-off}})^2 / (\sigma_d^2\,\sigma_t^2)]\,\sigma_d^2\}^{0.5},$$
$$\sigma_d^2 = \sigma^2_{\text{con-on}} + \sigma^2_{\text{con-off}} - 2\rho\,\sigma_{\text{con-on}}\,\sigma_{\text{con-off}},$$
$$\sigma_t^2 = \sigma^2_{\text{con-on}} + \sigma^2_{\text{con-off}} + 2\rho\,\sigma_{\text{con-on}}\,\sigma_{\text{con-off}}.$$

For the regression on control-off the slope is $\nu = [\rho\,\sigma_{\text{con-on}} / \sigma_{\text{con-off}} - 1]$. If $\rho\,\sigma_{\text{con-on}} < \sigma_{\text{con-off}}$, then the slope < 0. The exact form of the regression, obtained by transformation of variables, is:

$$\mathrm{Ln}(y_{\text{con-on}}) - \mathrm{Ln}(y_{\text{con-off}}) = (\mu_{\text{con-on}} - \mu_{\text{con-off}}) + \nu\,[\mathrm{Ln}(y_{\text{con-off}}) - \mu_{\text{con-off}}] + (1 - \rho^2)^{0.5}\,\sigma_{\text{con-on}}\,d, \qquad (7)$$

where d is distributed as normal(mean=0, variance=1).

For the example data, the slope estimated from the data under the regression on pair means model was -0.6, and under the regression on control-off model was -0.5. The corresponding estimates based on the lognormal distribution were within 0.05 of the above estimates, when sample values were substituted for parameter values.

8) Which Model to Prefer?

One way to compare the two models is by a comparison of the correlations between the dependent and independent variables. These correlations are plotted in Figure 4 as functions of the correlation $\rho$ of the data on the natural log scale, and as functions of the ratio of log scale standard deviation of control-on to control-off. These calculated correlations do not require lognormality. However, the importance of the lognormality in the discussion is the linearity associated with it (6), and the higher the correlation the greater the tendency of the data to lie on a straight line.

[Figure 4: Correlation between [Ln(on) - Ln(off)] and {Ln(off) or 0.5[Ln(on) + Ln(off)]} vs ratio [Std. Dev. Ln(on) / Std. Dev. Ln(off)], when correlation [Ln(On), Ln(Off)] = 0.75. Two curves: Ln(Con-off) and 0.5[Ln(Con-on) + Ln(Con-off)]. Horizontal axis: Std. Dev. [Ln(Con-on)] / Std. Dev. [Ln(Con-off)], from 0.5 to 1.5. Vertical axis: correlation.]

The data plotted in Figure 4 are for the case that $\rho$ is 0.75. Similar figures result for correlations = 0.5 or 0.9. For the standard deviation ratio (on/off) $\le 1/\rho$, the regression on control-off model has more negative correlation than the regression on pair means model. Since for the upper 25% phenomenon we expect to see negative correlation, this indicates that the regression on control-off model has greater tendency to lie on a negatively sloped line than the regression on pair means model.

Other comments on Figure 4 are that for control ratio near 0.5, there is little difference in correlations. As the ratio increases, the difference between correlations increases.

To compare the power of the three models a limited simulation study was carried out, as shown in Table 1. Example 1 (Ex. 1) in the table refers to the example studied here, for which the ratio of standard deviations was about 0.6, and $\rho$ = 0.85. Example 2 (Ex. 2) refers to a second example, not shown here, in which the ratio of standard deviations is approximately 1, and $\rho$ = 0.8.

Table 1: Design of Simulation Study
Lognormal Simulations: 160 Samples of Size 25

Parameter Values Used:
a) sample values, Ex. 1: $\sigma_{\text{con-on}} \sim 0.36$, $\sigma_{\text{con-off}} \sim 0.63$, $\rho \sim 0.85$
b) sample values, Ex. 2: $\sigma_{\text{con-on}} \sim \sigma_{\text{con-off}} \sim 1.5$, $\rho \sim 0.8$
c) $\sigma_{\text{con-on}} \sim \sigma_{\text{con-off}}$, using $\sigma_{\text{con-on}}$ from Ex. 1: $\sigma_{\text{con-on}} \sim 0.36$, $\rho \sim 0.85$
d) In Ex. 2, $\sigma_{\text{con-off}}$ modified so that $(\rho\,\sigma_{\text{con-on}} - \sigma_{\text{con-off}}) \sim 0$, $\rho \sim 0.8$

Response Variables for a-d: fraction of samples for which:
i) upper 25% model had significantly different result from lower 75% control-off pairs
ii) regression on pair means model yielded statistically significant slope (eq. (4))
iii) regression on control-off model yielded statistically significant slope (eq. (5))

Table 2: Power Calculations from Simulations
Fractions of Samples that Yielded Statistically Significant Results at 5% Level

Parameter Value     i) Upper 25%    ii) Regression on     iii) Regression on
Settings, from      Model           Pair Means Model      Control-Off Model
Table 1
a                   0.99            1                     1
b                   0.20            0.01                  0.4
c                   0.2             0.06                  0.3
d                   0.04            0.6                   0.07

The simulation results shown in Table 2 confirm that the two regression models can yield quite different results for statistical significance of their slopes. Setting a) yields significant results for all three models, as the example data do. Setting b) indicates the effect of standard deviation ratio ~ 1, and also the relatively low power for the regression on control-off model, since $\rho$ is large, and, therefore, $(\rho\,\sigma_{\text{con-on}} - \sigma_{\text{con-off}})$ is not so different from 0. Setting c) is a redo of setting a), which shows that when the standard deviation ratio is close to 1, and the standard deviations are much smaller than in b), the power continues to be low for regression on control-off. Setting d) is a redo of setting b). Since the correlation is high, the control-off standard deviation must be reduced sufficiently so that the regression on pair means model has considerable power.

Also, the fraction yielding a statistically significant result for regression on control-off always exceeds that for the upper 25% model, suggesting that the regression model has greater power than the upper 25% model. This result holds for the different variance relationships used. This is a result that would be expected, since the regression on control-off model uses all the data to fit a single line.

9) Conclusions and Recommendations

It is common in outdoor studies of engineering controls that the reduction in concentration due to the control is highest when uncontrolled measurements are highest. Results presented suggest this may be due to the lognormal distribution of airborne contaminant data. Higher control-off values can correspond to less environmental control; results showing greater reduction at these values can mean reduction due to control is greatest when environmental control is least. Another possible explanation is the effect of background concentrations, which will be investigated in future research.

Three models were compared: the upper 25% model, the regression on pair means model, and the regression on control-off model. For standard deviation ratio (control-on / control-off) $\le$ 1/{correlation$[\mathrm{Ln}(y_{\text{con-on}}), \mathrm{Ln}(y_{\text{con-off}})]$}, the regression on control-off model yields greater negative correlation than the regression on pair means model. Also, the upper 25% model does not have as much power as the regression on control-off model. Thus, the regression on control-off model can give a better estimate of how well the control system is functioning.

10) Acknowledgments

The authors wish to thank James Deddens, Michael Gressel, Edward Krieg, Jr., and Paul Schlecht of NIOSH for their helpful reviews of this document. The recognition of the upper 25% phenomenon is based on work of former NIOSH employees Dennis O'Brien and Thomas Fischbach, and we appreciate Dennis O'Brien's suggestion that we consider its applicability to the asphalt data. The authors appreciate helpful comments by Steve Simon of Children's Mercy Hospital, Kansas City, and Michael Butterworth of CBS, both of whom viewed the poster session at the Joint Statistical Meetings.

11) References

1) Scheffe, H. The Analysis of Variance, Wiley, 1959, p. 129.
2) Mandel, J. "A Method for Fitting Empirical Surfaces to Physical and Chemical Data," Technometrics, 1969, V. 11, pp. 411-429.
3) Mandel, J. "The Partitioning of Interaction in Analysis of Variance," Journal of Research of the National Bureau of Standards - B Mathematical Sciences, V. 73B, 1969, pp. 309-328.
4) Draper, N. and Smith, H. Applied Regression Analysis, 2nd Ed., Wiley, 1981, p. 122.
5) Rappaport, S.M. "Assessment of Long-Term Exposures to Toxic Substances in Air," Annals of Occupational Hygiene, Vol 35, 1991, pp. 61-121.
6) Anderson, T.W. An Introduction to Multivariate Statistical Analysis, Wiley, 1958, p. 30.
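As a numerical illustration of the slope expressions in Section 7, the sketch below evaluates the two theoretical slopes for the Ex. 1 settings of Table 1 (log-scale standard deviations of about 0.36 and 0.63, correlation of about 0.85) and compares them with least-squares slopes fitted to simulated bivariate normal log concentrations. The log-scale means, the sample size, the seed, and all function names are assumptions made for illustration; this is not the authors' simulation code, which used 160 samples of size 25.

```python
import math
import random


def theoretical_slopes(sd_on, sd_off, rho):
    """Slopes implied by bivariate lognormality (log-scale parameters)."""
    var_t = sd_on ** 2 + sd_off ** 2 + 2.0 * rho * sd_on * sd_off
    slope_pair_means = 2.0 * (sd_on ** 2 - sd_off ** 2) / var_t  # pair means model
    slope_control_off = rho * sd_on / sd_off - 1.0               # control-off model
    return slope_pair_means, slope_control_off


def ols_slope(x, y):
    """Least-squares slope of y regressed on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def simulate_log_pairs(n, mu_on, mu_off, sd_on, sd_off, rho, seed=0):
    """Draw n pairs (Ln(on), Ln(off)) from a bivariate normal distribution."""
    rng = random.Random(seed)
    log_on, log_off = [], []
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rng.gauss(0.0, 1.0)
        log_off.append(mu_off + sd_off * z1)
        # Mixing z1 and z2 gives correlation rho between the two log values.
        log_on.append(mu_on + sd_on * (rho * z1 + math.sqrt(1.0 - rho ** 2) * z2))
    return log_on, log_off


# Ex. 1 settings from Table 1; the means 2.6 and 3.3 are hypothetical.
xi, nu = theoretical_slopes(sd_on=0.36, sd_off=0.63, rho=0.85)
log_on, log_off = simulate_log_pairs(20000, 2.6, 3.3, 0.36, 0.63, 0.85)

diff = [a - b for a, b in zip(log_on, log_off)]
pair_mean = [0.5 * (a + b) for a, b in zip(log_on, log_off)]

fit_pair_means = ols_slope(pair_mean, diff)
fit_control_off = ols_slope(log_off, diff)

print(f"pair means slope:  theory {xi:.3f}, fitted {fit_pair_means:.3f}")
print(f"control-off slope: theory {nu:.3f}, fitted {fit_control_off:.3f}")
```

With these inputs the theoretical slopes are roughly -0.59 for the pair means model and -0.51 for the control-off model, in line with the estimates of about -0.6 and -0.5 reported for the example data.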
