Measurement Techniques, Vol. 39, No. 6, 1996
APPROXIMATE ESTIMATION OF THE CORRELATION COEFFICIENT
N. N. Radaev
UDC 519.21
A methodology is proposed for the calculational-experimental estimation of the correlation coefficient between random variables, based on information about the contribution of various groups of random factors to their variance. An analytic expression for an approximate estimate of the correlation coefficient is obtained.
A complex engineering system consists of a large number of interconnected elements; thus, its reliability cannot be reduced to the reliability of its individual elements. In the general case, the random lifetimes until failure of the elements $T_i$ ($i = 1, \ldots, m$), and hence the events of their failure during the operating time $t_0$, given by the condition $T_i < t_0$, are interdependent. The basic types of dependence between the failures of elements are the functional (deterministic) dependence between the states of the elements and the stochastic dependence [1].

The dependence between the states consists in the dependence of the reliability of one element on the state (for example, the electric or thermal mode of operation) of another element, a state that itself changes in the course of operation. This dependence is due to the presence of electric and other connections between the elements [2]. To take it into account when carrying out non-interacting trials of an element, one should simulate the variation, during operation, of the states of the elements associated with the given element. The problem of separating elements that are independent with respect to their states is solved in the course of the decomposition of the system. Below, it is assumed that this problem has been solved; stochastic dependencies, however, cannot be eliminated by means of decomposition.

Let the reliability of a system be described by a random vector of lifetimes until failure of the elements $T = (T_1, \ldots, T_m)$ with the joint probability density $f(t_1, \ldots, t_m)$. We shall assume that the elements of the system are connected in series in the reliability sense. If the components of the vector $T$ are mutually independent, i.e., $f(t_1, \ldots, t_m) = \prod_{i=1}^{m} f_i(t_i)$, then the independence of the random variables implies their uncorrelatedness. Thus, in accordance with the Chebyshev inequality,

$$R(t_0) = \int_{t_i \ge t_0} f(t_1, \ldots, t_m)\,dt \;\ge\; \prod_{i=1}^{m} \int_{t_i \ge t_0} f_i(t_i)\,dt_i = \prod_{i=1}^{m} R_i(t_0),$$
where $R(t_0)$ and $R_i(t_0)$ are the probabilities of reliable operation of the system and of the $i$-th element, respectively. The assumption that the failures of the elements are independent is often made when estimating and controlling the reliability of a system. In particular, it is made when designing experiments to verify the reliability of a system on the basis of non-interacting trials of its elements [3]. The above inequality shows that an estimate of the reliability of a system obtained under the assumption that the elements fail independently provides a lower bound on the probability of reliable operation (PRO) of the system; i.e., it introduces a systematic error into the PRO estimate and, correspondingly, leads to an underestimation of the certainty of the decision taken concerning the reliability of the system on the basis of the data of non-interacting trials. When the random variables are normally distributed, the stochastic dependence is characterized by the correlation coefficient, which can be estimated from sample data [4]. However, at the stage of developing an experimental prototype, no such data are available.
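The inequality above can be illustrated numerically. The following is a minimal sketch and not part of the original paper: it assumes two elements in series with normally distributed lifetimes that share a single common factor, and the function name, mean, standard deviation, and sample size are hypothetical choices made only for illustration.

```python
import math
import random


def simulate_series_reliability(rho, t0, mean=100.0, sigma=10.0, n=20000, seed=1):
    """Monte Carlo estimate of R(t0) and of the product R1(t0) * R2(t0).

    Assumption: both lifetimes are normal with the same mean and sigma,
    and their correlation coefficient rho (0 <= rho <= 1) comes from one
    shared standard normal factor.
    """
    rng = random.Random(seed)
    a = math.sqrt(rho)        # weight of the common factor
    b = math.sqrt(1.0 - rho)  # weight of the individual factor
    both = first = second = 0
    for _ in range(n):
        z0 = rng.gauss(0.0, 1.0)  # common factor shared by both elements
        t1 = mean + sigma * (a * z0 + b * rng.gauss(0.0, 1.0))
        t2 = mean + sigma * (a * z0 + b * rng.gauss(0.0, 1.0))
        ok1 = t1 >= t0
        ok2 = t2 >= t0
        first += ok1
        second += ok2
        both += ok1 and ok2   # series system survives only if both do
    joint = both / n                       # estimate of R(t0)
    product = (first / n) * (second / n)   # estimate under independence
    return joint, product


if __name__ == "__main__":
    for rho in (0.0, 0.4, 0.8):
        joint, product = simulate_series_reliability(rho, t0=95.0)
        print(f"rho={rho:.1f}  R(t0)={joint:.3f}  product of R_i(t0)={product:.3f}")
```

Under these assumptions the joint estimate is expected to exceed the product for positive `rho` and to agree with it, up to sampling noise, for `rho = 0`.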
Translated from Izmeritel'naya Tekhnika, No. 6, pp. 15-17, June, 1996.
0543-1972/96/3906-0594$15.00
© 1996 Plenum Publishing Corporation
We now pose the problem of obtaining an approximate relationship for estimating the correlation coefficient by a computational method based on theoretical investigations and experimental data concerning the effect of various factors on the spread of the lifetimes until failure of the elements.

Solution of the Problem. The special features of the parameter deterioration process that produces the spread $\Delta = T - M[T]$ of the lifetimes until failure depend on the source of the fluctuations. In accordance with [5], these sources can be subdivided into two groups:
- internal sources, which belong to the elements of the system (the spread of the design and technological parameters that determine their initial quality);
- external effects, which influence the rate of deterioration of the parameters of the elements of the system in the course of their operation.

Correspondingly, the spread of the times until failure can be represented as the sum of centered random variables $\Delta_1$ and $\Delta_2$ representing the spread of lifetimes due to the spread of the design and technological parameters and due to the operational conditions (the corresponding external effects), respectively. We note that the components of the spread can be subdivided further in order to extract the components due to the action of specific random factors.

Evidently, the stochastic dependence of the lifetimes until failure of the elements is a manifestation of the action of common random factors on these elements. Information about their contribution to the overall spread can be used to estimate the correlation coefficient between the random variables.

We shall consider two arbitrary elements of a system and determine the coefficient of mutual correlation of their lifetimes until failure. Let

$$\Delta_1 = \Delta_{11} + \Delta_{12} \quad \text{and} \quad \Delta_2 = \Delta_{21} + \Delta_{22} \tag{1}$$

be the spreads of the lifetimes until failure of the first and the second elements, respectively. Here $\Delta_{11}$ and $\Delta_{21}$ are independent components of the spreads, and

$$\Delta_{12} = c_1 \Delta_0, \quad \Delta_{22} = c_2 \Delta_0, \tag{2}$$

where $c_1$ and $c_2$ are constants and $\Delta_0$ is a random variable with unit variance ($D[\Delta_0] = 1$), are the dependent components of the spreads of lifetimes due to the action of the same (common) factors on the elements. Their variances are calculated by the formulas

$$D[\Delta_{12}] = \sigma_{12}^2 = c_1^2, \quad D[\Delta_{22}] = \sigma_{22}^2 = c_2^2. \tag{3}$$

Let, moreover,

$$\Delta_{12} = \varepsilon_1 \Delta_1, \quad \Delta_{22} = \varepsilon_2 \Delta_2. \tag{4}$$

Hence

$$D[\Delta_{12}] = \varepsilon_1^2 D[\Delta_1], \quad D[\Delta_{22}] = \varepsilon_2^2 D[\Delta_2], \tag{5}$$

where $\varepsilon_1 = \sigma_{12}/\sigma_1$ and $\varepsilon_2 = \sigma_{22}/\sigma_2$ are the fractions of the spread of the lifetimes until failure of the first and the second elements due to the common factors, and $\sigma_1 = D[\Delta_1]^{1/2}$, $\sigma_2 = D[\Delta_2]^{1/2}$ are the mean square deviations of the lifetimes of the first and the second elements. Then, taking (3) and (5) into account, we have

$$c_1 = \varepsilon_1 \sigma_1, \quad c_2 = \varepsilon_2 \sigma_2. \tag{6}$$
By definition, the correlation coefficient of the lifetimes is

$$\rho_{12} = \frac{\operatorname{cov}(\Delta_1 \Delta_2)}{\sigma_1 \sigma_2}, \tag{7}$$
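where $\operatorname{cov}(\Delta_1 \Delta_2) = M[\Delta_1 \Delta_2]$ is the covariance of the lifetimes until failure of the first and the second elements (of the random variables $\Delta_1$ and $\Delta_2$). Taking (1) into account, we have

$$M[\Delta_1 \Delta_2] = M[(\Delta_{11} + \Delta_{12})(\Delta_{21} + \Delta_{22})] = M[\Delta_{11}\Delta_{21}] + M[\Delta_{11}\Delta_{22}] + M[\Delta_{12}\Delta_{21}] + M[\Delta_{12}\Delta_{22}]$$
$$= \operatorname{cov}(\Delta_{11}\Delta_{21}) + \operatorname{cov}(\Delta_{11}\Delta_{22}) + \operatorname{cov}(\Delta_{12}\Delta_{21}) + \operatorname{cov}(\Delta_{12}\Delta_{22}).$$

Taking (2) into account, we obtain

$$M[\Delta_1 \Delta_2] = \operatorname{cov}(\Delta_{11}\Delta_{21}) + c_1 \operatorname{cov}(\Delta_0 \Delta_{21}) + c_2 \operatorname{cov}(\Delta_{11}\Delta_0) + c_1 c_2 \operatorname{cov}(\Delta_0 \Delta_0).$$

Since the random variables $\Delta_{11}$, $\Delta_{21}$, and $\Delta_0$ are pairwise independent, the first three summands vanish, and $\operatorname{cov}(\Delta_0 \Delta_0) = D[\Delta_0] = 1$. Hence (7) reduces to

$$\rho_{12} = c_1 c_2 \left[ (\sigma_{11}^2 + c_1^2)(\sigma_{21}^2 + c_2^2) \right]^{-1/2},$$

where $\sigma_{11}^2 = D[\Delta_{11}]$ and $\sigma_{21}^2 = D[\Delta_{21}]$. We replace the constraint (4) by an equivalent one:

$$\Delta_{11} = (1 - \varepsilon_1)\Delta_1, \quad \Delta_{21} = (1 - \varepsilon_2)\Delta_2,$$

or

$$\sigma_{11} = (1 - \varepsilon_1)\sigma_1, \quad \sigma_{21} = (1 - \varepsilon_2)\sigma_2.$$

Whence, taking (6) into account, we obtain

$$\rho_{12} = \varepsilon_1 \varepsilon_2 \sigma_1 \sigma_2 \left\{ \left[ (1 - \varepsilon_1)^2 \sigma_1^2 + \varepsilon_1^2 \sigma_1^2 \right] \left[ (1 - \varepsilon_2)^2 \sigma_2^2 + \varepsilon_2^2 \sigma_2^2 \right] \right\}^{-1/2},$$

or

$$\rho_{12} = \varepsilon_1 \varepsilon_2 \left\{ \left[ 1 - 2\varepsilon_1 (1 - \varepsilon_1) \right] \left[ 1 - 2\varepsilon_2 (1 - \varepsilon_2) \right] \right\}^{-1/2}. \tag{8}$$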
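Formula (8) is straightforward to evaluate. The sketch below is an illustration rather than code from the paper; the function name `correlation_from_fractions` and the range check on the fractions are assumptions introduced here.

```python
import math


def correlation_from_fractions(eps1, eps2):
    """Approximate correlation coefficient by formula (8).

    eps1, eps2: fractions of the spread of each element's lifetime that
    are due to the common factors; assumed here to lie in [0, 1].
    """
    for eps in (eps1, eps2):
        if not 0.0 <= eps <= 1.0:
            raise ValueError("fractions are assumed to lie in [0, 1]")
    # 1 - 2*eps*(1 - eps) equals (1 - eps)**2 + eps**2, so it never drops below 0.5
    denom = math.sqrt((1.0 - 2.0 * eps1 * (1.0 - eps1))
                      * (1.0 - 2.0 * eps2 * (1.0 - eps2)))
    return eps1 * eps2 / denom


if __name__ == "__main__":
    for eps in (0.0, 0.25, 0.5, 0.75, 1.0):
        print(eps, correlation_from_fractions(eps, eps))
```

For example, with $\varepsilon_1 = \varepsilon_2 = 0.5$ each bracket equals 0.5, so $\rho_{12} = 0.25 / 0.5 = 0.5$; the limits $\varepsilon_1 = \varepsilon_2 = 0$ and $\varepsilon_1 = \varepsilon_2 = 1$ give $\rho_{12} = 0$ and $\rho_{12} = 1$, respectively.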
The quantities $\varepsilon_1$ and $\varepsilon_2$ are ratios, and their precision is higher than that of the quantities from which they are formed. Thus $\varepsilon_1$ and $\varepsilon_2$ can be estimated either computationally or from experimental data. For estimating the individual components of the spreads, the results presented in [6] can be used.

Stochastic dependencies usually involve dependencies related to the mode of operation, the initial effects, and the parameters [1]. Among the lifetimes until failure of the elements operating in the system, there exists a dependence through the initial effects, due to the action of external factors on two elements from a single source. The external actions on the elements of the system under real-world operational conditions are described by the vector $Q$ with joint probability density $f(q)$. The components of the vector $Q$ are stochastically dependent. In non-interacting trials, the stochastic dependencies are not reproduced, i.e., $f(q) = \prod_i f_i(q_i)$. Therefore, the variations of the lifetimes until failure due to the spread of the external actions are stochastically dependent under real-world operational conditions and in complex trials, while under non-interacting trials they turn out to be independent. This leads to differences between the trial conditions and the real operational conditions, and the measure of these differences is the correlation coefficient due to the dependence between the initial effects. It can be estimated by means of (8), taking $\varepsilon_1$ and $\varepsilon_2$ to be the fractions of the spreads of the lifetimes until failure of the elements due to the spread of the initial effects. When planning non-interacting trials and processing their data, this dependence ought to be taken into account in order to eliminate the systematic component of the estimation error. There exists a stochastic dependence, due to the common mode of operation, between the lifetimes until failure of the elements of a system.
Such a dependence exists for elements supplied from a single power source (in the electric mode of operation), located in the same casing (in the thermal mode), and so on, whose specific fluctuations (modes) are not reproduced in the course of non-interacting trials. Let $\varepsilon_1$ and $\varepsilon_2$ be the fractions of the spreads of the lifetimes until failure due to the instability of the modes of operation (trials). Then the quantity $\rho_{12}$ can be estimated by formula (8) and used when recalculating the data of non-interacting trials for the real-world operational conditions of the whole system.

The dependence between the lifetimes until failure of elements through their parameters exists, in particular, for elements consisting of integrated circuits manufactured using the same technology. In a stationary technological process, a strong correlation between the electrophysical parameters of the integrated circuits is observed (the correlation coefficient reaches 0.8-0.9 [2, 7]). In non-interacting trials the dependence through the parameters does manifest itself; however, it is not taken into account when designing the trials and processing the data obtained in them (failures of the elements are assumed to be independent). At the same time, taking this dependence into account when the corresponding prior information is available provides additional information for decision making.

Statistical Simulation. The obtained relationship can be checked by the method of statistical simulation. Simulating on a computer, one can obtain $N$ pairs of realizations of the failure-free operating time, which can be used to determine the sample correlation coefficient

$$\hat{\rho}_{12} = \frac{S_{12}}{S_1 S_2}, \tag{9}$$

where

$$S_{12} = \frac{1}{N-1}\left( \sum_{i=1}^{N} y_{1i} y_{2i} - N \bar{y}_1 \bar{y}_2 \right), \quad S_k^2 = \frac{1}{N-1} \sum_{i=1}^{N} (y_{ki} - \bar{y}_k)^2, \quad \bar{y}_k = \frac{1}{N} \sum_{i=1}^{N} y_{ki} \quad (k = 1, 2),$$

and $y_{1i}$ and $y_{2i}$ ($i = 1, \ldots, N$) are the realizations of the sums of the random variables $\Delta_{11} + \Delta_{12}$ and $\Delta_{21} + \Delta_{22}$, respectively. When determining the coefficient of mutual correlation $\rho_{12}$ by means of (9), the quantities $y_{1i}$ and $y_{2i}$ represent realizations of the random variables $\Delta_1$ and $\Delta_2$, whose components $\Delta_{12}$ and $\Delta_{22}$, in accordance with (2), ought to be simulated using the same random numbers. Here the constants $c_1$ and $c_2$ are obtained in accordance with (6) from the given values of $\sigma_1$, $\sigma_2$, $\varepsilon_1$, $\varepsilon_2$. Simulation by the method of statistical trials and calculations by means of (8) for the case $\varepsilon_1 = \varepsilon_2 = \varepsilon$ have shown that the relationships $\rho_{12}(\varepsilon)$ obtained by the two methods coincide.

Thus, the methodology of approximate estimation of the correlation coefficient is as follows:
- a theoretical or experimental estimation of the spreads of the lifetimes until failure of the elements of a system;
- a theoretical or experimental estimation of the components of the spreads that depend on the action of the common factors on the elements;
- calculation, for each element, of the fraction of the spread of lifetimes due to the action of the common factors;
- calculation by means of (8) of the correlation coefficient of the lifetimes until failure.

The obtained relationship can be used in engineering calculations for an approximate estimation of a positive correlation between random variables of an arbitrary physical nature.
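The statistical check described in this section can be sketched as follows. This is an illustrative reconstruction, not the author's program: it assumes normally distributed components, takes the independent components with standard deviations $(1 - \varepsilon_k)\sigma_k$ as in the derivation of (8), and uses hypothetical function names, sample size, and seed.

```python
import math
import random


def formula_8(eps1, eps2):
    """Approximate correlation coefficient by formula (8)."""
    return eps1 * eps2 / math.sqrt(
        (1.0 - 2.0 * eps1 * (1.0 - eps1)) * (1.0 - 2.0 * eps2 * (1.0 - eps2)))


def sample_correlation(y1, y2):
    """Sample correlation coefficient in the sense of (9).

    The normalizing factor of the sample moments cancels in the ratio,
    so plain centered sums are used.
    """
    n = len(y1)
    m1 = sum(y1) / n
    m2 = sum(y2) / n
    s12 = sum(a * b for a, b in zip(y1, y2)) - n * m1 * m2
    s11 = sum(a * a for a in y1) - n * m1 * m1
    s22 = sum(b * b for b in y2) - n * m2 * m2
    return s12 / math.sqrt(s11 * s22)


def simulate_correlation(eps1, eps2, sigma1=1.0, sigma2=1.0, n=50000, seed=1):
    """Simulate N pairs of spreads and return their sample correlation."""
    rng = random.Random(seed)
    c1, c2 = eps1 * sigma1, eps2 * sigma2  # constants from relation (6)
    y1, y2 = [], []
    for _ in range(n):
        d0 = rng.gauss(0.0, 1.0)  # common factor: the same number for both elements
        d11 = rng.gauss(0.0, (1.0 - eps1) * sigma1)  # independent part, element 1
        d21 = rng.gauss(0.0, (1.0 - eps2) * sigma2)  # independent part, element 2
        y1.append(d11 + c1 * d0)
        y2.append(d21 + c2 * d0)
    return sample_correlation(y1, y2)


if __name__ == "__main__":
    for eps in (0.2, 0.4, 0.6, 0.8):
        print(eps, round(simulate_correlation(eps, eps), 3), round(formula_8(eps, eps), 3))
```

Under these assumptions the simulated and calculated values should agree to within sampling error, which is the kind of agreement the text reports for $\varepsilon_1 = \varepsilon_2 = \varepsilon$.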
REFERENCES
1. B. V. Gnedenko (ed.), Problems of the Mathematical Reliability Theory [in Russian], Radio i svyaz', Moscow (1983).
2. Yu. N. Belyakov, F. A. Kurmaev, and B. V. Batalov, Methods of Statistical Calculation of Microsystems on a Computer [in Russian], Radio i svyaz', Moscow (1985).
3. I. V. Pavlov, Statistical Methods of Estimating Reliability of Complex Systems Based on Trial Results [in Russian], Radio i svyaz', Moscow (1982).
4. L. S. Zazhigaev, A. A. Kish'yan, and Yu. I. Romanikov, Methods of Design and Processing of Results of Physical Experiments [in Russian], Atomizdat, Moscow (1978).
5. N. G. van Kampen, Stochastic Processes in Physics and Chemistry, Elsevier North-Holland, Amsterdam-New York (1981); second edn. (1992).
6. N. N. Radaev, Izmer. Tekh., No. 11, 38 (1995).
7. G. A. Keidzhan, Foundations of Securing of the Quality of Microelectronic Equipment [in Russian], Radio i svyaz', Moscow (1991).