NOVEL APPROACH
IN R I C H D A T A H A N D L I N G
G. A. OSOSKOV" Laboratory of Computing Techniques and Automation, Joint Institute for Nuclear Research Dubna, 141980 Russia email: ososkov0cv, j inr. ru
On the basis of the mathematical model of the RICH data the main stages of data processing are considered. On the first stage of measurement clustering in order to obtain photon hits two new methods of hit accuracy improving are described: numerical approximation method and the wavelet analysis method. "['he central part of the paper is focused on the approach of direct processing of raw RICH data, which is especially actual for the high granularity RICH detectors like CERES and COMPASS. This raw data approach is based on applying of the robust methods with special weight functions on both steps of the further data analysis: for Cherenkov ring recognition and for particle identification. Testing results on simulated data are given.
1.
Introduction
The Ring Imaging CHerenkov (RICH) detectors are widely used in experimental high energy physics as a part of particle identifying (PID) systems of many experiments (see, for example, [1-5] and the survey [6]. Despite of difference between RICtI designs most of them consist of three main parts: radiator, detector and readout part. The Cherenkov photons from the RICH radiator are focused onto a detector. The information of the detector is read out via two-dimensional arrays of many thousand photosensitive sells (pads) each. During an event a number of Cherenkov photons produced by a detected particle fall on the pad plane and form a ring (or ellipse). The radius of this ring is determined by the particle type (mass) and its momentum. Therefore, as soon as particle momentum is evaluated from non-RICH trajectory measurements, an estimation of its radius can be used for PID. However, such an estimation could not be calculated directly by fitting a circle to a set of photon hits due to many experimental factors. The first is that photon hits are not observed directly. In fact, each photon, while registered by the RICH, produces signals in a cluster of adjacent pads near the pad that it hits, see Fig. 1 and 2. The energy distribution between these pads is symmetrical, bell-shaped, in particular, a gaussian is the most often used approximant due to many physical reasons. Moreover signals measured at each pad (amplitudes) are distorted by cutting off both: too weak and too strong signals. Besides an additive noise is added to each amplitude in the process of its measurement. The second ") Supported by RFBR. grant N 97-01-01027, Russia. Czechoslovak Journal of Physics, Vol. 49 (1999), Suppl. $2
145
G. A. Ososkov
thctor arises due to the multiplicity of events, winch leads to ring overlapping, and data contamination by background photons and electronic noise (see an example of the RICH1 CERES event in Fig. 1). The statistical character of all these processes must be also taken in account.
Fig. 1. CERES RICttl image of Au-Pb event
Fig. 2. 3D image of simulated discretized signals of one Cherenkov ring
Statistical dependencies of measured values of amplitudes on Cherenkov ring parameters in question have extremely sophisticated, indirect character (it can be seen, ill particular, from the detailed probabilistic model of RICH data given below). Therefore the traditional way of RICH data handling is inevitably oriented on such step-by-step data processing, which should on each step simplify data and their dependence on wanted parameters. On the first pads-to-hits step centers of photon hits are estimated by clustering of adjacent pads with non-zero amplitudes, following next by the center of gravity (COG) procedure. In the often cases of two overlapping clusters easy chopping-int w o is made along the watershed line between two peaks when they are pronounced enough (if two superposed peaks produced an unimodal surface, it gives the only one. although shifted, COG center). The next step of ring recognition is. ire fact. needed when the off-line search of ring-candidates has to be performed without prior knowledge of the Cherenkov ring centers, as for the CERES'94-96 experiments. Quite time-consuming technique is used. It is based on the ttough-transform of hits to the parameter space (see, for example [7]). ttowever for the majority of experimental setups, containing RICII detectors, ring centers and particle momenta are known with some degree of accuracy from non-RICH trajectory measurements. One more step is needed to fit a circle to each ring-candidate and to clean data from background hits as well as from hits of close or overlapped rings. It is necessary to obtain the most accurate estimations of the ring center and radius Czech. 3. Phys. 49/$2 (1999)
146
Novel Approach in RICH Data tlandtit2g
required for the following particle identification procedure. Quite an advanced technique was elaborated for ring fitting effective in conditions of heavy background and presence of close overlapped rings [8]. It. is based on the robust, methods of iterational weighted least square fit with optimal weight functions. This technique has proven its effectiveness for CERES data, but it is also useful for experiments provided by track detectors, since it allows to improve the accuracy of Cherenkov ring parameters. The final step of PID procedure is presented by several methods (see, e.g., refs. [3-5]). Some of these methods are based on the maximum likelihood approach [3], or on comparing probability distributions [4] of Cherenkov photons for different hypotheses. Other methods, like ref. [5], use the pad information, but. only to count the number of pads in fiducial areas calculated for alternative particles. All PID methods demand the knowledge of Cherenkov ring centers and corresponding particle momenta with a high level of accuracy and, therefore, assume an extended preliminary preprocessing of the bulky array of many thousand pads from which the raw RICH data are consisting. This paper is motivated by the following considerations: (i) There is a need for a substantial speeding up of processing of RICH measurements and increasing their accuracy. (ii) This can be achieved, in particular, by improving the quality of such an inaccurate and time-consuming step as locating photon hits in a sea of pads or by eliminating this step at all in order to handle pad information on the next steps. (iii) Successful application of robust statistical methods in our previous work [8] inspired us to apply those methods to the raw RICH data. It allows to compute Cherenkov ring parameters with a better accuracy and improve the particle identification (PID) procedure by choosing the most likely radius corresponding to a pad sample surrounding a Cherenkov ring center. Therefore after presenting the mathematical model of the RICH detector data in sect.2 two groups of novell methods are expounded further. In sect.3 numerical methods intended to improve the accuracy and facilitate pad-to-hit procedures are explained. Then in sect.4 robust methods are described for both Cherenkov ring parameter and PID evaluation from raw RICH data. Numerical tests show considerable improvement in the accuracy over traditional algorithms. For the sake of brevity we do not discuss here other interesting but more time consuming global methods of RICH raw data processing such as the MCMC (Markov chain Monte-Carlo) approach [14] or elasic neural networks [7]. 2.
M a t h m o d e l o f the R I C H d a t a
Simplih, ing we can give the following probabilistic model of RICH data. Let us now explore the case of the data created by M Ch-erenkov photon rings together Czech. J. Phys. 49/$2 (1999)
147
G . A . Ososkov
with a set of background photons. We assume that the rings can have different centers and radii cj, R~, j = 1. . . . . 3I. Each center c = (c~, cy) have two-dimensional uniform density discretized to the readout grid. The distribution f ( R ) of radii is a mixture of normal distributions, each concentrated around one of a few expected radii values Rt with corresponding covariances c~N[. Naturally. M is not known to an analyst and is also a subject of estimation. Number K of Cherenkov photons emitted by one particle collision is distributed according to the Poisson law P;,(K) with the mean ,k. Denoting by (z, Y) a point (pad) hit by a photon (either a Cherenkov or a noisy one), we assume that Cherenkov photons of the same ring are mutually independent (given K, c, R), and that each hit position is given (in radial coordinates around center e) by an angle ~: and radius r. Angle ~p is distributed uniformly in (0, 2rr), r ,-, A/'(R, gn). "~ Then, for (qo.r) = {(~k, rk), k = 1 . . . . . Ix'},
f(~,rlK, R)= yi
f(rklR).~-~
,
f(rIR)=
2a h
y~-~-----~'exp-
/
(1)
k----1
and corresponding (z, Y) co,ordinates are xk = c, + rk cos ~ak,
(2)
Yk = cy + rk sin c2k.
D
We denote by E the mean energy of one photon, by A ~ an energy observed at cell (u, v), so that A = {A~,} are the amplitude registered by the detector. It is normally distributed A"(A~, ~r~). Its mean A~v depends on position o f t , on R, K, M. on locations of hit points (x, y). The standard deviation CrA is common for amplitudes in the given cluster. Its value in tile field of measurements is supposed to be known, fA (e) is the distribution of photon's energy, which can be different for different detectors. In particular, it can be either exponential fA (e) = e z p ( - e / E ) / E or Landau. The latter one can be approximated as e z p ( - 1 / 2 ( z + e z p ( - x ) ) ) , where x = (e - E)/k, k is the scale parameter. Simpler approximation looks as "asymmetric" gaussian (two halves of gaussians with the same m a x i m u m but, different cr for the left. and right halves). An expected contribution of a photon hitting the cell (z, y) to the energy observed at cell (u, v) is given by E . p[(u, v). (x, y)], where
v),
y)] =
1 exp(
]
(3)
Therefore, tile total expected energy at (u, v) is the sum of expected contributions from Cherenkov photons and from background photons, namely m --
--
/~
/"
E
.4,,,. = t ! . A . J ~ J y p [ ( u , v ) . (x,y)] . f[(x, glc, R ) d x d y + ~
Czech. J. Phys. 49/$2 (1999)
.p,
(4) 148
Novel Approach in RICtt Data Handling
where fi" is derived from distributions of v and io, i.e. from (1) and (2), p is the mean value of background photons, which distribution is supposed t.o be uniform in the detector area. Finally, as it was mentioned above the distribution of actually observed energy Au~. at cell (u, r) is Af(Am.,~r~). These distributions (at different pads) are conditionally mutually independent, given the values A ~ . i.e. the dependence of energies at neighboring cells is described through the dependence of expected energies A~, in (4). Thus, for each given e and R we are able to compute the probability p(A:e, R) of observed data A = {Auv, (u,v) E ,~}. This probability distribution depends naturally on a set of parameters. We assume that, these parameters, namely E, A, p, ~rR, tra, tTa are known, from physical background of the experiment.
Remark: All normal distributions should be (more realistically) taken as trimmed normal, either symmetrically (e.g. two-dimensional density p) or with a nonsymmetrical threshold (e.g. energy A is observed only between some Amin _> 0 and Amaz given by the detector limitations). The developed model is then used for the inference of data processing algorithms and Monte Carlo simulations.
3.
Hit accuracy improving
The problem of the most exact determination of the observed signal position from noisy experimental data distorted by a measuring device is typical for data processing of not only RICH detectors, but in almost any field of experimental physics. As pointed above, 2D gaussian (3) is the most often used approximant for symmetric, bell-shaped signals. The problem is to evaluate parameters of such a discrete signal, i.e. its position, amplitude and also its half.width in presence of noise, detector uncertainties and influence of other close signals. Due to the factorized view of (3) the problem usually is reduced to handle several iD-gaussians:
g(z;A, zo) = Aexp
2a.~
] .
(5)
In such 1D presentation (5) a doublet of two overlapping signals can be approximated as
G(z;A, xl;B,~) = Aexp (
(x-zt)2~--~. ,]+Bexp(
2, 2
] .
(6)
In the process of registration a signal is to be discret!zed as a histogram {ak) on the interval (Zbeg,re,d), Czech. J. Phys. 49/S2 (1999)
149
G. A. Ososkov
~k
G(x: A, Xl: B, x2) dx,
ak = -
7"
(7)
where xk = xbeg + kr, r - is the bin width. Besides an electronic noise gives a contribution to each histogram bin. Noise values are. in principle, correlated, but according to the earlier study [9] we treat them as independent normal variables with rms equal up to 10% of the mean am plit ude. We apply two methods for solving our problem: numerical one based on fitting a gaussian to the discretized and distorted signal and the spectral method based on wavelet analysis of this signal. 3.1.
Numerical approximations
Let us have the histogram {ai}, i = "1, n with the unit binsize, We have to fit (5) to this histogram. The iterative solution of the corresponding least square functional
[.(A, xo) = E
ai i
Ix'+1g(a; ,t, xo)dr 12
(8)
.Jx~
is elaborated in [9] oil the basis of approximating (8) as a function of two parameters (A,x0) by an elliptic paraboloid. As initial values A ~ x0~ of unknown parameters for this procedure in a single signal case we use A (~ = maxe, ai and its position or the histogram center of gravity. Further we refer to this method as to Paraboloidal Approximation Method (PAM). In the case of two superposing peaks tile expression (6) depends on four parameters. To find them we have to minimize a fimctional generalizing tim view of (8) s
ai -
= i
G(z;AI,Z
_
. )dx
,
(9)
JXl
where the element of integration is taken from (6). It was done in [9] by direct generalization of the PAM procedure on four-parameter functional (9). We use the above solutions to obtain k estimations (Aj, xo,j) for tile each padrow Y.i, J = 1, k. The wanted estimation of the centroid, i.e. of the position of the signal (x0, Y0) can be calculated as the center of gravity of these k estimations.
xo = ( E A J x ~ J Czech. J. Phys. 4 9 / $ 2 (I999)
Aj); J
yo__-- ( E f l j y j ) / ( ~ ,4j) J J
(io) 150
Novel Approach in RICH Data ltandling
To obtain its amplitude A we minimize the functional = Z J where the function
p[(u, v),
2
(xo, Yo)] is taken from (3). That gives
(~ - t o p
(~,-yo) =
['
,(=e,'-=o)
= (i,,-=,o)2,'~
(11)
The efficiency of the doublet resolution, as it was obtained in [9], is on the level of 93cA,, if the distance between two signals is greater than 1.5a, while the error of this distance estimation is decreasing linearly with the signal half-width o" as 0.6333 - 0.0555a.
3.2. Wavelet analysis "Well-known restrictionsof Fourier analysismotivated our interestto such a modern signal analysis mean as wavelet-transfom [10]. According to the gaussianlike shape of our signals it is natural to choose, as a basic wavelets, the family of vanishing m o m e n t u m wavelets (VMW), since they are generated by gaussian distribution function:
#n(z)
(
=
_i~+I dn ^-==IS
,
dz----~
.
,
n >
0
(12)
Two first of VMW are most known [10]: g l ( X ) = --xe
g2(X) = (i - x:Z)e- ~ .
,
(the second one is also known as "the Mexican hat") Nevertheless, we found that the power of the wavelet-analysis can be really extended, if we would use the higher order V M W , in particular, ga(x) ---- (3x - xa)e -m~,
g4(x) ---- (6x 2 - x 4 - 3 ) e - ~ .
The normalizing coefficientsof these wavelets Cg, are 2~r(n-l)l. The most important V M W property consists in preserving their relativesquare, which we define as [11]
Is.(x)l dx w(=) =
~r ~
(13)
f0 The first four of VMWs are shown in Fig. 3.2. As we have checked, the relative VMW squares calculated for the first ten of VMWs are almost equal forming a specific narrow plait. One can see that clearly in Fig.4 for the first four VMWs. Czech. J. Phys. 49/S2 (1999)
151
r A. Ososkov
w.(x) 0,8
,' ',, /-4' ;\ ', ~'~ ~ l
.\
0,6 0,4 0,2 0o
1
2
3
Fig. 4. Their relative squares
Fig. 3. First four of VMWs
Tlle rule to determine the optimal dilation parameter a can be derived from tile expression (13) for the invariant relative square u' v/// -1 a= ~ 2In(l-w)'
t,=x2-xl,
O
(i4)
It is a remarkable fact that the wavelet transformation of a gaussian (5) looks as the corresponding wavelet. Therefore the general expression for the n-th wavelet coefficient has the following form:
1)!s *lgn where we denote s = y'a 2 + e ~. To present the wavelet coefficients of a doublet (6) of two gaussians we use the simplifying normalization:
Wn(a,b)g- Wg,(a,b)9 wg,
wg, =
Ao.an+l/2 x/(n - 1)!s-+l
Then we obtain
S i n g l e g a u s s i a n signal. For the single gaussian signal we can calculate wavelet transform in a few points and solve the system of corresponding equations. However, applying the ratio of different wavelets we call eliminate the exponent e \ 2(. . . . . )] and obtain the signal position explicitly. For instance, the ratio I.I~3(a, b)g/H~,(a, b)g gives x0=b+ Czech. J. Phys. 49/$2 (1999)
(a~+~r "~) 3 + v'2(a2+~r")Wg3(a'b)g" a2 ~I.~, (a, b)g
(16) 152
Novel Approach in RICH Data Handling
The true sign in (i6) is easy to choose when one would calculate the coefficients I$)~ (a. b)g and Wg~(a, b)g in a point, which .is far enough from the supposed signal position. The amplitude value can be evaluated via the value of the half-width of the signal er (if known) and the expression Wg~(a, a:0)g = Ao'aSl~/s a. But in the case when the value of o" is unknown it can be also evaluated using l&'ga (a, a:o)g/Wg, (a, x0)g:
cr2=-a 2 i+ ,~Wga(a'~:~ V 2Wg,(a, zo)g]'
(17)
D o u b l e t o f close g a u s s i a n s . For a doublet of two close signals we can use either 9 four first wavelets calculated in one point (method WTS - Wavelet Transform System) 9 or one of those wavelets (we choose g2) calculated in four different points (method g.,-WTS). The corresponding systems of equations are: for WTS :
F,~"-Wn(a,b)G-W,~(a,b)h-O,
forg2-WTS:F,~=W2(a, bn)G-W2(a,b,~)h=O,
n - 1,2,3,4; n = 1,2,3,4;
where n-th wavelet coefficient of a histogram h
W (a,b)h=
1
,hk
gn
d=
k=i
is calculated from the source histogram only one time for all iterations~ g~ in g2WTS method are taken in four followingpoints: bl = b-h~2, b2 = b+h/2, b3 = bi-h, b 4 = b 2 + h w i t h s p e c i a l l y c h o s e n h. Newton's method is applied to solve these non-linear systems D A X : F, X : (xl, x2, A, B) T, F = (Fz, Fa, F3, F4) T, where D is the matrix of partial derivatives of Fn with respect to components of X. The first approach X (~ is obtained by a rough estimation of the signal parameters from the source histogram h (see, for example, [9]). The next approach is obtained as X (1).-- X (~ + A X and so on iteratively, unless a wanted accuracy is reached. The amazing insensitivity of wavelets to various signal distortions is widely known (see, for example, D.Donoho's article in [10]). However it was necessary to test more in details the accuracy and efficiency of proposed methods and the dependencies of their results on such factors as (i) distance d = I x 2 - x l I/a between two components of the doublet signal (6), (ii) noise level and bin shedding, (iii) Czech. J. Phys. 49/$2 (1999)
153
G. ,4. Ososkm"
detector granularity degree. Results of Monte-Carlo simulations done in [13] to fulfil this study can be summarized as follows. RMS-error dependence of the signal position estimation on the distance between two signal components determined by WTS and g.~-WTS methods are shown in Fig. 5, where distances are given in the signal half-width units common for both components. Error values are given in bin-width. The first method has better accuracy when the distance between two peaks in a doublet is less than 2or, although when it, approaches lg-distance, the accuracy increases considerably for both methods. Relative RMS-error of the estimate 0"9 of the single gaussian half-width grows linearly with random nois e (in percents of ,-1) as 0"9 = 0.2o',~oise + 0.01. Dependence of ~g on detector granularity is presented in Fig. 6.
t0 1
0,04I"\
~.~_\~../ g2-WTS 7"\\
o,1 O,Of 0,001
WTS
x I
t,5
\. I 2
1
2,5 d, o-
Fig. 5. RMS-error dependencies of signal position estimates on the distance between two signal components determined by WTS and g2-WTS methods.
o,
"
010
20
30
40
N
Fig. 6. RMS-error dependencies of the single gaussian half-width estimate on detector granularity (bin number)
A comparison of acuracies of the wavelet analysis and the Fourier approach, which is more familiar for experimentalists, is fulfilled in [12] for the problem of resolving doublets of close gaussian discretized signals. As it was shown there in a simplified condition, when the position of one of signals is fixed, the rms of the wavelet estimation of the distance between both peaks is 20% better than of the Fourier method. 4.
Raw data processing
During the hit-extracting procedure, i.e. calculating of the hit centers of photons from the raw data, one loses the detailed information of individual pad amplitudes contained in the raw data. Besides. these procedures are, as usual, time-consuming and inaccurate in the cases of close, overlapping clusters. Therefore the idea of ring fitting directly from the raw data can potentially result in faster and more accurate algorithms. In addition, two important features of RICH detectors must be taken into account: Czech. J. Phys. 49/$2 (1999)
154
Novel Approach in R I C t t Data tlandling
i) the energy measured in the pads is discretized and signals both too small and too large are cut off: ii) the high occupancy of RICH detectors - the abundance of background noise and possible presence of two or more overlapping rings. These RICH data features considerably violate the crucial assumptions of normality and independence of errors on which the classical least square fit (LSF) is based. It is assmned that measured point deviations from the observed ring are normally distributed independent variables with the common probability density function (p.d.f.) p(d) = (2zc~r~')-l/2ezp(-d2/2). However, these violations often cause a complete breakdown of the LSF, since data are contaminated by background measurements with p.d.f, u(d) and, therefore p(d) = ( 1 - e)g(d)+ e , u(cl), where 9(d) = (2r~r2)-l/~'ezp(-d2/2) and e is the background rate. In such cases the robust weighted least square procedure ~ w(di)d ~ -.+ max works better. The optimal weight function w(d) for the uniform background distribution u(d) = const was derived in our earlier papers [8,15]
c u,(x) = ~ - 2
u,(x)) -1
(is)
1 + 1 - ~ ' a(x)
As it follows from circle fitting results of [8], this weight function gives better accuracy than the traditional least square methods and, therefore, is just suitable for processing data of standard detectors, with low granularity, where each photon is registered by one pad.
r
26
0.1
.........
" "-"
~ "Z.
". " "
"" . I
I
i
9 N
o~ 5
10
IK
20
26
30
~l~
Fig. 7. Simulated image of two overlapping rings with 100% of noise. Darker pads relate to higher amplitudes. Circles indicate fitting result.
Czech. J. Phys. 49/$2 (1999)
0
|
J
40
% Fig. 8. A bihorn 2D weight function piecewise linearly approximated
155
G. A. Ososkor 4.!.
C]erenkov riugreconstruction
Ttle main problem for lhe high granularity detectors considered in this paper is to take into account such an essential information a~s signal amplitudes measured in each pad (see a 3D-image of simulated ring in Fig.2). Corresponding formulae for the optimal weight functions have been derived in ref. [15] by the maxinmm likelihood approach. The formula was obtained for a more general track-finding problem, when a track passing through a detector hits one of many parallel coordinate planes (pad rows}. Each hit results in energy deposition m neighboring pads. The total amount of the released energy, E, is a random variable with some p.d.f, f ( E ) , 0 < E < oc. The spatial distribution of the deposited energy among pads is bell-shaped, say. a Gaussian, centered at the actual hit point, a with a constant variance o'2. The calculation gave
w(d; A) = g f(gA) + g2A f'(gA) g f(gA) -e u(A) '
(19)
where g = ~ . e~/2ao~, u(A) is p.d.f, of a background amplitude A measured in the given pad. If the distribution f ( E ) is specified as an exponential one with the mean E0, i.e. f ( E ) = Eo 1 9e -E/E~ E > 0, then for a wide choice of the background component u(d) (uniform or Gaussian with a wide ~r,~oi,e or even an exponential one with a large mean value in the cases of &electrons) we obtain quite an interesting 2Dweight "surface", see Fig. 8. For weak signals with a small amplitudeA the weight as a function of the distance d has a "hi-horn" shape with two pronounced peaks merging together when A is growing. In a track-finding problem, this bimodal weight function is quite natural, it indicates that weaker signals are less likely to appear in the closest vicinity of the track line. Stronger signals are most likely to appear right on the track line, hence the weight function has a narrow single peak. Another problem appears in some experiments when in a certain pad row all the signals are very weak. They will be all rejected by the bimodal weight function, even though within that row the signals may have a nice bell-shaped distribution with a local peak right on the track. In this case one can apply a prior normalization of signals in each pad row in a narrow corridor around the expected track line, see [16]. Lastly, in track-finding applications of bimodal weights to STAR T P C data, we approximated nonlinear optimal weights by simple piece-wise linear functions
[16] (see Fig. S). However, direct application of (19) to the circle fitting using the pad raw data runs into two obstacles: (i) it is very likely that some pads with a small amplitude appear right on the fitted circle (see Fig. 7. for example); (ii) the substantial variability of photon energies distributed according to f ( z ) . Czech. J. Phys. 49/S2 (1999)
156
Novel Approach in RICtl Data Handling
Both problems are solved in ref. [15] by integrating away hidden parameters. The optimal weights have bimodal shape, but their efficiency is lower due to disregarding some weak signals in areas between clusters. One more circle fitting problem relates to the frequent case of several crossing circles. If one would fit them one by one. an extra data contamination appears due to the influence of points of the "stranger" circle in the crossing area (see Fig.7). So it would be more reasonable to fit those circles simultaneously. For solving the problem of simultaneous fitting of two or more circles it is necessary to create a single equation for several circles.
[ P/
rmso,b rmsR
LSF
I 0.76
0.231
0.420
Huber
[ 0.17
0.252
0.321
'Ilikey(c=4) [0,04
0.232
0.144
Tukey(c=3) [ 0.08
0.240
0.157
0.219
0.130
method
bimodal
I 0.05
Table i Numerical characteristicsof algorithms for circlefittingto simulated data.
This is done in a way generalizing [17] by multiplying the corresponding circle equations. The LSF estimation of all parameters requires the search for the global minimum of the non-linear functional
L(a, b, c, d, Rt, R2) = E wiF2(xi' yi; a, b, c, d, R1, I~2), i
where
;~(~, y~; a,
b, c, d, a l , R 2 / =
[ ( ~ - a)~ + (u~ - b) ~ - R ~ ] [ ( ~
- c) ~ + (u~ - d) ~ - a g ]
for two circles. A linearization of L is done similarly to ref. [8]. Details can be found in ref. [18]. We tested the bimodal weight function in a numerical experiment and compared it against two unimodal weights: Huber's [19], Tukey's [20] and the constant one w - 1 for the classical LSF. Results for the single circle experiment are summarized in the table 1, where P! is the probability of a complete failure, when a and b are off more than one bin size. All rms values are given in bin-size, the half-width of the signal shape is also equal to bin-size (tr0 - 1). The noise distribution is uniform with signal/noise ratio -- 1. Only Tukey's unimodal biweights with c = 4 stands the competition with our bimodal one. The numerical results for two circles are shown in the table 2. Simulation details can be found in ref. [18]. Czech. J. Phys. 49/$2 (1999)
157
G . A . Ososkov
Besides to be as close to real raw data as possible, a G E A N T simulation was made of pad structure for Chcrenkov rings. We superimposed them into the real background of the CERES Pb-AU'95 RICH data. Results of the ring radius accuracy were comparable (not worse) than those presented in ref. [8].
I distance ] Pl
rmR~,b I ~s~,~ I
I
1
10.07
0.429]0.430
]
10
] 0.05
0.333 I 0.390 I
I
Table 2 Numerical characteristics for two circles fitting to simulated data using bimodal weights. Distances are given in bin-size
4.2.
Particle identification
Considering the RICH detector as a part of a bigger experimental setup, we come to the natural assumption that the centers of Cherenkov rings and particle m o m e n t a are approximately known in advance from prior trajectory measurements. Therefore, the basic physical problem of particle identification reduces to making a decision about the most likely choice of a ring radius from two or more possible values. If the accuracy of those prior measurements is not sufficient enough, one can use them as initial values for the robust technique described in the previous section, m order to improve the ring parameter accuracy. In the conventional formulation. when all hits on Cherenkov rings are determined and ring parameters are known, the corresponding likelihood ratio test (LRT) methods are well developed (see, e.g., ref. [3]). More popular are methods with counting the number of pads in fiducial areas calculated for alternative rings, as it was done in ref. [21]. Below we call such methods as PCFA (pad counting in fiducial area). We propose a similar approach, but with calculating t h e s u m o f t h e a m p l i t u d e s of pads occurred in a fiducial area around a tested circle. The detailed arguments proving this test called ASFA (amplitude summing in fiducial area) are given in [22]. An easy explanation is as follows: the sum of amplitudes in a fiducial area should be much bigger for the circle corresponding to the true hypothesis than for the circle corresponding to the alternative hypothesis, therefore the ratio of the first and the second sums must be greater than a chosen constant, A comparative study of both tests PCFA and ASFA fulfilled in [22] shows that the PID-error probability of ASFA method is l(/r what is three times better than for the PCFA method for the same level of misidentification probability (2.3%). Czech. J. Phys. 49/$2 (1999)
158
Novel Approach in RICH Data ttandling 5.
Conclusion
On the basis of the mathematical model of the RICH data the main stages of data processing are considered. On the first stage of measurement clustering in order to obtain photon hits two new methods of hit accuracy improving are described: numerical approximation method and the wavelet analysis method. The central part of the paper is focused on the approach of direct processing of raw RICH data, which is especially actual for the high granularity RICH detectors like CERES and COMPASS. Two problems related to direct processing of raw RICH data are considered depending on the prior knowledge of the Cherenkov ring centers and particle momenta: robust fitting of one or simultaneously two overlapping rings when the center knowledge is not sufficient; hypothesis testing of one particular radius from the known set of Cherenkov ring radii in the case when this knowledge is available. A robust technique was developed that uses the raw data and takes into account such an essential information as the signal amplitudes measured for each pad. A comparative study on simulated data shows the satisfactory efficiency of the proposed methods. The author would like to thank P.Gl~sel and I.Tserruya from the CERES Collaboration for fruitful discussions and Elena Kolganova for her help in preparing this paper. References [1] [2] [3] [4] [5] [6] [7]
W.Adam et al., Nucl. Instr. and Meth., A338 (1994) 284. R.Baur et al., Nucl. Instr. and Meth., A343 (1994) 87. U. Mfiller et al., ibidem,, A343 (1994) 279-283. J. Bs et al., ibidem, 273-275. A. Di Mauro et al., ibidem, 284-287. J.Seguinot, T.Ypsilantis, ibidem, 1-29. L. Muresan et al., Deformable Templates for Circle Recognition, JINR Rapid Comm.
1181]-9r. [8] G. Agakichiev, et al., Nucl. Instrum. Meth., A371 (1996) 243-247. [9] t-I.Agakishiev et. al., Effective pulse resolution algorithms ]or detector8 with gaussian-like signal shape. Commun. of JINR E10-97-105, Dubna, 1997. [10] I. Daubechies Editor, Proc.of Symp. in Appl. Math, 47 1993. [ll] G.Ososkov, A.Shitov, Commun. of JINR Pll-97-347, Dubna, 1997 (in Russian). it2] M.V.Altaisky, E.A.Kolganova, V.E.Kovalenko, G.A.Ososkov, Proc. of the International conference of SPIE -The International Society for Optical Engineering, Proc. SPIE'96, 2847 (1996) 656-664. Czech..J. Phys. 49/$2 (1999)
159
G, A. Ososkov : Novel Approach i~J I~I("H I)a~a ttandling [13] G.Ososkov. A.Shitov. (;avssiarz Wat.elet F~'aturt s and Yhcir Applications for .4naly.sis of Discretized Si9nals, Proc. of .XITCP'g8 Uonf. I)ubna 1998 (to be publ. at
Comp. Phys. Comm.) [14] A. Linka et al.. Proc. COMPSTAT'98. Bristol. UI{. 1998. 243. [15] N. Chernov, E. Nolganova, G. Ososkov. Commun. of JINR E10-95-468, 1995. [16] B. Lasiuk, D. Lyons. G. Ososkov. T. Ullrich, Proc. of CHEP'98, Chicago, 1998. (in print ). !17] G. Ososkov et al.. Proc. 6th Joint EPS-APS Conf. on Phys. Comp., PC'94, Lugano, 1994. 361-364. ~18] N. Chernov. E. Kolganova, G. Ososkov, Proc. 8th Joint EPS-APS Conf. on Phys. Comp.. PC'96, Krakov, 1996, 230-233. [19] P. Huber. Robust statistics, J. WiUey & Sons, New York, 1981. [20] F. Mosteller, W. Tukey. Data anal!Isis and regression: a second course in statistics, Addison - Wesley, 1977. [21] G. Kunde et al. S T A R R I C H Proposal, BNL. 1998. [22] E.Kolganova, G. Ososkov, This Proceedings, p. 169.
:[~0
Czech. J. Phys. 49/$2 (1999)