Fresenius' Journal of
Fresenius J Anal Chem (1992) 342:764-768
© Springer-Verlag 1992
Quality in analytical chemistry
Edzard Hartmann
Institute of Biometry, Schering AG, W-1000 Berlin 65, Federal Republic of Germany
Received July 9, 1991
Summary. In order to enhance the quality of analyticochemical statements, it is common practice to optimize the analytical method. Beyond this, it is also necessary to fit the design of the study, including sampling procedures and calibrations, as closely as possible to the aims of the investigation and its consequences. The presentation of results should mention all premises which were not empirically tested. To prevent misinterpretation of the results, their respective field of application should be specified. As regards the characterization of methods, it must be explained whether it is to be valid for a specific analytical series, for a measuring system or for the method as such. Thus, the quality in analytical chemistry is measurable in terms of the scientific nature of the statements; that is, in terms of their degree of objective verifiability and in terms of their deduction by recognized methods of all disciplines involved, including statistics.
Introduction
When quality is the topic in analytical chemistry, it is usually the quality of material or nonmaterial products that is meant. Quality is usually understood, as stipulated by DIN Standard 55350, Part 11, to be the actual property of a product in relation to the required property. However, this paper deals with the quality of analyticochemical statements. Principal importance is often attached to analyticochemical statements within the framework of decision processes. This is certainly the case for decisions related to environmental protection or chemical safety. With increasing frequency, the analyst finds himself in areas of conflict which result from contradictory responsibilities to different reference authorities. The analyst bears a responsibility not only to his client but also to society. The fact that in many practical areas the analyst's clients grant him only the role of performing a passive service function does not absolve him from his responsibility to society. However, in order to bear this responsibility, the analyst not only needs sound specialist knowledge of chemistry; he must also have basic knowledge of scientific theory, decision theory and statistics, to mention only the most important. The study of chemistry at most German universities involves almost no statistics. Whether or not the analyst learns to appreciate the importance and necessity of statistical statements, and whether or not he acquires the right approach to the qualified elaboration of methodology, is left to his own initiative: the autodidactic approach is indeed difficult. As a rule, scientific theorists and statisticians are not familiar with the special features of chemical analysis and, therefore, can offer only limited support. A no-man's-land exists in which rather strange things happen.
Quality of terminology
A complete explanation of the analyst's job leads at once to a confrontation with problems of comprehension resulting from the incompleteness of the analyticochemical vocabulary. For example, in order to describe a sampling procedure, one needs not only the term sample but also a modifying term which specifies the population from which the sample is taken. In mathematical statistics, that is in the world of numbers, this population is clearly defined: one imagines that observation leads to realizations of random variables which are called data. The matter is also clear in biometry: random samples are taken from populations, and the resulting observations or measurements comprise a data set. But in analytical chemistry the situation is not well defined. There is no agreed term to designate the population. DIN Standard 55350, Part 11, does indeed use the terms total output and test lot, but not all materials originate from production processes. The terms material and product used in Fig. 1 are also imprecise because they can designate both a property of a substance and a real population. In the following, the basic chemical population will be designated as a real material.

Fig. 1. Quality of terminology. Mathematical statistics: random variables, realisation, data. Biometrical models: population, sampling, sample. Analytical chemistry: real material or real product, sampling, sample.

Ambiguous uses of the terms value, test value and parameter are further examples of inexact terminology, and not only in the chemical literature. These considerations should not be interpreted as a plea for standardization of all terms. Sometimes it is precisely the vagueness of language which is an important aid to comprehension when the meaning of terms is obvious from the context.

Quality of objectives
Objectives can be inadequate if they have been defined in a one-sided manner. For example, in the detection of pollutants in effluents, analysts very often consider only the type I error without taking into account the type II error.
Fig. 2. Type I and type II error

                      Result of the analysis
True state            No contamination    Contamination
No contamination      +                   Type I error
Contamination         Type II error       +

α = probability of the type I error
β = probability of the type II error
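The two error probabilities in Fig. 2 cannot be reduced independently of each other. The following is only a minimal sketch of this trade-off, under assumptions that are not taken from the paper: a single measurement with normally distributed error of known standard deviation, a one-sided decision threshold, and a hypothetical true signal lying `shift_in_sd` standard deviations above the blank. The function name and all numbers are illustrative.

```python
from statistics import NormalDist


def type_two_error(alpha: float, shift_in_sd: float) -> float:
    """Probability of falsely stating "no contamination" (type II error).

    Assumption: one normally distributed measurement with known standard
    deviation; `shift_in_sd` is the hypothetical true signal above the
    blank, expressed in units of that standard deviation.
    """
    std_normal = NormalDist()
    # Signals above this threshold are reported as "contamination";
    # the threshold is chosen so that a blank exceeds it with probability alpha.
    threshold = std_normal.inv_cdf(1.0 - alpha)
    # A contaminated sample is missed when its signal stays below the threshold.
    return std_normal.cdf(threshold - shift_in_sd)


if __name__ == "__main__":
    for alpha in (0.05, 0.01, 0.001):
        beta = type_two_error(alpha, shift_in_sd=3.0)
        print(f"alpha = {alpha:<6} beta = {beta:.3f}")
```

Under these assumptions, lowering alpha for a fixed true contamination always raises beta, which is the one-sidedness criticized in this section.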
In the case of detection, the type I error consists in falsely asserting that contamination by the pollutant is present. Many scientists think that the lower they fix the probability α of this error, the more exact is their method. But this is a very lopsided point of view. The lower the probability α is fixed, the greater becomes the probability β of the type II error of falsely stating no contamination. Thus, by using a small α, one protects, as it were, the producer of the pollutant rather than the environment. This manner of thinking is also reflected very clearly in the definition of the detection limit according to Kaiser [1]: to be on the safe side, it was even suggested fixing the detection limit at sixfold the standard deviation of blank value determinations, which in the case of normal distribution is equal to an α of 0.002! However, environmental protection today does not allow such one-sided views.

Quality of presentation of results
Like every other scientist, the analyst requires his results to be scientific. The claim that a statement is scientific is based on two requirements:
- the generality of the statement, and
- the deduction of the statement from directly examinable facts by an outstanding method of conclusion.
General validity can be explained most simply by considering both of these words: A general statement not only refers to singular events but, rather, contains variable components and can be extrapolated to other conditions within certain limits; this makes the statements generally valid. Specifically, it must be possible to extrapolate analytical statements for a sample to the real material from which it was taken. The quality of deduction is somewhat more complex. A trivial demand is that the resulting statement should be true. However, it is not possible to make the term truth precise. Instead, scientific statements are required to have objectivity; that is, the statements should be deducible independently of the whim of the individual scientist. It is the verifiability which makes a general statement into a generally valid statement. The scientific nature of statements is also subject to methodological conditions: The first condition is that there exists an unambiguous linguistic description. The second methodological condition is that when deducing scientific statements, rules should be used which are based on an outstanding axiomatic theory. The problematical nature of this requirement is a broad topic which would go beyond the framework of this paper; it principally consists of the fact that there is no methodology at all which, using scientific methods, could be substantiated as being the only correct one. However, it must be borne in mind that the demand for substantiation of the outstanding methodology by means of itself would be circular reasoning. Therefore, Feyerabend's [2] assertion that there is no rational reason for behaving rationally is unreasonable. Nevertheless, at some point every basis of scientific theory becomes a dogma. If one reviews the scientific literature with regard to the verifiability of results, one will quickly ascertain that hardly any scientific results are published today!
A detailed description of the individual procedural steps is indeed taken for granted but, as a rule, verifiability of the results fails because there is no description of whether and, if so, how many multiple measurements of the same sample preparation were performed, how the sample was divided and whether, how and at what time intervals the method used was calibrated. The methods for validation and processing of original data are seldom described; if an automatic apparatus is used, the analyst himself will often not know the nature of these methods because some manufacturers of automatic apparatus do not bother to describe the methods sufficiently. Here compliance with GLP guidelines can only be of limited assistance. Standard operating procedures make the procedural steps verifiable only to the level of detail with which they are described; how fully and completely the steps are described is left to the initiative of the author. If he makes full use of the GLP guidelines, they can enhance the verifiability of the conclusions; but if he does not make full use of the possibilities available, GLP remains an empty shell. The same also applies to the relevant DIN standards DIN ISO 9000 to 9004 concerning quality assurance systems.

Fundamental problems concerning the interpretation of results may also emerge when an objective is too ambitious. An example is the purely empirical determination of absolute substance concentrations. On closer inspection, this objective cannot be upheld because absolute substance concentrations can only be determined on the basis of important premises which are not empirically verifiable. Although this problem is of a general nature, it will be explained using the example of trace analysis: The lower the concentration of the substance to be detected is in the carrying matrix, the more difficult are the processes of detection and quantitative analysis.
The carrying matrix can have a very complex composition and can contain substances which also respond to the analytical method or otherwise interfere with sample preparation; also, in the limit range of low concentrations it is increasingly problematic to completely exclude the presence of the substances of interest in real materials. Based on the idea of the so-called omnipresence of all substances, it is rather to be expected that a true concentration of zero does not occur in real materials. At any rate, both problems imply that the true zero point of the concentration axis cannot be empirically determined and thus absolute substance concentrations cannot be expressed as numbers. Therefore, especially in the trace range, only comparative tests are conceivable in which only differences between substance concentrations instead of absolute substance concentrations can be determined. This is done by first estimating a calibration function which completely represents the analytical method, that is, including the entire process of sample preparation. It is not correct to perform the calibration on the basis of a dilution series of the pure standard reference material. Rather, one must take a blank matrix as the basis which, apart from the substance to be determined, must have the same structure as the carrying matrix of the real analytical material; if need be, the blank matrix must be reconstructed artificially.
Furthermore, of course, there is the premise that the substance concentrations in the standard reference materials used for the calibration are correctly stated. Caution must also be observed in the case of this premise, and not only in the trace range. Thus the analyst is faced here with a fundamental dilemma: he must decide whether he will remain on the safe side and state only differences between substance concentrations, or whether he will "fix the zero by praying".
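The idea of calibrating against a blank matrix and reporting only concentration differences can be sketched as follows. This is an illustration under assumptions not made in the paper: the calibration function is taken to be a straight line fitted by ordinary least squares, and the spiked amounts and signals are hypothetical numbers. The function names `fit_line` and `concentration_difference` are likewise invented for this sketch.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def concentration_difference(signal, intercept, slope):
    """Convert a signal into X = C - c0, the difference to the blank matrix."""
    return (signal - intercept) / slope


# Hypothetical calibration: amounts spiked into the blank matrix (X) and the
# signals (Y) obtained after the complete sample preparation.
spiked = [0.0, 1.0, 2.0, 4.0]
signals = [0.50, 1.52, 2.49, 4.51]

intercept, slope = fit_line(spiked, signals)
x_sample = concentration_difference(3.0, intercept, slope)
print(f"estimated concentration difference X = {x_sample:.2f}")
```

The result is a difference relative to the unknown blank concentration c0; turning it into an absolute concentration requires the additional, untested premise that c0 is negligible for the purpose at hand.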
Fig. 3. Calibration function Y = f(X). Ordinate: measurand Y. Abscissa: concentration C and concentration difference X = C - c0, where c0 is the concentration in the blank matrix.

In Fig. 3 the directly measurable measurand Y is shown on the ordinate. On the abscissa two attributes are shown:
- the concentration C, the origin of which lies somewhere below the concentration c0 in the blank matrix;
- the concentration difference X, the origin of which coincides with c0.
By means of the calibration function, the measured signal which the measuring system yields under the influence of the prepared sample is converted to the difference between the substance concentrations of the analyzed sample and the blank matrix. In order to arrive at absolute substance concentrations, the premise must be set up that the blank matrix contains the substance to be determined in a concentration which is irrelevant for the purpose concerned.

Quality of range of application
Stating the range of application is an important quality characteristic of analyticochemical conclusions, both in the characterization of real materials and in the characterization of methods. As regards the characterization of real materials, the most important question is what is actually understood by this. That is, it must be clarified for what stage of the object hierarchy the analysis is to be valid. Here is an example:

Fig. 4. Object hierarchy: operation procedure; batches 1, ..., i, ..., I; containers 1, ..., j, ..., J; ampuls 1, ..., k, ..., K; samples 1, ..., l, ..., L; measurements 1, ..., r, ..., R.
Ionic X-ray contrast media solutions are manufactured batchwise, filled into ampuls, sterilized by means of dry heat in autoclaves, grouped together and packed in containers. For iodide determination, several samples can be taken from one ampul and prepared. Each prepared sample can be subjected to multiple measurements. The operation procedure is at the top of the object hierarchy. Of course, the batches manufactured in accordance with this operation procedure exhibit variation of the iodide content. Differences exist within the batches between containers and ampuls, within the ampuls between the samples and within the samples between the repeated measurements. If one single determination of one sample from one ampul is carried out, at first the analytical result is only valid for the measurement itself. If the objective was to obtain a conclusion for the ampul, one must check whether several samples must be prepared and subjected to measurement. The required numbers of measurements and samples can be calculated from the degree of precision aimed at, the variance components for preparation and measurement and the pertinent costs. If the conditions are extremely favorable, that is, if precision requirements are low and there is little variation, a single measurement per sample and a single sample might be sufficient to arrive at a conclusion for the complete ampul. Similar considerations will have to be made if it is intended to characterize a container, a batch or, indeed, the operation procedure itself. It is best to conduct pilot experiments in order to estimate the variance components.

Very often false measures of precision are stated in publications. Here is an example: In order to be able to determine the value of a ship's cargo of iron ore, samples must be taken at fixed time intervals when the ship is unloaded. The individual samples are combined to make a gross sample which is then comminuted and homogenized in a multistep process. The purchaser receives one half of this gross sample, the seller the other half. Both of them carry out K independent determinations. If there are no remarkable differences in the mean values for the ore content, the mean value $\bar{y}$, the standard deviation s and the confidence interval for the true ore content $\mu$ are calculated from the 2K analytical results, t being the corresponding quantile of the t distribution:

$$\bar{y} - \frac{t\,s}{\sqrt{2K}} \le \mu \le \bar{y} + \frac{t\,s}{\sqrt{2K}}$$
This confidence interval, which is supposed to include the true ore content in the long run with a probability of 0.95, is taken as the basis for the calculation. Now the question is whether or not the individual samples should be combined to form a single gross sample in order to estimate the standard deviation. Here it must be borne in mind that at least three variance components can be expected. Firstly, the ore content of a sample does not agree with the average ore content of the cargo. Different samples would have yielded different ore contents. We describe the distribution of the ore contents of different samples by means of a normal distribution having the variance $\sigma_P^2$. A second variance component $\sigma_L^2$ describes the variation of the results from different laboratories as regards the contents of the samples, and a third variance component $\sigma_E^2$ describes the variation of the analytical results within the samples and laboratories. Therefore, it would have been better to analyze several random samples individually and to estimate the three variance components by means of a nested analysis of variance.
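How the three variance components might be estimated from individually analyzed samples can be sketched as follows. This is not the author's procedure but a minimal illustration, assuming a balanced, fully nested design (laboratories nested within samples, determinations nested within laboratories) and method-of-moments estimation from the mean squares. The function name and the data are hypothetical.

```python
def nested_variance_components(data):
    """Estimate variance components from a balanced, fully nested design.

    data[i][j][k] is determination k of laboratory j on sample i.
    Requires at least two samples, two laboratories per sample and two
    determinations per laboratory. Returns (var_p, var_l, var_e).
    """
    n_i, n_j, n_k = len(data), len(data[0]), len(data[0][0])

    lab_means = [[sum(cell) / n_k for cell in sample] for sample in data]
    sample_means = [sum(means) / n_j for means in lab_means]
    grand_mean = sum(sample_means) / n_i

    # Sums of squares between samples, between laboratories within samples,
    # and between determinations within laboratories.
    ss_p = n_j * n_k * sum((m - grand_mean) ** 2 for m in sample_means)
    ss_l = n_k * sum(
        (lab_means[i][j] - sample_means[i]) ** 2
        for i in range(n_i) for j in range(n_j)
    )
    ss_e = sum(
        (y - lab_means[i][j]) ** 2
        for i in range(n_i) for j in range(n_j) for y in data[i][j]
    )

    ms_p = ss_p / (n_i - 1)
    ms_l = ss_l / (n_i * (n_j - 1))
    ms_e = ss_e / (n_i * n_j * (n_k - 1))

    # Method-of-moments estimates; negative values are truncated to zero.
    var_e = ms_e
    var_l = max((ms_l - ms_e) / n_k, 0.0)
    var_p = max((ms_p - ms_l) / (n_j * n_k), 0.0)
    return var_p, var_l, var_e


# Hypothetical ore contents: 2 samples x 2 laboratories x 2 determinations.
example = [[[1.0, 3.0], [5.0, 7.0]], [[9.0, 11.0], [13.0, 15.0]]]
var_p, var_l, var_e = nested_variance_components(example)
print(var_p, var_l, var_e)
```

For the hypothetical data the estimates are 28.0, 7.0 and 2.0 for the sample, laboratory and error components. With a single gross sample the first mean square has zero degrees of freedom, so the sample component cannot be estimated at all.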
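$$\mathrm{Var}(\bar{Y}) = \frac{\sigma_P^2}{I} + \frac{\sigma_L^2}{I\,J} + \frac{\sigma_E^2}{I\,J\,K} \qquad (1)$$

Here I is the number of samples, J the number of laboratories and K the number of measurements.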
These estimates are then inserted in the first formula for the variance of the mean value. However, if a single gross sample is prepared, the variance component $\sigma_P^2$ cannot be estimated at all. Thus, strictly speaking, one analyzes only the gross sample instead of the ship's cargo. However, a second error was also made: for example, for I = 1 sample, J = 2 laboratories and K = 3 independent analyses per laboratory and sample, the second formula in Fig. 5 is obtained. Nevertheless, as described, the unsplit overall standard deviation s of all 2K determinations was used. The expected value of the resulting variance of the mean is stated in the third formula. Here $\sigma_P^2$ does not enter at all, and $\sigma_L^2$ enters with only one-fifth of its proper weight. Both errors cause the analytical chemist to assume too great a precision of the result.

Similar considerations must be made as regards the characterization of analytical methods. Firstly, in this connection it is practical to introduce the term analytical series: An analytical series comprises all analyses in the evaluation of which the result of the same calibration is taken as the basis. This term is more practical than the term repeat condition used in DIN 55350, Part 13: "Conditions valid for obtaining test results independent of each other, consisting in the repeated application of the stipulated test method on the identical object by the same observer within short time intervals using the same apparatus at the same place (in the same laboratory)." The repeat condition is defined so narrowly that in some cases it is questionable whether the calibration can be carried out under such conditions at all. As a rule, different analytical series which are carried out on the same measuring system yield different estimates for the parameters of the calibration function, for the residual variance and for the detection limit.
Fig. 6. Levels of analytical parameters: procedure; laboratories 1, 2, 3; measuring systems 1, 2, ..., J; analytical series 1, 2, ..., K.
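Fig. 5. Variance of the mean. For I = 1, J = 2 and K = 3:

$$\mathrm{Var}(\bar{Y}) = \sigma_P^2 + \frac{\sigma_L^2}{2} + \frac{\sigma_E^2}{6} \qquad (2)$$

$$E\!\left(\frac{s^2}{I\,J\,K}\right) = \frac{\sigma_L^2}{10} + \frac{\sigma_E^2}{6} \qquad (3)$$

I = number of samples, J = number of laboratories, K = number of measurements.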
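The coefficients of the second and third formulas in Fig. 5 can be checked with exact fractions. The sketch below assumes a balanced nested design for the general variance of the mean, and it uses the standard result that the expected value of s² from all J·K results of a single gross sample is σ_E² + σ_L²·K(J−1)/(JK−1). The function names are illustrative, and unit variance components are inserted only to expose the coefficients.

```python
from fractions import Fraction


def variance_of_mean(var_p, var_l, var_e, n_samples, n_labs, n_meas):
    """Variance of the overall mean, assuming a balanced nested design."""
    return (
        Fraction(var_p) / n_samples
        + Fraction(var_l) / (n_samples * n_labs)
        + Fraction(var_e) / (n_samples * n_labs * n_meas)
    )


def expected_naive_variance(var_l, var_e, n_labs, n_meas):
    """Expected value of s^2 / (J*K) when s is the overall standard
    deviation of all J*K results obtained from one gross sample."""
    n = n_labs * n_meas
    expected_s2 = Fraction(var_e) + Fraction(n_meas * (n_labs - 1), n - 1) * Fraction(var_l)
    return expected_s2 / n


# Unit variance components, I = 1 sample, J = 2 laboratories, K = 3 analyses.
true_var = variance_of_mean(1, 1, 1, 1, 2, 3)    # 1 + 1/2 + 1/6
naive_var = expected_naive_variance(1, 1, 2, 3)  # 1/10 + 1/6
print(true_var, naive_var)
```

With unit components the correct variance of the mean is 5/3, whereas the naive calculation expects only 4/15. The laboratory component enters with 1/10 instead of 1/2, that is, one-fifth of its proper weight, and the sample component is missing entirely.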
In addition, it must be borne in mind that these indices cannot be simultaneously valid for all levels of the hierarchy, which consists of analytical series, measuring systems, test centers and methods. Therefore, it must be stated what the indices are to be valid for:
- the analytical method per se,
- a population of selected test centers,
- one test center,
- one specific real measuring system, or
- only one specific analytical series.
As distinct from the indices for a single analytical series, the indices crossing series limits take additional variance components into account. Therefore, detection limits crossing series limits are always higher than the respective indices for the individual analytical series. Indices crossing measuring system or indeed test center limits, which can be estimated from ring experiments, also take into account the variance between the different real measuring systems and test centers, and again they are higher.
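Why indices that cross series or laboratory limits come out higher can be illustrated with a small calculation. The sketch rests on assumptions that are not part of the paper: the variance components of the different levels are independent and add up, and the detection limit is taken as a fixed multiple of the total standard deviation divided by the calibration slope. The factor 3, the function name and all numbers are hypothetical.

```python
from math import sqrt


def detection_limit(variance_components, slope, factor=3.0):
    """Detection limit as factor * total standard deviation / slope.

    Assumption: the listed variance components are independent, so the
    total variance is their sum.
    """
    total_sd = sqrt(sum(variance_components))
    return factor * total_sd / slope


# Hypothetical variance components (signal units squared).
within_series = 4.0
between_series = 5.0
between_systems = 16.0
slope = 2.0  # hypothetical signal per concentration unit

single_series = detection_limit([within_series], slope)
across_series = detection_limit([within_series, between_series], slope)
across_systems = detection_limit(
    [within_series, between_series, between_systems], slope
)
print(single_series, across_series, across_systems)
```

Each additional level of the hierarchy contributes a non-negative variance component, so under these assumptions the limit can only stay equal or rise as the range of validity is widened.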
Concluding remarks
If the analyst wants to enhance the quality of his statements, he will of course strive to improve the sensitivity, precision and specificity of his analytical methods. He will also try to recognize systematic interference factors, give due consideration to them and compare his performance with that of other laboratories within the framework of ring experiments. But he should not confine himself to the optimization of analytical methods. It is very important for the analyst to have a detailed knowledge of the objective and the consequences of his analyses. Only then is he able to choose a method for taking and preparing the samples which does justice to the problem under consideration. Thus he should not simply analyze the samples "as delivered to the laboratory".
As regards the presentation of results, the analyst should mention all premises which were not empirically tested or indeed were not verifiable. These include the premises mentioned above concerning the quality of the blank matrices and the standard reference material. To prevent possible misuse of his conclusions, the analyst must also give an exact specification of their field of application. If he characterizes real materials, he must state the type of generalization which can be inferred from the prior knowledge and the sampling model. Furthermore, the analyst should not simply give the standard deviation without details on what variance components it represents. For example, it must be possible to recognize whether the standard deviation characterizes the complete method including the sampling procedure. Finally, as regards the characterization of methods, it must be explained whether the stated indices are to be valid for a specific analytical series, for a real measuring system, for a laboratory or for the method as such.
References
1. Kaiser H (1966) Fresenius Z Anal Chem 216:80-94
2. Feyerabend P (1980) Erkenntnis für freie Menschen. Suhrkamp 1011, neue Folge Bd 11, Frankfurt