Int J Adv Manuf Technol (2005) 27: 119–127 DOI 10.1007/s00170-004-2142-3
ORIGINAL ARTICLE
C.L. Huang · T.S. Li · T.K. Peng
A hybrid approach of rough set theory and genetic algorithm for fault diagnosis
Received: 9 July 2003 / Accepted: 13 February 2004 / Published online: 26 January 2005
© Springer-Verlag London Limited 2005

Abstract This paper proposes an integrated intelligent system that builds a fault diagnosis inference model on the combined strengths of rough set theory and genetic algorithms (GAs). Rough set theory is a novel data-mining approach that deals with vagueness and can be used to find hidden patterns in data sets. Based on this approach, minimal condition-variable subsets and induction rules are established and illustrated using an application for motherboard electromagnetic interference (EMI) test fault diagnosis. The system successfully integrates rough set theory, for handling uncertainty, with a robust GA search engine. The results show that the proposed method can reduce the number of conditional attributes used in motherboard EMI fault diagnosis while maintaining acceptable classification accuracy. An average diagnostic accuracy of 80% shows that this hybrid model is a promising approach to EMI diagnostic support systems.

Keywords Data mining · Data reduction · Electromagnetic compatibility · Electromagnetic interference · Genetic algorithm · Rough set theory · Rule generation
C.L. Huang (✉)
Department of Information Management, National Kaohsiung First University of Science and Technology, 2, Juoyue Rd., Nantz District, Kaohsiung 811, Taiwan, R.O.C.
E-mail: [email protected]
Tel.: +886-7-6011000 ext. 4127
Fax: +886-7-6011042

T.S. Li · T.K. Peng
Department of Industrial Engineering and Management, Ming Hsin University of Science and Technology, Hsinchu, Taiwan, R.O.C.

1 Introduction

Due to the global proliferation of electronic devices, it is becoming increasingly important that electromagnetic compatibility (EMC) be maintained in civilian settings. Besides producing potential interference, electrical products are exposed to ambient electromagnetic interference (EMI) from other pre-existing sources. The fast electronic devices in these new smart products generate intense radiated EMI. Most countries now require that imported products comply with the EMI laws of the USA and European Community (EC) standards. Manufacturers are required to design and test their products so that they will not malfunction when subjected to external EMI sources. For example, a 100-MHz computer contains an electronic clock that steps the microprocessor through its program. In this case, the clock frequency falls within the frequency spectrum allocated in the USA for FM radio broadcasting. If PC manufacturers did not take precautions, PCs would interfere with nearby radio receivers. Harmonics or multiples of this frequency could, if not subdued, cause interference to other radio receivers, such as those used by emergency medical personnel, and to television receivers. It is therefore incumbent upon manufacturers of digital electronic devices to guarantee that their products will be compatible with other electronic devices [1–5].

EMI is one of the most important quality characteristics in motherboard assembly. The interaction among all devices complicates the fault situation, so engineers cannot analyze the problem within a limited time span. A non-trivial process for identifying valid, useful and ultimately understandable data patterns is therefore a practical need in motherboard EMI fault diagnosis.

Data mining is an emerging area in computational intelligence that offers new theories, techniques and tools for processing large volumes of data, and it has gained considerable attention among practitioners and researchers. The growing volume of data available in digital form has accelerated this interest. Data mining relates to other areas, including machine learning, cluster analysis, regression analysis and neural networks. Kusiak [6] grouped data-mining theories and techniques into two categories. The first comprises neural network and regression approaches, which create a model based on a training data set. The second comprises machine learning algorithms that generate a number of models in the form of decision rules. The neural network and regression approaches determine features that are common to a training data set, while the decision rules (models) created by machine learning are explicit. Recently, data mining approaches have been successfully applied
to industrial data analysis to derive useful and comprehensive knowledge [7–11]. One such data-mining theory is rough set theory, used for reducing a data set, finding hidden data patterns and generating decision rules. This theory has many important advantages, such as finding minimal sets of data, evaluating data significance, generating minimal sets of decision rules from data, offering a straightforward interpretation of the obtained results and allowing parallel processing by combining several algorithms [12–14].

In this study, we propose an integrated approach that combines rough set theory with GAs [15] to select a group of attributes capable of representing motherboard EMI test evidence. The proposed approach deals with two problems: selecting the minimal subset of attributes while maintaining acceptable classification accuracy, and developing the decision rules for predicting a new type of motherboard. It is usually very difficult to select a group of effective attributes that fully reflects the system's behavior because of the highly non-linear nature of motherboard EMI fault diagnosis. The selected attributes seldom provide adequate knowledge to accurately map the inter-class boundary, so the inter-class boundary is usually "rough". In other words, some cases close to the boundary are practically unclassifiable based on the selected attributes. Motherboard EMI fault diagnosis is therefore actually a rough classification problem in which there are small overlaps between the different classes.

The rest of this paper is organized as follows: Sect. 2 reviews previous rough set theory and genetic algorithm applications. Section 3 introduces rough set theory and genetic algorithms. Section 4 proposes an integrated approach combining rough set theory with a genetic algorithm. In Sect. 5, a motherboard EMI fault diagnosis example is illustrated and the computational results are summarized. Section 6 provides a comparison with a decision tree induction algorithm and the conclusions.
2 Literature review

Rough set theory (RST), introduced by Pawlak [16] in 1982, provides a novel data-mining tool that deals with vagueness and uncertainty in a database or information system. It has since been widely investigated in such areas as machine learning, knowledge acquisition, decision analysis, knowledge discovery and pattern recognition [17–19]. RST can be used in "rough classification" problems to handle an imprecise inter-class boundary. The philosophy of this method is based on the assumption that some information (data, knowledge) can be associated with every object. Objects characterized by the same information are indiscernible in view of the available information. The indiscernibility relation generated in this way is the mathematical basis of RST. The main advantage of RST is that it does not need any a priori knowledge, such as probabilities in statistics, basic probability assignments in the Dempster-Shafer theory of evidence or membership grades in fuzzy set theory.

Rough set theory has been successfully applied to many real-world problems in medicine, pharmacology, engineering, banking, and financial and market analysis. The theory is important for various engineering applications, such as machine diagnosis, material science and process control. Shen et al. [20] used rough set theory to diagnose valve faults in a multi-cylinder diesel engine. Many other application fields, e.g., image processing, pattern recognition and feature selection, have also been extensively explored [21–25].

In the past decades, GAs have received much attention from practitioners and researchers working on machine learning [16]. Typically, a GA-based approach takes advantage of the unique, robust GA search engine to extract useful information or knowledge from the search space. This paper proposes an intelligent system for motherboard EMI fault diagnosis. The proposed method is based on an integrated technique that combines the strengths of rough set theory and GAs. In the following sections the basic notions of rough set theory and GAs are presented.

3 Rough set theory and genetic algorithm

3.1 Rough set theory

Rough set theory (RST) handles uncertain information using two approximations, namely the upper and lower approximations. The approximation space of a rough set is the classification of the domain of interest into disjoint categories [19]. Such a classification refers to the ability to characterize all the classes in the domain. The upper and lower approximations represent classes of indiscernible objects and make it possible to describe concepts that have no sharp boundaries. The basic philosophy behind RST is based on equivalence relations, or indiscernibility, in the classification of objects [26].

3.1.1 Information system
By an information system we understand the 4-tuple S = ⟨U, Q, V, ρ⟩, where U is a finite set of objects, Q is a finite set of attributes, V = ∪q∈Q Vq, where Vq is the domain of attribute q, and ρ : U × Q → V is a total function such that ρ(x, q) ∈ Vq for every q ∈ Q, x ∈ U, called the information function.

Let S = ⟨U, Q, V, ρ⟩ be an information system and let P ⊆ Q and x, y ∈ U. We say that x and y are indiscernible by the set of attributes P in S iff ρ(x, q) = ρ(y, q) for every q ∈ P. Thus every P ⊆ Q generates a binary relation on U, called an indiscernibility relation and denoted by IND(P). Obviously, IND(P) is an equivalence relation for any P. Equivalence classes of IND(P) are called P-elementary sets in S. The family of all equivalence classes of the relation IND(P) on U is denoted by U|IND(P) or, in short, U|P. Des_P(X) denotes the description of the P-elementary set X ∈ U|P in terms of the values of attributes from P, i.e.,

$Des_P(X) = \{(q, v) : \rho(x, q) = v,\ \forall x \in X,\ \forall q \in P\}$  (1)
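To make these notions concrete, the following is a minimal Python sketch, not part of the original paper, that represents a toy information system as nested dictionaries (the information function ρ) and computes the P-elementary sets of IND(P). All object and attribute names are hypothetical.

```python
from collections import defaultdict

def indiscernibility_classes(objects, attributes):
    """Partition objects into the equivalence classes of IND(P).

    objects: dict mapping object id -> {attribute: value} (the function rho);
    attributes: the subset P of Q to discern by.
    """
    classes = defaultdict(set)
    for x, row in objects.items():
        # Objects with identical values on every attribute in P are
        # indiscernible, so they fall into the same P-elementary set.
        key = tuple(row[q] for q in attributes)
        classes[key].add(x)
    return list(classes.values())

# Hypothetical toy information system (names are illustrative only).
U = {
    "m1": {"Chipset": "Via", "Audio": "CMI9738", "Fail": "A"},
    "m2": {"Chipset": "Via", "Audio": "CMI9738", "Fail": "C"},
    "m3": {"Chipset": "SiS", "Audio": "ALC650", "Fail": "A"},
}
print(indiscernibility_classes(U, ["Chipset", "Audio"]))
# e.g. [{'m1', 'm2'}, {'m3'}]: m1 and m2 are indiscernible by these attributes
```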
3.1.2 Approximation of sets

Let P ⊆ Q and Y ⊆ U. The P-lower approximation of Y, denoted by $\underline{P}Y$, and the P-upper approximation of Y, denoted by $\overline{P}Y$, are defined as:

$\underline{P}Y = \bigcup \{X \in U|P : X \subseteq Y\}$  (2)

$\overline{P}Y = \bigcup \{X \in U|P : X \cap Y \neq \emptyset\}$  (3)

The P-boundary (doubtful region) of the set Y is defined as $Bn_P(Y) = \overline{P}Y - \underline{P}Y$. The set $\underline{P}Y$ is the set of all objects from U which can be certainly classified as elements of Y, employing the set of attributes P. The set $\overline{P}Y$ is the set of objects from U which can possibly be classified as elements of Y, using the set of attributes P. The set $Bn_P(Y)$ is the set of objects which cannot be certainly classified to Y using the set of attributes P only. With every set Y ⊆ U, we can associate an accuracy of approximation of the set Y by P in S or, in short, the accuracy of Y, defined as:

$a_P(Y) = \frac{card(\underline{P}Y)}{card(\overline{P}Y)}$  (4)
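A minimal sketch, assuming the P-elementary sets have already been computed (e.g., with indiscernibility_classes above), of how Eqs. 2–4 translate into code; this is illustrative and not part of the original paper.

```python
def approximations(partition, Y):
    """P-lower/P-upper approximations and accuracy of a set Y (Eqs. 2-4).

    partition: the P-elementary sets U|P as a list of sets of object ids;
    Y: a set of object ids.
    """
    lower, upper = set(), set()
    for X in partition:
        if X <= Y:   # X wholly contained in Y: certainly in Y (Eq. 2)
            lower |= X
        if X & Y:    # X intersects Y: possibly in Y (Eq. 3)
            upper |= X
    boundary = upper - lower                              # Bn_P(Y)
    accuracy = len(lower) / len(upper) if upper else 1.0  # Eq. 4
    return lower, upper, boundary, accuracy
```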
3.1.3 Approximation of a partition of U

Let S be an information system, P ⊆ Q, and let Ψ = {Y1, Y2, ..., Yn} be a partition of U. The origin of this partition is independent of the attributes from P; it can follow, for instance, from an expert solving a sorting problem. The subsets Yi, i = 1, ..., n, are the categories of the partition Ψ. By the P-lower and P-upper approximations of Ψ in S, we mean the sets $\underline{P}\Psi = \{\underline{P}Y_1, \underline{P}Y_2, \ldots, \underline{P}Y_n\}$ and $\overline{P}\Psi = \{\overline{P}Y_1, \overline{P}Y_2, \ldots, \overline{P}Y_n\}$, respectively. The coefficient

$\gamma_P(\Psi) = \frac{\sum_{i=1}^{n} card(\underline{P}Y_i)}{card(U)}$  (5)

is called the quality of approximation of the partition Ψ by the set of attributes P or, in short, the quality of sorting. It expresses the ratio of all P-correctly sorted objects to all objects in the system.

3.1.4 Reduction of attributes

We say that the set of attributes R ⊆ Q depends on the set of attributes P ⊆ Q in S (denoted P → R) iff IND(P) ⊆ IND(R). Discovering dependencies between attributes is of primary importance in the rough set approach to knowledge analysis. Another important issue is attribute reduction, performed in such a way that the reduced set of attributes provides the same quality of sorting as the original set. The minimal subset R ⊆ P ⊆ Q such that γ_P(Ψ) = γ_R(Ψ) is called a Ψ-reduct of P (or simply a reduct if there is no ambiguity in the understanding of Ψ) and is denoted by RED_Ψ(P). Note that an information system may have more than one Ψ-reduct. The intersection of all Ψ-reducts is called the Ψ-core of P, i.e., CORE_Ψ(P) = ∩RED_Ψ(P). The core is the collection of the most significant attributes in the system; it can also be empty.

3.1.5 Decision tables

An information system can be seen as a decision table assuming that Q = C ∪ D and C ∩ D = ∅, where C are called condition attributes and D decision attributes. A decision table S = ⟨U, C ∪ D, V, ρ⟩ is deterministic iff C → D; otherwise it is non-deterministic. A deterministic decision table uniquely describes the decisions to be made when certain conditions are satisfied. In the case of a non-deterministic table, decisions are not uniquely determined by the conditions; instead, a subset of decisions is defined which could be taken under the circumstances determined by the conditions.

From the decision table, a set of decision rules can be derived. Let U|IND(C) be the family of all C-elementary sets, called condition classes and denoted by X_i (i = 1, ..., k, where k is the number of classes in U|IND(C)). Let, moreover, U|IND(D) be the family of all D-elementary sets, called decision classes and denoted by Y_j (j = 1, ..., n, where n is the number of classes in U|IND(D)). Des_C(X_i) ⇒ Des_D(Y_j) is called a (C, D)-decision rule. The rules are logical statements ("if ... then ...") relating descriptions of condition and decision classes. The set of decision rules for each decision class Y_j (j = 1, ..., n) is denoted by {r_ij}. More precisely,

{r_ij} = {Des_C(X_i) ⇒ Des_D(Y_j) : X_i ∩ Y_j ≠ ∅, i = 1, ..., k}.

A rule r_ij is deterministic iff X_i ⊆ Y_j, and non-deterministic otherwise.

3.1.6 Decision support using decision rules

Decision rules derived from a decision table can be used for recommendations concerning new objects. Specifically, the classification of a new object can be supported by matching its description to one of the decision rules. The matching may lead to one of four situations [26]:

1. The new object matches exactly one deterministic rule;
2. The new object matches more than one deterministic rule, all suggesting the same decision class;
3. The new object matches one non-deterministic rule, or several rules suggesting different decision classes;
4. The new object does not match any of the rules.

A minimal sketch of this matching logic is given below.
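The following sketch, with hypothetical rules and attribute names, shows how the four situations can be distinguished in code; it is an illustration, not the paper's implementation.

```python
def classify_new_object(rules, obj):
    """Match a new object against decision rules (Sect. 3.1.6).

    rules: list of (conditions, decision) pairs, where conditions is a
    dict {attribute: value}. Returns the set of suggested decision
    classes: an empty set corresponds to situation 4 (no match), a
    singleton to situations 1-2, and a larger set to situation 3.
    """
    hits = [decision for conditions, decision in rules
            if all(obj.get(a) == v for a, v in conditions.items())]
    return set(hits)

# Hypothetical rules and test object, for illustration only.
rules = [({"Chipset": "Via", "Freq-Level": 2}, "A"),
         ({"Chipset": "SiS"}, "C")]
print(classify_new_object(rules, {"Chipset": "Via", "Freq-Level": 2}))  # {'A'}
```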
3.2 Genetic algorithms

Genetic algorithms (GAs), an optimization methodology based on a direct analogy to Darwinian natural selection and genetics in biological systems, are a promising alternative to conventional heuristic methods [12]. GAs work with a set of candidate solutions called a population. Based on the Darwinian principle of "survival of the fittest", a GA obtains the optimal solution after a series of iterative computations (see Fig. 1). This characteristic, together with their stochastic nature, enables GAs to search large spaces randomly and efficiently. The GA, a stochastic search technique, can find solutions for a wide range of applications. To achieve the desired response, GAs generate successive populations of alternative solutions, each represented by a chromosome, i.e., a solution to the problem, until acceptable results are obtained. In this manner, a GA
can quickly yield a successful outcome without examining all possible solutions to the problem. A procedure using a fitness function assesses the performance of each solution. Reproduction, crossover and mutation are the main operators that randomly affect the fitness value. Chromosomes are selected for reproduction by evaluating their fitness values; the fittest chromosomes are saved and copied into the next generation. Crossover, the critical genetic operator that allows new regions of the search space to be explored, is a random mechanism for exchanging genes between two chromosomes. In mutation, genes may occasionally be altered, i.e., a "0" becomes a "1" or vice versa (see Fig. 2). Mutation rates are typically set at an extremely low value, 0.05 or lower, so that the search does not prematurely lose important information.

Fig. 1. The evolutionary algorithm

Fig. 2. Genetic operations

4 Methodology – an integrated approach

The integrated approach is shown in Fig. 3. It comprises three main modules, namely a pre-processor and discretization module, a GA-based reduct generator and reducts selector, and a rule generator and selector. ROSETTA [27], rough set-based software developed by a team at the Norwegian University of Science and Technology, is employed to reduce the input attribute set and conduct the GA optimization operation. The raw data collected from a motherboard manufacturing plant were stored and subsequently forwarded into the system for further analysis.

Fig. 3. An integrated system architecture

The pre-processor and discretization module conducts the following tasks (a minimal sketch follows the list):

1. Collect the input data.
2. Identify the condition and decision variables and their values.
3. Perform missing data analysis and redundancy checks.
4. Discretize and re-organize the new data set, with no superfluous observations, for subsequent use.
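A minimal sketch of tasks 2–4, assuming the raw data arrive as a table loadable with pandas; the column names and file name are hypothetical and not from the paper.

```python
import pandas as pd

def preprocess(raw: pd.DataFrame, condition_cols: list, decision_col: str) -> pd.DataFrame:
    """Tasks 2-4: select variables, handle missing data, drop redundancy."""
    df = raw[condition_cols + [decision_col]]  # task 2: keep chosen variables
    df = df.dropna()                           # task 3: remove incomplete records
    df = df.drop_duplicates()                  # task 4: no superfluous observations
    return df.reset_index(drop=True)

# Hypothetical usage: columns A1..A11 are conditions, D12 the decision.
# table = preprocess(pd.read_csv("emi_tests.csv"),
#                    [f"A{i}" for i in range(1, 12)], "D12")
```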
After the discretization operation, an information table is sent to the integrated system for GA-based reduct generation and selection. At this stage, it is important to choose a good scheme for chromosome representation and to define a reasonable fitness function for the GA. For simplicity, this approach applies the basic binary chromosome representation. The purpose of applying a GA here is to discover those test settings that best characterize the EMI noise intensity. Thus, the fitness value of a chromosome can be described by its probability of correctly classifying the original data. A genetic algorithm for computing minimal hitting sets, as described by Vinterbo and Øhrn [28], is implemented. The algorithm supports both cost information and approximate solutions. The algorithm's fitness function f is defined below, where $\mathcal{S}$ is the set of sets corresponding to the discernibility function and A is the full attribute set. The parameter α defines a weighting between the subset cost and the hitting fraction, relevant in the case of approximate solutions (see the ROSETTA user's manual [27]):

$f(B) = (1-\alpha) \times \frac{cost(A) - cost(B)}{cost(A)} + \alpha \times \min\left(\varepsilon,\ \frac{|\{S \in \mathcal{S} : S \cap B \neq \emptyset\}|}{|\mathcal{S}|}\right)$  (6)
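The following is a direct, hedged transcription of Eq. 6 into Python; the default values of alpha and eps are assumptions (the case study later sets α = 0.4), and unit costs are used when no cost information is supplied.

```python
def fitness(B, A, S, cost, alpha=0.4, eps=1.0):
    """Hitting-set fitness of Eq. 6, a sketch after Vinterbo and Ohrn [28].

    B: candidate attribute subset (a set); A: full attribute set;
    S: list of attribute sets from the discernibility function;
    cost: dict mapping attribute -> cost (defaults to 1.0 per attribute).
    """
    def cost_of(subset):
        return sum(cost.get(a, 1.0) for a in subset)

    # Fraction of sets in S that B "hits" (non-empty intersection).
    hitting_fraction = sum(1 for s in S if s & B) / len(S)
    return ((1 - alpha) * (cost_of(A) - cost_of(B)) / cost_of(A)
            + alpha * min(eps, hitting_fraction))
```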
The subsets B of A found through an evolutionary search driven by this fitness function that are "good enough" hitting sets, i.e., that have a hitting fraction of at least ε, are collected in a "keep list". The size of the keep list, k, can be specified. Approximate solutions are thus controlled through two parameters: ε signifies a minimal value for the hitting fraction, while k denotes the number of extra keep lists used by the algorithm. Each reduct in the returned reduct set has an associated support count, a measure of the "strength" of the reduct, which can be evaluated by the reducts selector. After conducting all fitness function evaluation iterations, the minimal subset of reducts and the rules extracted from the data set with respect to the target object can be examined. This is discussed in detail in the following section using an application relating to motherboard EMI fault diagnosis.

Before using the GA-based search engine, one should initialize all of the necessary parameters for the GA operators, such as the chromosome length, population size, number of generations and the probabilities of crossover and mutation. The GA-based search engine then performs the genetic operations, including reproduction, crossover and mutation, to obtain certain rules and possible rules from the certain training data set and possible training data set, respectively.

The rule generator and selector perform two tasks: rule evaluation and rule selection. They examine all of the rules extracted from the GA-based search engine and employ Boolean algebraic operators, such as intersection and union, to simplify the rules. Redundant rules are removed during the rule selection operation, and related rules are clustered and generalized during simplification. The rule selection process is performed as follows (a sketch of this loop is given after the list):

1. Given a quality measure (see [29]), compute the quality of each rule in the rule set. Dump the qualities to a log file.
2. Let R = ∅ and t = ∞.
3. Lower t so that a certain minimum number of new rules have qualities above t. Add these rules to R.
4. Use R to classify the objects in a given decision table.
5. Dump t, |R| and various performance information to a log file.
6. If R contains all rules, exit. Otherwise, go to step 3.
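A minimal sketch of steps 1–6 above; `quality` (the rule quality measure of [29]) and `classify` (which evaluates the current rule set R against a decision table) are assumed callables, not APIs from the paper.

```python
def select_rules(rules, quality, classify, min_new=5):
    """Incrementally lower the quality threshold t, growing the rule set R."""
    ranked = sorted(rules, key=quality, reverse=True)  # step 1: rank by quality
    R = []                                             # step 2: R starts empty
    i = 0
    while i < len(ranked):
        batch = ranked[i:i + min_new]                  # step 3: next rules ...
        t = quality(batch[-1])                         # ... define the lowered t
        R.extend(batch)
        i += min_new
        performance = classify(R)                      # step 4: classify with R
        print(t, len(R), performance)                  # step 5: log progress
    return R                                           # step 6: all rules added
```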
5 Illustration

5.1 Case study description

The application of high-switching-speed electronics to motherboards means that noisy devices are being placed in a highly sensitive environment in which electromagnetic interference (EMI) can be a problem. Unintentional noise between the ground-reference plane of a daughter card (such as a PS2, USB, VGA or memory module) and the ground-reference plane of a motherboard can cause significant emissions within a system. These emissions are caused directly by the EMI "noise" across the connector. The noise that this investigation intends to attenuate is due primarily to the various high-speed clocks in the motherboard. The
clocks for the different kinds of electrical devices connected to the motherboard make it difficult to find the source of the noise. Some of the noise generated by sources on the motherboard is conducted to the module. With the frequency of the electrical device modules running at, for example, F MHz, the primary emissions to watch are multiples of F. For example, with a 450 MHz microprocessor, the emissions to watch are 450 MHz, 900 MHz, 1350 MHz, and so on. However, as will be evident from the measured data, the other modules radiate noise at other frequencies. These are noises generated somewhere other than the module and conducted to the motherboard. The module antenna structure then radiates this noise quite effectively.

With emissions at different frequencies, two simple solutions are likely to be effective in this situation. The first is a shielding solution, with specific design objectives studied to provide a very low-impedance return path and contain any EMI noise within the module [3]. The other is a routing (layout) solution that works around the design constraints of a motherboard while maintaining low cost and manufacturing impact. This work focuses on the second approach.

It is critical to understand the constraints that define the solution space for the problem. In the case of high-frequency electronic devices, the primary design constraints are the component placement on the motherboard. Another design area that can potentially be affected by the EMI solution is the routing that connects the processor bussing (data/address/control traces) to the other components on the motherboard, such as the chipset and other modules. The best approach is therefore to define the EMI solution as early as possible, prior to beginning the motherboard routing design. This allows the layout engineer to accommodate special requests for space on the motherboard. It is very important to design the solution considering all future module products that the targeted system might utilize. It is typical for a product to have upgrades, where the highest frequency could double between the initial product launch and the final product. It is not feasible to redesign the EMI solution later, as doing so would very likely require motherboard modifications: any EMI solution change might require trace rerouting or component relocation, and either is generally unacceptable. The peripherals are kept to a minimum to reduce the possible EMI source count within the system.

Note that the focus of this investigation was to determine the emission source from the motherboard assembly. In a complex system like a server computer, numerous sources emit at the same frequencies, so it is imperative to identify the dominant sources. For example, with the CPU running at 450 MHz, emissions are possible at 450 MHz, 900 MHz and other harmonics. If a component in the system has a strong emission at 900 MHz, it becomes difficult to isolate this particular emission from the other components. It is therefore important to scan the system first to characterize the non-processor system emission signatures. Running the system at different frequencies can characterize the system emissions. The most significant signal on the board is the clock, and that component must be active. These emissions are due to the processor. Any shifts in the emissions at other frequencies are in part due to the interaction between the processor,
the other components and conducted emissions. The conducted emissions are due to noise external to the processor, conducted to the module and then radiated off the processor.

Most countries now require that imported products comply with the EMI laws of the USA or Germany, or with the limits recommended by CISPR 22 (standard number 22 of the International Special Committee on Radio Interference). Legal EMI requirements focus only on regulating the noise "generated" by a product. Products with frequencies between 30 MHz and 1 GHz must have their EMI dB values measured. This band is further separated into low frequency (30–230 MHz) and high frequency (230 MHz–1 GHz); the output should be below 30 dB for low frequency and between 30 dB and 37 dB for high frequency. As mentioned before, if peaks higher than the dB value set by the regulation exist, the EMI engineer must find effective means of eliminating the EMI emission sources.

According to past experience, EMI noise is associated with the computer peripheral combination, the input signal frequency and the specific layout arrangement. Identifying these interactions and their relationships is a time-consuming and labor-intensive job that cannot be conducted via an experimental design or other indirect tools. One feasible solution is to use a novel, intelligent data-mining tool to explore the hidden knowledge in the historical test database of developed products. Through the proposed approach, one can find systematic diagnostic rules that help the R&D staff deal with EMI noise. The EMI test and fault diagnosis procedures are presented in Fig. 4.
Fig. 4. EMC approval flow chart
5.2 Creating the EMI diagnostic support system
We implemented a step-by-step procedure, shown in Fig. 3, to create the EMI diagnostic support system. The procedure involves data preprocessing, discretization, GA-based reduct generation, a reduct selector, a rule generator and a rule selector. An important aspect of the proposed approach is that the generated decision rules can be implemented in the original diagnostic system with high classification accuracy. From an EMI diagnostic support perspective, we believe that the availability of clustering/classification knowledge, in terms of rule sets, is highly desirable, as it can provide interesting insights into the complex interaction relationships between the attributes of large EMI data sets.

5.3 Data collection

Historical EMI noise data, collected from a well-known motherboard manufacturing company in Taiwan, were selected for analysis. This data set was transformed into different attribute values and classified into distinct classes. Each data vector comprises eleven conditional attributes (A1–A11) and a single decision variable (D12). Table 1 lists the names of the peripherals connected to the motherboard, the component specifications and the frequency data tested using a spectrum analyzer. The data set consists of 415 data vectors, comprising 134 high frequency and 281 low frequency records.

5.4 Data preprocessing

The data were submitted to further analysis: they were reorganized and merged, records with missing data were deleted, and the result was coded into an integrated information table. Based on historical EMI diagnosis experience, the motherboard was separated into three distinct partitions: area A (Printer, PS2 and Com), area B (VGA) and area C (LAN, USB, Audio and Game). As illustrated in Fig. 5, A, B and C are the decision attribute values.
5.5 Discretization

The discretization operation conducted on continuous-valued attributes might cause another inconsistency problem in rule induction, because different continuous values of an attribute in the original data set might be represented by the same discrete value. This problem, however, can be easily overcome using rough set theory, as outlined by Khoo et al. [26]. In this study, only one of the attributes listed in Table 1, the frequency level, is continuous. The continuous frequency values were discretized into 10 well-defined intervals based on past experience of the frequency point distribution; a minimal sketch of such a mapping follows.
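The sketch below shows one way such an interval mapping could be coded. The cut points are hypothetical: the paper derives its intervals from past frequency-point distribution experience but does not list them.

```python
import bisect

# Hypothetical cut points (MHz); the paper's actual 10 intervals are not given.
FREQ_CUTS_MHZ = [60, 90, 130, 180, 230, 350, 500, 650, 850]

def frequency_level(freq_mhz: float) -> int:
    """Map a continuous frequency to a discrete level 1..10."""
    return bisect.bisect_right(FREQ_CUTS_MHZ, freq_mhz) + 1

print(frequency_level(125.0))  # -> 3 with the cut points assumed above
```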
Table 1. Input attributes and decision attributes

Name             Description        Numbers of attribute values           Input/Decision
                                    Full band   Low freq.   High freq.    attribute

Chipset          Brand              3           3           3             Input A1
IO Chipset       I/O chipset        7           7           3             Input A2
Audio            Audio chip brand   3           3           2             Input A3
Memory           Memory spec.       2           2           2             Input A4
PCB-size         Motherboard size   2           2           2             Input A5
CPU-slot         CPU slot spec.     5           5           3             Input A6
USB              USB spec.          2           2           2             Input A7
LAN              LAN chip brand     4           4           4             Input A8
VGA              VGA position       4           4           3             Input A9
Failure-Level    EMI noise level    3           3           3             Input A10
Frequency-Level  Frequency level    8           3           5             Input A11
Failure-Area     EMI failure area   3           3           3             Decision D12 (target)
Fig. 5. Areas A, B and C are the values of the decision variable

5.6 GA-based attribute reduction

A number of data reduction approaches [13, 14] have been widely applied in many fields. Our work takes advantage of the optimization capability of GAs. The fitness function here is designed to determine those peripheral combinations that best reduce EMI noise, so Eq. 6 describes the fitness value of a chromosome. The parameter α, which weights the subset cost against the hitting fraction in the case of approximate solutions, is set to 0.4. The other genetic operating parameters are set as follows:

Population size: 70
Chromosome length: 22
Size of keep list: 256
Selection operator: elitism
Crossover operator: single-point
Crossover probability: 0.3
Mutation probability: 0.05
Fitness parameter: discernibility function
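For illustration, a toy GA loop using the settings above and the fitness() sketch given after Eq. 6 might look as follows. This is an assumption-laden sketch, not ROSETTA's actual implementation, which additionally maintains keep lists and support counts.

```python
import random

def ga_reduct_search(A, S, cost, n_gen=100, pop_size=70, p_x=0.3, p_mut=0.05):
    """Toy GA over binary chromosomes encoding attribute subsets."""
    attrs = sorted(A)

    def decode(bits):
        return {a for a, b in zip(attrs, bits) if b}

    def score(bits):
        return fitness(decode(bits), A, S, cost, alpha=0.4)

    population = [[random.randint(0, 1) for _ in attrs] for _ in range(pop_size)]
    for _ in range(n_gen):
        ranked = sorted(population, key=score, reverse=True)
        next_gen = ranked[:2]                      # elitism: keep the best two
        while len(next_gen) < pop_size:
            p1, p2 = random.sample(ranked[:pop_size // 2], 2)
            child = p1[:]
            if random.random() < p_x:              # single-point crossover
                cut = random.randrange(1, len(attrs))
                child = p1[:cut] + p2[cut:]
            child = [1 - g if random.random() < p_mut else g
                     for g in child]               # bit-flip mutation
            next_gen.append(child)
        population = next_gen
    return decode(max(population, key=score))
```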
The reduct sets for the coded information table are as follows.

Two reduct sets for low frequency:

Reduct 1: {Chipset, Audio, PCB-size, CPU-slot, LAN, VGA, Failure-Level, Frequency-Level}
Reduct 2: {Chipset, Audio, Memory, PCB-size, CPU-slot, LAN, Failure-Level, Frequency-Level}

Three reduct sets for high frequency:

Reduct 1: {Chipset, VGA, Failure-Level, Freq-Level}
Reduct 2: {Chipset, Memory, LAN, Failure-Level, Freq-Level}
Reduct 3: {Chipset, PCB-size, LAN, Failure-Level, Freq-Level}
5.7 Reducts selector

Although all reducts are equivalent from the viewpoint of classification quality of approximation, other criteria can be used to determine the most satisfactory reduct. According to the reduct definition, the coded information table can be reduced to the attributes of a chosen reduct without any loss in the ability to approximate the decision classes. In this case study, after discussion with a field EMI manager, the most satisfactory reducts were found to be:

Minimum set of input attributes for low frequency: {Chipset, Audio, Memory, PCB-size, CPU-slot, LAN, Failure-Level, Freq-Level}

Minimum set of input attributes for high frequency: {Chipset, VGA, Failure-Level, Freq-Level}

5.8 Rule generator

This step involves generating rules that satisfy a user-defined accuracy level. Based on the minimum set of input attributes for low frequency, the proposed approach generates 166 rules. One of these rules is:

Chipset(Via) AND Audio(CMI9738) AND Memory(SDRAM) AND PCB-size(Micro-ATX) AND CPU-slot(Socket478) AND LAN(None) AND Fail-Level(C) AND Freq-Level(2) => Fail-area(A)

The generated rule expresses that a particular combination of peripherals connected to a motherboard (Chipset, Audio, etc.) results in particular EMI frequency and failure levels. Based on this failure mode, the potential failure will occur in area A (Printer, PS2 or Com).

5.9 Rule selector

This step comprises a sequence of operations that filter low-quality rules out of the rule set generated in the previous step. We observed that the rule set usually contains a large number of distinct rules, some of them redundant, which limits the classification capability of the rule set.
Table 2. Accuracy of classification for low frequency

Decision value    A        C        B       Accuracy
A                 72       3        0       96%
C                 23       179      0       88.61%
B                 0        1        3       75%
Accuracy          75.78%   97.81%   100%    90.39%

Table 3. Accuracy of classification for high frequency

Decision value    A        C        B       Accuracy
A                 46       12       0       79.31%
C                 18       42       0       70.0%
B                 4        4        8       50.0%
Accuracy          67.64%   72.41%   100%    71.64%
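As a sketch of how the figures in Tables 2 and 3 can be reproduced, the following computes per-class and overall accuracy from a confusion matrix, assuming rows are actual classes and columns are predictions (the tables do not state the orientation, so this is an assumption).

```python
def classification_accuracies(matrix, labels):
    """Per-class (row) and overall accuracy from a confusion matrix."""
    per_class, correct, total = {}, 0, 0
    for i, label in enumerate(labels):
        row_total = sum(matrix[i])
        per_class[label] = matrix[i][i] / row_total if row_total else 0.0
        correct += matrix[i][i]   # diagonal entries are correct predictions
        total += row_total
    return per_class, correct / total

# Table 3 (high frequency), rows and columns ordered A, C, B:
per_class, overall = classification_accuracies(
    [[46, 12, 0], [18, 42, 0], [4, 4, 8]], ["A", "C", "B"])
print(per_class, overall)  # {'A': 0.7931..., 'C': 0.70, 'B': 0.50} 0.7164...
```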
Decision rules extracted from a decision table can be used for recommendations concerning new cases. This approach employed the rules generated in the previous step to classify the data vectors. The classification accuracies for low and high frequencies are presented in Tables 2 and 3; the average classification accuracies were 90.39% and 71.64%, respectively.

5.10 Creating the diagnostic support system

A knowledge base was created to show the advantages of the proposed rough set-based diagnostic support system. When the EMI engineer tests the various peripheral combinations, the test results are input into the diagnostic support system, which then recommends a potential failure area for the engineer to analyze further. The predictive capability for low and high frequencies is about 90% and 72%, respectively. Because new motherboard types and peripheral combinations will continue to appear, the knowledge base is updated batch by batch. In this way, the knowledge base preserves up-to-date knowledge for the EMI diagnostic support system.
6 Comparison and conclusion

A comparison with the C4.5 decision tree induction method was conducted to test the soundness of the proposed approach. Although the decision tree induction algorithm is substantially different from the GA-based rough set approach, both try to solve the same problem. Like the rough set approach, decision tree induction is a non-parametric method and does not make any assumptions about the data prior to analysis. The two approaches were compared on the same data set, although the numbers of low and high frequency attributes they used differed. Table 4 summarizes the comparison. The classification accuracies for the low and high frequencies were higher than those obtained from the decision tree induction approach.
Table 4. Results compared with the decision tree approach

          Number of attributes     Accuracy    Accuracy    Accuracy
          (attribute reducts)      Low freq.   High freq.  (average)
Hybrid    Low: 8, High: 4          90.4%       71.6%       81%
DT        Low: 11, High: 11        85.4%       68.7%       77%
This result indicates the ability of the hybrid approach to respond efficiently to the classification problem; the proposed approach is therefore a reliable alternative to the decision tree induction approach.

Rough set theory has been widely applied to meaningful knowledge extraction and prediction in the production and manufacturing fields. This paper summarizes a study leading to the establishment of a motherboard EMI fault diagnosis support system. The proposed hybrid system successfully integrates the strength of rough set theory in dealing with imprecise information with a robust search engine based on the GA approach. Using historical data gleaned from a well-known motherboard manufacturer in Taiwan, the proposed hybrid system was able to identify significant attributes and extract decision rules for EMI fault diagnosis. The results show that the hybrid approach is a sound and pragmatic methodology for understanding the complexities of manufacturing data sets, and that it outperformed the decision tree induction algorithm. The extracted rules can be used to identify the fault area or predict the probability of EMI noise. Moreover, we believe that the resulting diagnostic support system provides EMI engineers with a faster method for determining the root cause of EMI noise, reducing both the R&D time span and quality costs.
References

1. Archambeault B, Connor S (2002) Measurements and simulations for ground-to-ground plane noise for SDRAM and DDR RAM daughter cards and motherboards for EMI emissions. Proceedings of the IEEE International Symposium on Electromagnetic Compatibility, Minneapolis, Minnesota, USA, 2002, pp 105–108
2. Morgan D (1994) A handbook for EMC testing and measurement (IEEE Electrical Measurement Series 8). Peter Peregrinus, London
3. Raza I (2000) Containing emissions from a microprocessor module. Proceedings of the IEEE International Symposium on Electromagnetic Compatibility, Washington, DC, USA, 2000, 2:871–876
4. Mills JP (1993) Electromagnetic interference. Prentice-Hall, New York
5. Lankford M, Davis K, Johnson RW, Baginski ME, Hontgas H, Slattery K, Newberry R, Evans J (1997) Investigation of design and performance of multichip modules including electromagnetic compatibility. Proceedings of the 6th International Conference on Multichip Modules, Denver, CO, USA, 1997, pp 187–192
6. Kusiak A (2001) Rough set theory: a data mining tool for semiconductor manufacturing. IEEE Trans Electron Packag Manuf 24(1):44–50
7. Han B, Wu TJ (2001) Data mining in multisensor system base on rough set theory. Proceedings of the American Control Conference, Arlington, VA, USA, 2001, 6:4427–4431
8. Han J, Micheline K (2001) Data mining: concepts and techniques. Kaufmann, San Francisco, CA
9. Gardner R, Bieker J (2000) Solving tough semiconductor manufacturing problems using data mining. IEEE/SEMI Advanced Semiconductor Manufacturing Conference, Boston, MA, USA, 2000, pp 46–55
10. Abidi SSR, Hoe KM (2002) Symbolic exposition of medical data-sets: a data mining workbench to inductively derive data-defining symbolic rules. Proceedings of the 15th IEEE Symposium on Computer-based Medical Systems, Maribor, Slovenia, 2002, pp 123–128
11. Bayardo RJ, Rakesh A, Dimitrios G (1999) Constraint-based rule mining in large, dense databases. Proceedings of the 15th International Conference on Data Engineering, Sydney, Australia, pp 188–197
12. Zhai LY, Khoo LP (2001) A prototype genetic algorithm-enhanced rough set-based rule induction system. Comput Ind 46(1):95–106
13. Ahn BS, Cho SS, Kim CY (2000) The integrated methodology of rough set theory and artificial neural network for business prediction. Expert Syst Appl 18(2):65–74
14. Zhai LY, Khoo LP, Fok SC (2002) Feature extraction using rough set theory and genetic algorithms – an application for the simplification of product quality evaluation. Comput Ind Eng 43(4):661–676
15. Goldberg DE (1989) Genetic algorithms in search, optimization and machine learning. Addison-Wesley, Boston
16. Pawlak Z (1982) Rough sets. Int J Inf Comput Sci 11(5):341–356
17. Pawlak Z (1984) Rough classification. Int J Man Mach Stud 20:469–483
18. Pawlak Z (2002) Rough set approach to knowledge-based decision support. Eur J Oper Res 99(1):48–57
19. Pawlak Z (1999) Rough set theory for intelligent industrial application. Proceedings of the Second International Conference on Intelligent Processing and Manufacturing of Materials, Honolulu, Hawaii, USA, 1:37–44
20. Shen L, Tay FEH, Qu LS, Shen Y (2000) Fault diagnosis using rough sets theory. Comput Ind 43(1):61–72
21. Wu Z (2001) Research on remote sensing image classification using neural network based on rough sets. Proceedings of 2001 International Conferences on Info-tech and Info-net, Beijing, China, Oct 2001, pp 279–284
22. Wang B, Chen S, Lin T (2002) Rough set based knowledge acquiring method in intelligent detecting system for lack of weld. Proceedings of the 4th World Congress on Intelligent Control and Automation, Shanghai, China, 2002, pp 2887–2891
23. Questier F, Arnaut-Rollier I, Walczak B, Massart DL (2002) Application of rough set theory to feature selection for unsupervised clustering. Chemometrics Intell Lab Syst 63(2):155–167
24. Khoo LP, Tor SB, Zhai LY (1999) A rough-set based approach for classification and rule induction. Int J Adv Manuf Technol 15:438–444
25. Slowinski R, Stefanowski J (1994) Rough classification with valued closeness relation. In: Diday E, Lechevallier Y, Schader M, Bertrand P, Burtschy B (eds) New approaches in classification and data analysis. Springer, Berlin Heidelberg New York
26. Walczak B, Massart DL (1999) Rough sets theory. Chemometrics Intell Lab Syst 47(1):1–16
27. Norwegian University of Science and Technology (2001) ROSETTA version 1.4.41. http://www.idi.ntnu.no/~aleks/rosetta/. Cited 5 Dec 2001
28. Vinterbo S, Øhrn A (2000) Minimal approximate hitting sets and rule templates. Int J Approximate Reasoning 25(2):123–143
29. Bruha I (1997) Quality of decision rules: definitions and classification schemes for multiple rules. In: Nakhaeizadeh G, Taylor CC (eds) Machine learning and statistics: the interface. Wiley, New York, pp 107–131