J Med Syst (2012) 36:79–91 DOI 10.1007/s10916-010-9448-5
ORIGINAL PAPER
An Improved Medical Decision Support System to Identify the Breast Cancer Using Mammogram

Muthusamy Suganthi · Muthusamy Madheswaran
Received: 7 November 2009 / Accepted: 11 February 2010 / Published online: 10 March 2010
© Springer Science+Business Media, LLC 2010
Abstract An improved computer aided clinical decision support system, developed to classify breast tumors and identify the stage of the cancer using a neural network, is presented in this paper. Texture and shape features have been extracted, and the optimal feature set has been obtained using a multiobjective genetic algorithm (MOGA). A multilayer back propagation neural network with Ant Colony Optimization and Particle Swarm Optimization has been used for classification. The accuracy of the proposed system has been verified, and an accuracy of 99.5% can be achieved. The proposed system can provide valuable information to physicians in clinical pathology.

Keywords Mammogram · Image denoising and enhancement · Feature extraction · Back propagation network · Breast tumor · Stages
M. Suganthi (*) · M. Madheswaran (*)
Centre for Advanced Research, Electronics and Communication Engineering, Muthayammal Engineering College, Rasipuram 637408, Tamilnadu, India
e-mail: [email protected]
e-mail: [email protected]

Introduction

Breast cancer has been a leading cause of death among women in the recent past. Digital mammography has been found to be one of the reliable imaging techniques for early detection of tumors and their characteristics, and many researchers have focused on the development of Computer Aided Diagnosis (CAD) systems that provide valuable information to radiologists. Early detection of breast cancer can play an important role in reducing the associated morbidity and mortality rates [1, 2]. Sheng-Chih Yang et al. [3] described a computer classification system having a
probabilistic neural network (PNN) coupled with entropy thresholding techniques for mass extraction. Tulio C. S. S. et al. [4] have developed a system using single and multilayer neural networks. A CAD system for distinguishing malignant from benign masses has been suggested by Rangaraj M. Rangayyan et al. [5]; it has been reported that an Artificial Neural Network (ANN) can distinguish malignant tumors from benign masses. Later, Mohamed A. Alolfe et al. [6] developed a CAD system to detect abnormalities in digital mammograms using automatic segmentation, feature extraction and classification techniques. A CAD system developed by Karen Drukker et al. [7] demonstrated quantitative techniques to assess features such as area, homogeneity and microcalcification of breast density from digitized mammograms using image processing and data mining concepts. Guodong Zhang and Hong Zhao [8] created a CAD system for detection and classification of microcalcifications (MCCs) or suspicious areas in digital mammograms that included a digitizing module, detection module, feature extraction module, neural network module and classification module. Mohiy Hadhoud et al. [9] have also developed a computer-based system for the classification of breast tumors using a hybrid algorithm of gray-level thresholding and dynamic programming. Giger, M. et al. [10] investigated the possibility of creating a CAD system for detecting clustered microcalcifications from mammograms, in which the potential microcalcifications were extracted using global thresholding based on the gray-level histogram of the fully filtered image. Huo Zhimin et al. [11] presented an automated method for differentiating malignant from benign masses; in their work, the extracted features were related to the margin and density of each mass in the neighborhoods of the computer-identified mass regions. Leonardo de Oliveira Martins et al. [12] used computational tools to aid detection and diagnosis of breast masses. This tool has been used as a second
reader for medical image analysis. In light of the above, a computer aided decision support system for classifying breast tumors and identifying the stage of the cancer has been developed and is presented in this paper.

Implementation of proposed system

The functional block diagram of the proposed medical decision support system for classifying breast tumors in mammograms as malignant or benign is shown in Fig. 1. The mammogram is obtained and processed using the techniques described below; the resulting features are expected to provide valuable information for analyzing the nature of the mammogram and for further decision making in clinical pathology.

Data acquisition and preprocessing

The mammograms can be acquired with dedicated mammographic systems and digitized with a laser film scanner [Lumisys DIS-1000]. In the development of an automated
Fig. 1 Flow graph of the proposed computer aided decision support system: image acquisition, digitization, image quality check (acceptable/not acceptable), image denoising and enhancement, segmentation, feature extraction (first order statistical, spatial gray level, surrounding region dependence, gray level run length and shape features), optimal feature subset selection using a multiobjective genetic algorithm, neural network classification with optimization into benign or malignant, and, from the segmented output, estimation of tumor size and stage of cancer
mammographic classification system, the analysis of tumor detection depends on the regions of interest, which are usually of low contrast and noisy. Hence image denoising and enhancement may be required to preserve image quality, highlight image features and suppress noise. Non-linear filters such as median filters were used earlier for enhancing the tumor area in the image by Yoshida, H. et al. [13], who investigated the effect of the Region of Support (ROS) on the detection and enhancement of microcalcifications. The median filter, however, introduces artifacts into the transformed image: in the presence of noise, and depending on the shape of the transform window and the noise level, it can generate streaks and amorphous blocks. To overcome these limitations, the shock filter has been used for preprocessing in the present work. This filter removes noisy fluctuations and also enhances edges, which carry semantically useful information. The shape, boundary and intensity variation of abnormal regions are found to be the most important cues in breast tumor detection; a meaningful multi-scale description is therefore expected to improve feature extraction and enhance the discriminability of malignant from benign tumors [14, 15].

Consider a continuous image f: R² → R. A class of filtered images {I(x, y, t) | t ≥ 0} of f(x, y) may be created by evolving f under the process

I_t = sign(ΔI) |∇I|   (1)

I(x, y, 0) = f(x, y)   (2)

where the subscript t denotes the partial derivative with respect to t, ΔI = I_xx + I_yy is the Laplacian used as the edge detector, and ∇I = (I_x, I_y)ᵀ is the (spatial) gradient of I. The initial condition in Eq. 2 ensures that the process starts at time t = 0 with the original image f(x, y). Eq. 1 can be rewritten as

I_t = sign(I_ηη) |∇I|   (3)

where η is the direction of the gradient. However, this process is extremely sensitive to noise and thus has little practical use. To improve robustness, the edge detector I_ηη is convolved with a Gaussian kernel G_σ (a low-pass filter), and the shock filter becomes

I_t = sign(G_σ * I_ηη) |∇I|   (4)

The objective of the AM (Alvarez-Mazorra) shock filter is to denoise and enhance the image simultaneously, lowering noise while enhancing edges, by adding a diffusion term to the shock filter [16]:

I_t = sign(G_σ * I_ηη) |∇I| + c I_εε   (5)

where c is a positive constant and ε is the direction perpendicular to the gradient ∇I. The effect can be seen in Fig. 2.

Segmentation

Segmentation distinguishes one or more regions of interest (ROIs) in the preprocessed image. Its principal goal is to partition an image into homogeneous regions (spatially connected groups of pixels, called classes or subsets) with respect to one or more characteristics or features, such that the union of any two neighboring regions yields heterogeneous features. Segmentation techniques fall into two main categories, edge-based and region-based, and region-based segmentation has been found to perform well [10]. Region growing, the watershed algorithm and thresholding are the most commonly used region-based techniques; in this paper a region-based thresholding method is used for segmentation. The threshold is obtained from the image histogram or from local statistics such as the mean value, the standard deviation and the local gradient. Tumors appear as bright regions in the mammogram, which are to be separated from the structured background. To identify the bright objects highlighted by the filter, the thresholding technique has been applied after
Fig. 2 Output of the preprocessed image: (a) original image; (b) output of the AM shock filter after denoising and edge enhancement
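The shock-filter evolution of Eq. 5 can be sketched with explicit time stepping as follows. This is a minimal sketch under stated assumptions: I_ηη is approximated by the Laplacian (as in Eq. 1) smoothed by G_σ, the diffusion term c·I_εε is approximated by c times the Laplacian, and the minus sign on the shock term follows the usual Osher-Rudin sharpening convention; step sizes and iteration counts are illustrative.

```python
import numpy as np
from scipy.ndimage import gaussian_filter, laplace

def am_shock_filter(img, n_iter=10, dt=0.1, sigma=1.0, c=0.1):
    """Explicit-Euler sketch of the AM shock filter of Eq. 5.

    Assumptions: I_eta_eta ~ Laplacian smoothed by G_sigma (Eq. 4),
    c*I_eps_eps ~ c*Laplacian, Osher-Rudin sharpening sign convention.
    """
    I = img.astype(float).copy()
    for _ in range(n_iter):
        gy, gx = np.gradient(I)                    # spatial gradient of I
        grad_mag = np.hypot(gx, gy)                # |grad I|
        edge = gaussian_filter(laplace(I), sigma)  # G_sigma * Lap(I)
        # shock term sharpens edges; diffusion term suppresses noise
        I += dt * (-np.sign(edge) * grad_mag + c * laplace(I))
    return I
```

Applied to a blurred step edge, the iteration steepens the edge while the diffusion term keeps the flat regions smooth.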
preprocessing. For a given image, binarization is performed on the pixel intensity values as

bin(x, y) = 1 if I(x, y) ≥ T; 0 if I(x, y) < T   (6)

where bin(x, y) is the resulting binary image and T is the threshold value. The threshold can be estimated from the mean pixel intensity M and the standard deviation σ of each region of interest as

T = M + ασ + M/k   (7)

where α and k are constants. The output of the segmentation is shown in Fig. 3.

Extraction of multiple features and optimal feature selection

The third stage of the proposed clinical support system focuses on the extraction of features and the selection of optimal features for classification. Texture and shape features are extracted from the segmented Region of Interest (ROI), and the optimal features are selected using a genetic algorithm.

Texture feature extraction

In the present work, first order statistical features, spatial gray level dependent features, surrounding region dependent features, gray level run length features and gray level difference features have been considered for diagnosis [17, 18].

First order statistical features (FOSF) These features capture statistical properties of the image intensity distribution. In this study the features Mean, Dispersion, Variance,
Average Energy, Skewness, Kurtosis, Median and Mode are considered [19].

Spatial gray level dependent features (SGLDF) These features are estimated from the second order joint density function p(i, j | d, θ) for θ = 0°, 45°, 90° and 135° [20, 21]. The function p(i, j | d, θ) is the probability that two pixels located with intersample distance d along direction θ have gray levels i and j. The estimated joint conditional probability density functions are defined as

p(i, j | d, 0°) = #{((k, l), (m, n)) ∈ (Lx × Ly) × (Lx × Ly) : k − m = 0, |l − n| = d, S(k, l) = i, S(m, n) = j} / N(d, 0°)   (8)

p(i, j | d, 45°) = #{((k, l), (m, n)) ∈ (Lx × Ly) × (Lx × Ly) : (k − m = d, l − n = −d) or (k − m = −d, l − n = d), S(k, l) = i, S(m, n) = j} / N(d, 45°)   (9)

p(i, j | d, 90°) = #{((k, l), (m, n)) ∈ (Lx × Ly) × (Lx × Ly) : |k − m| = d, l − n = 0, S(k, l) = i, S(m, n) = j} / N(d, 90°)   (10)

p(i, j | d, 135°) = #{((k, l), (m, n)) ∈ (Lx × Ly) × (Lx × Ly) : (k − m = d, l − n = d) or (k − m = −d, l − n = −d), S(k, l) = i, S(m, n) = j} / N(d, 135°)   (11)
where # denotes the number of elements in the set, S(k, l) is the image intensity at the point (k, l), N(d, θ) stands for the total number of pixel pairs within the image that have intersample distance d along direction θ, and Lx and Ly are the lengths of the ROI in the x and y directions. The features Angular Second Moment, Contrast, Correlation, Variance, Inverse Difference Moment, Sum Average, Sum Entropy, Entropy, Difference Variance, Difference Entropy, Information of Correlation-I, Information of Correlation-II and Maximum Correlation Coefficient can be extracted from Eqs. 8-11.

Surrounding region dependent features (SRDF) [21] The surrounding region dependent features are obtained from the second order histogram of the surrounding regions. An ROI image is transformed into a surrounding region-dependence matrix, which can be defined as
Fig. 3 Segmented output
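The binarization of Eqs. 6 and 7, whose result is shown in Fig. 3, can be sketched as follows. The constants α and k are unspecified in the paper; the values below are illustrative assumptions, and the reading T = M + ασ + M/k of Eq. 7 is a reconstruction.

```python
import numpy as np

def segment_roi(roi, alpha=0.5, k=10.0):
    """Sketch of the threshold rule of Eqs. 6-7.

    alpha and k are unspecified constants in the paper; the defaults
    here are illustrative assumptions only.
    """
    M, sigma = roi.mean(), roi.std()
    T = M + alpha * sigma + M / k        # Eq. 7 (reconstructed reading)
    return (roi >= T).astype(np.uint8)   # Eq. 6: 1 where I(x,y) >= T
```

On a mammogram ROI this keeps the bright tumor pixels and suppresses the structured background.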
M(q) = [α(i, j)], 0 ≤ i, j ≤ n   (12)
where M(q) is the surrounding region dependence features, q is a chosen threshold value and α(i, j) is number of pixels with gray level i,j and is given by aði; jÞ ¼ #fðx; yÞjcR1 ðx; yÞ ¼ i and cR2 ðx; yÞ ¼ j; ðx; yÞ 2 Lx Ly
ð13Þ
The inner count cR1(x, y) and the outer count cR2(x, y) on the current pixel (x,y) are defined as, cR1 ðx; yÞ ¼ #fðk; l Þjðk; l Þ 2 R1 and ½S ðx; yÞ S ðk; l Þ qg
ð14Þ
cR2 ðx; yÞ ¼ #fðk; l Þjðk; l Þ 2 R2 and ½S ðx; yÞ S ðk; l Þ qg
ð15Þ
where S(x, y) is the image intensity of the current pixel (x, y). The choice of the threshold q affects the classification performance. From the characteristics of the element distribution in the surrounding region-dependence matrix, the textural features Horizontal-Weighted Sum, Vertical-Weighted Sum, Diagonal-Weighted Sum and Grid-Weighted Sum can be estimated.

Gray level run length features (GLRLF) [20] These features are estimated from the number of gray-level runs of various lengths. The length of a run is the total number of pixel points in the run, and the gray-level run-length matrix is estimated using

R(θ) = [g(i, j | θ)], 0 ≤ i ≤ Ng, 0 ≤ j ≤ Rmax   (16)
where Ng is the maximum gray-level and Rmax is the maximum run length which is equal to max {Lx, Ly}. The element g(i, j|θ) specifies the estimated number of times that a given picture contains a run length j for a gray level i in the direction of the angle θ. The textural features such as Short-run emphasis, Long-run emphasis, Gray-level nonuniformity, Run-length nonuniformity and Run percentage can be measured from R(θ).
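The run-length matrix R(θ) of Eq. 16 can be sketched for θ = 0° (horizontal runs) as follows; the other directions follow the same counting along the corresponding pixel lines.

```python
import numpy as np

def run_length_matrix(img, levels):
    """Sketch of the gray-level run-length matrix of Eq. 16 for theta = 0:
    g(i, j) counts maximal horizontal runs of gray level i with length j."""
    rmax = max(img.shape)                       # Rmax = max{Lx, Ly}
    R = np.zeros((levels, rmax + 1), dtype=int)
    for row in img:
        run_val, run_len = row[0], 1
        for v in row[1:]:
            if v == run_val:
                run_len += 1                    # extend the current run
            else:
                R[run_val, run_len] += 1        # close the run
                run_val, run_len = v, 1
        R[run_val, run_len] += 1                # close the last run in the row
    return R
```

Short-run emphasis, long-run emphasis and the other GLRLF measures are then weighted sums over the entries of R.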
Gray level difference features (GLDF) [20] The gray level difference features are estimated from the occurrence of pixel pairs that have a given absolute difference in gray level and are separated by a specific displacement δ. For a displacement vector δ = (Δx, Δy), let S_δ(x, y) = |S(x, y) − S(x + Δx, y + Δy)|. Then D(i | δ) can be estimated using the probability density function

D(i | δ) = P[S_δ(x, y) = i]   (17)

The features Contrast, Angular Second Moment, Entropy, Mean and Inverse Difference Moment can be measured from D(i | δ).

Shape features [22] The shape of the tumor provides valuable features for classifying it as malignant or benign. The shape features include geometric parameters such as area, perimeter, circularity, radial distance mean and standard deviation, area ratio, orientation, eccentricity and moment invariants, which can be calculated from the segmented ROI [23]. In this work only nine important shape features are considered for classification. The radial distance d(i) is given by

d(i) = √((x_i − X_0)² + (y_i − Y_0)²), i = 1, 2, ..., N   (18)

where (X_0, Y_0) are the coordinates of the centroid, x_i and y_i are the coordinates of the boundary pixel at the ith location and N is the number of boundary pixels in the extracted region. The tumor circularity C is defined as

C = P² / A   (19)

where P is the perimeter and A is the area of the tumor; the perimeter is measured by counting the pixels on the border of the mass, and the area by counting the pixels inside the border. The mean radial distance is

d_avg = (1/N) Σ_{i=1..N} d(i)   (20)

and the standard deviation of the radial distance is

σ_d = √((1/N) Σ_{i=1..N} (d(i) − d_avg)²)   (21)

Fig. 4 Back propagation network
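The radial distance statistics and the circularity measure can be sketched as follows; `boundary`, `centroid`, `perimeter` and `area` are assumed to come from the segmentation step.

```python
import numpy as np

def shape_features(boundary, centroid, perimeter, area):
    """Radial distances, their mean and standard deviation, and
    circularity C = P^2 / A, following the shape-feature definitions.

    boundary: (N, 2) array of (x, y) boundary pixel coordinates.
    centroid: (X0, Y0) of the segmented region.
    """
    x0, y0 = centroid
    d = np.hypot(boundary[:, 0] - x0, boundary[:, 1] - y0)  # radial distances
    circularity = perimeter ** 2 / area                     # C = P^2 / A
    return d.mean(), d.std(), circularity
```

For a perfect circle the radial-distance standard deviation vanishes and the circularity equals 4π, which is why irregular (malignant-like) boundaries yield larger values of both.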
Table 1 Performance of various filtering techniques

                                          SNR (dB)                  MSE
S.No  Filtering Technique                 Gaussian   Gaussian       Gaussian   Gaussian
                                          0.2σ       0.4σ           0.2σ       0.4σ
1.    Original image                      17.275     16.695         789.659    811.012
2.    Median Filtration                   18.458     17.396         773.126    799.563
3.    Osher-Rudin Filtration              19.376     18.940         780.782    810.176
4.    Alvarez-Mazorra (AM) Filtration     23.584     22.120         665.381    680.054
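The SNR and MSE figures of merit used in Table 1 can be computed as follows. The paper does not state its exact SNR definition, so the form below (signal power over MSE, in dB) is an assumption.

```python
import numpy as np

def snr_mse(reference, filtered):
    """MSE between reference and filtered images, and SNR in dB.

    Assumes SNR = 10*log10(mean(reference^2) / MSE); the paper does not
    give its exact definition, so this is an illustrative choice.
    """
    ref = reference.astype(float)
    err = ref - filtered.astype(float)
    mse = np.mean(err ** 2)
    snr_db = 10.0 * np.log10(np.mean(ref ** 2) / mse)
    return snr_db, mse
```

Higher SNR and lower MSE against the clean reference indicate better filtering, which is the ranking pattern Table 1 reports for the AM filter.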
The area ratio parameter is defined as

A_r = (1 / (d_avg · N)) Σ_{i=1..N} (d(i) − d_avg)   (22)

where the terms (d(i) − d_avg) are set to 0 for all d(i) < d_avg. The roughness is calculated for each boundary segment using

R(j) = Σ_{i=j..L+j} |d(i) − d(i+1)|, j = 1, 2, ..., N/L   (23)

and the roughness index of the ROI as

R = (L/N) Σ_{j=1..N/L} R(j)   (24)

where R(j) is the roughness index of the jth segment, L is the number of boundary points in a segment and N is the total number of boundary points. The eccentricity characterizes the elongation of an ROI and can be estimated as follows. A symmetric matrix A is defined by

A_11 = Σ_{i=1..N} (x_i − X_0)², A_22 = Σ_{i=1..N} (y_i − Y_0)²   (25)

A_12 = A_21 = Σ_{i=1..N} (x_i − X_0)(y_i − Y_0)   (26)

With λ_1 and λ_2 the eigenvalues of the matrix A of the ROI,

S_1 = √(λ_1²), S_2 = √(λ_2²)   (27)

and the eccentricity is given by

eccentricity = S_1 / S_2   (28)

Optimal feature selection using multiobjective genetic algorithm

The number of texture and shape features extracted is large, so optimization of the feature set becomes necessary. This can be achieved using the multiobjective genetic algorithm (MOGA) technique [24, 25]. The outcome of the optimization is expected to provide the optimal set of features to be used as input to the classifier; the algorithm is expected to minimize both the number of redundant features and the error rate. Minimization of M objectives can be stated as

Minimize f(x) = [f_i(x)], i = 1, 2, ..., M   (29)
Fig. 5 a Original image (Malignant sample). b Filtered image. c Boundary selection of filtered image. d Segmented image
Fig. 6 a Original image (Benign sample). b Filtered image. c Boundary selection of filtered image. d Segmented image
subject to

g_j(x) ≥ 0, j = 1, 2, ..., J   (30)

h_k(x) = 0, k = 1, 2, ..., K   (31)
where f_i(x) is the ith objective function, g_j(x) is the jth inequality constraint and h_k(x) is the kth equality constraint, such that f(x) is the optimized set. The multiple feature sets can be considered a population, in which each individual represents a candidate solution to the feature subset selection problem. If m is the total number of features available for representing the patterns to be classified, an individual is a binary vector of dimension m: a bit value of 1 means that the corresponding feature is selected, 0 that it is not. A fitness value is associated with each chromosome; in a minimization problem a lower fitness value means the chromosome (solution) is better suited to the problem, while a higher value indicates a less optimized chromosome. The weighting method suggested by Bandyopadhyay et al. [20] and Kaushik Roy et al. [24] can be used to generate the optimal set; it aggregates the objectives into a single parameterized objective through a linear combination

Obj(y) = Σ_{i=1..n} w_i Obj_i(y)   (32)

where the w_i denote the weights, normalized to Σ w_i = 1 without loss of generality. The optimum value can be found for each individual feature set and compared with the other feature sets in the same group to select the set for classification; in the same way an optimal feature set can be selected for every group.

Classification of tumor

The back propagation learning algorithm [26, 27] is widely used for multilayer feed-forward networks. The multilayer back propagation neural (MBPN) network shown in Fig. 4 can be considered for classifying the tumor into benign and malignant using the optimal features obtained above. Each layer has its own
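The binary feature-mask encoding and the weighted aggregation of objectives (Eq. 32) can be sketched as a simple genetic algorithm. This is a single-objective sketch only (the paper uses MOGA [24, 25]); `error_rate_fn` is a hypothetical stand-in for the classifier error on a feature subset, and the weights and GA parameters are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def fitness(mask, error_rate_fn, w1=0.7, w2=0.3):
    """Weighted linear combination of two objectives (Eq. 32 style):
    classifier error rate and the fraction of selected features."""
    if mask.sum() == 0:
        return 1.0                                   # empty subset is useless
    return w1 * error_rate_fn(mask) + w2 * mask.sum() / mask.size

def select_features(error_rate_fn, m, pop=20, gens=30):
    """Elitist GA over binary masks: 1 = feature selected, 0 = dropped."""
    P = rng.integers(0, 2, size=(pop, m))
    for _ in range(gens):
        f = np.array([fitness(ind, error_rate_fn) for ind in P])
        P = P[np.argsort(f)]                         # best (lowest) first
        for i in range(pop // 2, pop):               # replace the worst half
            a = P[rng.integers(pop // 2)]            # parents from top half
            b = P[rng.integers(pop // 2)]
            cut = rng.integers(1, m)
            child = np.concatenate([a[:cut], b[cut:]])   # one-point crossover
            flip = rng.random(m) < 0.05                  # bit-flip mutation
            child[flip] ^= 1
            P[i] = child
    f = np.array([fitness(ind, error_rate_fn) for ind in P])
    return P[np.argmin(f)]
```

Because the top half of the population is never overwritten, the best subset found so far is always retained.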
Table 2 Quantitative validation of texture and shape features extracted from the segmented Region of Interest (ROI); values estimated using Student's t-test

S.No  Feature                                    Benign        Malignant

First Order Statistical Features (FOSF)
1     Mean                                       0.0734346     0.0999353
2     Dispersion                                 0.0538117     0.0635457
3     Variance                                   -0.0300008    -0.0375544
4     Average Energy                             0.00057006    -0.2191211
5     Skewness                                   -0.0580034    0.0006808
6     Kurtosis                                   0.0448537     -0.0710028
7     Median                                     0.0449847     0.0516214
8     Mode                                       -0.0558371    -0.1120594

Spatial Gray Level Dependent Features (SGLDF)
9     Angular Second Moment                      -0.3007779    -0.1121958
10    Contrast                                   0.0791432     -0.0171463
11    Correlation                                0.0321140     0.1121957
12    Variance                                   -0.0911457    -0.6155358
13    Inverse Difference Moment                  -0.0695627    -0.1296119
14    Sum Average                                0.3689451     0.2099611
15    Sum Entropy                                -0.2099688    -0.3211622
16    Entropy                                    0.0897148     0.0484182
17    Difference Variance                        0.3652147     0.0581911
18    Difference Entropy                         0.0965201     0.0954231
19    Information of Correlation-I               -0.698879     -0.6478913
20    Information of Correlation-II              0.9987184     0.3145877
21    Maximum Correlation Coefficient            0.32140198    0.1154679

Surrounding Region Dependent Features (SRDF)
22    Horizontal-Weighted Sum                    0.3698750     0.6214578
23    Vertical-Weighted Sum                      0.5489754     0.5478969
24    Diagonal-Weighted Sum                      0.3985647     0.8794562
25    Grid-Weighted Sum                          0.6987546     0.9995645

Gray Level Run Length Features (GLRLF)
26    Short-run emphasis                         -0.3215745    -0.4214567
27    Long-run emphasis                          0.1247964     0.2154789
28    Gray-level nonuniformity                   0.2136547     0.6211459
29    Run-length nonuniformity                   0.8954761     0.3116548
30    Run percentage                             -0.1158746    -0.7894561

Gray Level Difference Features (GLDF)
31    Contrast                                   -0.8765489    -0.2136540
32    Angular second moment                      0.2541698     0.0251306
33    Entropy                                    -0.3214567    -0.9464201
34    Mean                                       0.0014725     0.0023231
35    Inverse difference moment                  0.2136498     0.1654251

Shape features
36    Area                                       0.1147202     0.1123118
37    Perimeter                                  0.1123659     0.3698712
38    Circularity                                0.0112011     0.2100122
39    Radial distance mean and std deviation     0.1101213     0.3659812
40    Area ratio                                 0.3210126     0.5897764
41    Orientation                                -0.0012136    0.6521012
42    Eccentricity                               -0.6321981    0.8741200
43    Moment Invariants                          0.2200113     0.0320123
Table 3 Performance comparison of single and multiple feature subset selection

Feature set                            Sensitivity (%)  Specificity (%)  Overall Accuracy
Shape features                         89               67               96.2%
Texture features                       90               87               96.9%
Multiple features (Shape + Texture)    96               94               99.5%
number of neurons: the number of neurons in the input layer equals the number of input features, the hidden layer has the same number of neurons as the input layer, and the output layer has a single neuron. The outputs at the hidden and output layers are calculated using sigmoid functions.

MBPN-ant colony optimization with particle swarm optimization classifier

Let w_ih be the weights between the input and hidden layers, w_ho the weights between the hidden and output layers, Sig1 the sigmoid function used to calculate the output at the hidden layer, and Sig2 the sigmoid function used at the output layer. The output at the hidden layer is calculated as

Sig1 = 1 / (1 + e^(−λ·x_HL))   (33)

where λ is the scaling factor (assigned the value 1) and x_HL = Σ_i w_ih·k_i, i = 1, 2, ..., n, with n the number of input neurons. The output at the output layer is calculated as

Sig2 = 1 / (1 + e^(−λ·x_OL))   (34)

where x_OL = Σ_i w_ho·s_i, i = 1, 2, ..., n, with n the number of hidden neurons and s_i the output of the hidden layer.

Table 4 Details of the multilayer back propagation network parameters

S.No  Design parameter                     Value
1     Learning Rate (η)                    0.2
2     Momentum (β)                         0.001
3     Threshold value (T)                  MBPN output <0.5 benign; >0.5 malignant
4     Activation                           LOGSIG at the hidden and output layers
5     Number of hidden layers              1
6     Number of hidden units               30
7     Input neurons                        12
8     Output neuron                        1
9     Maximum Mean Square Error (δmax)     0.001
10    Number of iterations                 50000

Fig. 7 ROC analysis for the tumor classification system (true positive fraction vs. false positive fraction for classification with only shape features, with only texture features, with the MBPN, and with the MBPN using ACO and PSO)

MBPN training and testing

In the training phase, P images are selected from the database and their texture and shape features are extracted, optimized and fed to the MBPN; the weights are updated until the MBPN produces an output of <0.5 for benign and >0.5 for malignant cases. Once the training phase is completed, the remaining images in the data set can be tested with the updated weights. In online mode, the selected features of each individual training image are fed to the MBPN, the output is estimated, and the error between the actual and target outputs is calculated; if the error is greater than the tolerance value the weights are updated, and the process is repeated until the error falls below the tolerance. In batch mode, P training images are considered: the features of each image are fed to the MBPN in turn and the error values are calculated, from which the Mean Square Error (MSE) over the whole training set is computed. If the MSE
is greater than the tolerance value, the weights are updated and the procedure is repeated; otherwise the training phase is complete. In the proposed method, batch mode has been used for training the MBPN, and the weights are updated using Ant Colony Optimization (ACO) combined with Particle Swarm Optimization (PSO) techniques [28].
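The forward pass of Eqs. 33 and 34 can be sketched as follows. This is a plain NumPy sketch of the network structure only; in the paper the weights are updated by ACO with PSO rather than by gradient descent, so no training step is shown here.

```python
import numpy as np

def sigmoid(x, lam=1.0):
    """Sigmoid of Eqs. 33-34 with scaling factor lambda (set to 1)."""
    return 1.0 / (1.0 + np.exp(-lam * x))

def mbpn_forward(x, w_ih, w_ho):
    """MBPN forward pass: hidden outputs s_i (Eq. 33) and the single
    output neuron (Eq. 34). x is the optimal feature vector."""
    s = sigmoid(w_ih @ x)     # x_HL = sum_i w_ih * k_i per hidden unit
    y = sigmoid(w_ho @ s)     # x_OL = sum_i w_ho * s_i
    return y, s
```

Following Table 4's threshold rule, an output y < 0.5 would be read as benign and y > 0.5 as malignant.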
Pseudocode for the ACO-based PSO is given below:

initialize ACO_Administrator
for candidate topology i = 1 ... N
    create PSO_Teacher(i)
    for MBPN_Student j = 1 ... M
        initialize MBPN_Student(j)
    end for
end for
while solution not found
    compute ant movements
    ants allocate training iterations
    for PSO_Teacher i = 1 ... N
        while iterations < allocation
            for MBPN_Student j = 1 ... M
                test MBPN_Student(j)
            end for
            for MBPN_Student j = 1 ... M
                update weights of MBPN_Student(j)
            end for
        end while
    end for
    update pheromone concentrations
end while
return global best
Estimation of tumor growth

The various stages of cancer can be differentiated by the growth of the tumor size in terms of its number of cells (mass) per period [29]. The tumor stages range from stage 0 to stage 4.

Stage 0: Non-invasive breast cancer: there is no evidence of cancer cells or non-cancerous abnormal cells breaking out of the part of the breast in which they started, or invading neighboring normal tissue.

Stage 1: The tumor is at its initial growth, where the cell multiplication rate is slower. The tumor measures less than 2 cm (1 in), or the lymph nodes in the armpit are affected, or both; there are no signs that the cancer has spread further.

Stage 2: The tumor measures between 2 and 6 cm (1-2 in), or the lymph nodes in the armpit are affected, or both. The growth of the tumor mass is more vigorous than in the former stage and cell multiplication progresses rapidly; this is considered the warning stage.

Stage 3: The tumor is larger than 6 cm (2 in) and may be attached to surrounding structures such as the muscle or skin. This stage is very invasive and the growth rate of the tumor is very rapid.
Stage 4: The tumor is of any size, the lymph nodes are usually affected, and the cancer has spread to other parts of the body. This is the mortal stage: the growth of the tumor appears to have reached saturation, but its activity is the most aggressive.

Table 5 Performance of the classifier

                 No. of data for     No. of correctly classified data (MBPN)       Percentage of correct classification
Classes          training/testing    Without opt.  With opt.  With opt. + ACO      Without opt.  With opt.  With opt. + ACO
Benign           90/85               78            82         84                   91.74         96.4       99
Malignant        90/85               77            81         83                   90.05         95.29      99
Average                                                                            93.55         95.845     99

(opt. = optimal feature subset selection; ACO = Ant Colony Optimization)
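The size-to-stage mapping described above can be sketched as follows; the thresholds combine the stage descriptions with the cut-offs quoted in the Results section (below 2 cm, 2-6 cm, 6-8 cm, above 8 cm) and are illustrative only.

```python
def stage_from_size(size_cm):
    """Maps the radial-distance-based tumor size (in cm) to the stage
    labels used in the staging discussion; boundary handling at exactly
    2, 6 and 8 cm is an assumption."""
    if size_cm < 2.0:
        return 1   # initial stage
    if size_cm <= 6.0:
        return 2   # caution (warning) stage
    if size_cm <= 8.0:
        return 3   # dangerous stage
    return 4       # emergency stage
```

For the 4.5 cm sample discussed in the Results, this mapping yields stage 2, the caution stage.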
Results and discussion

Images from the Digital Database for Screening Mammography (DDSM) have been considered for developing the clinical decision support system. The proposed system has been simulated in MATLAB, in which the features were also extracted for further analysis. The sample image has been preprocessed using the AM shock filter; the performance of the filter is compared with other filters in Table 1, from which it is seen that the AM shock filter performs well. The sample malignant image is shown in Fig. 5(a), and the outputs of the filter, edge detection and segmentation are shown in Fig. 5(b), (c) and (d) respectively. Similarly, the sample benign image and the respective outputs after preprocessing and segmentation are shown in Fig. 6. In both cases the threshold T has been calculated using Eq. 7, taking the mean (M) and the standard deviation (σ) as 0.991 and 0.59 respectively.

The texture and shape features were estimated and fed to the MOGA for optimizing the feature sets. In the proposed system, forty-three features were extracted, from which an optimal feature set of 15 features was obtained using the MOGA. The quantitative measures of the 43 derived features for the sample images are provided in Table 2.

Medical expert validation
The resulting decision on the category is validated against the expert, and based on the interpretation of the expert the actual efficiency is estimated. The performance of the proposed method has been evaluated in terms of sensitivity and specificity. Sensitivity (True Positive Fraction) is the probability that a diagnostic test is positive, given that the person has the disease; specificity (True Negative Fraction) is the probability that a diagnostic test is negative, given that the person does not have the disease [26]. Overall accuracy is the probability that the diagnostic decision is correct. The three indices are defined as

Sensitivity = TP / (TP + FN)   (35)

Specificity = TN / (TN + FP)   (36)
Table 6 Performance evaluation of the classifier by means of the area under the receiver operating characteristic (ROC) curve (Az), the corresponding standard error (SE) and the execution time

Classification category                                          Az     SE     Time (ms)
Classification using only shape features                         0.85   0.10   681.6
Classification using only texture features                       0.89   0.09   304.1
Classification using MBPN                                        0.91   0.03   8.75
Classification using MBPN with Ant Colony Optimization
and Particle Swarm Optimization                                  0.99   0.02   0.225
Fig. 8 Growth of tumor through various stages
Accuracy = (TP + TN) / (TP + TN + FP + FN)   (37)
where TP, TN, FP and FN denote true positives, true negatives, false positives and false negatives respectively. The performance has been estimated for the shape features, the texture features, and the multiple features obtained by combining both types. As seen in Table 3, the overall accuracy is 99.5% for the multiple feature set. The selected optimal features have been given as input to the multilayer back propagation neural network for classification, with the network parameters given in Table 4; the proposed system gives better sensitivity, specificity and overall accuracy. The Receiver Operating Characteristic (ROC) is a plot of sensitivity against (1 − specificity), and the area under the ROC curve (Az) is an important parameter for determining the overall classification accuracy of the proposed system. Figure 7 compares the ROC curves of the various classification methods: Az is highest when the image is preprocessed and the classifier uses the optimal feature subsets with Ant Colony Optimization together with the Particle Swarm Optimization weight-update method. The performance of the proposed classification system after training and testing is given in Table 5, and the execution time has been found to decrease when the optimal feature subset is used. From Table 6 it can be observed that classification using MBPN with Ant Colony Optimization and Particle Swarm Optimization has the largest area under the curve (0.99), whereas the other methods give smaller values; hence the proposed method provides higher accuracy than the other methods. Once the system identifies the tumor as malignant, the algorithm returns to the segmented output.
Then, using the radial distance mean, the size of the tumor is calculated, which is used to identify the stage. From Fig. 8 it is seen that the various stages are determined from the size of the tumor. If the size of the tumor is less than 2 cm, it is considered the initial stage (stage 1). A size between 2 and 6 cm is considered caution (stage 2). A size between 6 and 8 cm is considered dangerous (stage 3). If the size of the tumor increases above 8 cm, it is treated as an emergency (stage 4). The size of the tumor in Fig. 2 is 4.5 cm, so it is identified as stage 2 cancer and treated as the caution stage.
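The size-to-stage mapping described above can be sketched as a simple threshold function. The function name is illustrative, and the handling of the exact boundary values (2, 6, and 8 cm) is an assumption, since the text does not specify which stage a boundary size belongs to:

```python
def tumor_stage(size_cm):
    """Map tumor size in cm (e.g. computed from the radial distance
    mean of the segmented boundary) to the stage labels in the text.
    Boundary handling (<= vs <) at 2, 6, and 8 cm is an assumption."""
    if size_cm < 2:
        return 1   # initial stage
    elif size_cm <= 6:
        return 2   # caution
    elif size_cm <= 8:
        return 3   # dangerous
    else:
        return 4   # emergency

print(tumor_stage(4.5))  # 4.5 cm falls in the 2-6 cm band -> stage 2
```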
Conclusion A computer aided diagnosis system to identify the stages of cancer using a neural network based algorithm has been developed and validated with various samples. It is concluded from the analysis that the use of multiple features and the selection of an optimal subset enhance the classification of the tumor as benign or malignant. Further, it is seen that the stage of the breast cancer can also be predicted using the proposed algorithm.
References 1. Cheng, H. D., Shi, X. J., Min, R., Hu, L. M., Cai, X. P., and Du, H. N., Approaches for automated detection and classification of masses in mammograms. Pattern Recogn. 39:646–668, 2006. 2. Rangayyan, R. M., Xu, J., Elnaqa, I., and Yang, Y., Jr., Computer-aided detection and diagnosis of breast cancer with mammography: recent advances. IEEE Trans. Inf. Technol. Biomed. 13(2):236–251, 2009. 3. Yang, S.-C., Wang, C.-M., Chung, Y.-N., Hsu, G.-C., Lee, S.-K., Chung, P.-C., and Chang, C., A computer-aided system for mass detection and classification in digitized mammograms. Biomed. Eng. Appl. Basis Commun. 17:215–228, 2005. 4. Tulio, C. S. S. A., and Rangayyan, R. M., Classification of breast masses in mammograms using neural networks with shape, edge sharpness, and texture features. J. Electron. Imaging 15(1), 2006. 5. Rangayyan, R. M., da Silva, L. A., and Del Moral Hernandez, E., Classification of breast masses using a committee machine of artificial neural networks. J. Electron. Imaging 17(1), 2008. 6. Alolfe, M. A., Youssef, A.-B. M., Kadah, Y. M., and Mohamed, A. S., Development of a computer aided classification system for cancer detection from digital mammograms. 25th National Radio Science Conference, Faculty of Engineering, Tanta Univ., Egypt, 2008. 7. Drukker, K., Sennett, C. A., and Giger, M. L., Automated method for improving system performance of computer-aided diagnosis in breast ultrasound. IEEE Trans. Med. Imaging 28(1):122–128, 2009. 8. Zhang, G., and Zhao, H., A CAD system in mammography using ANN. International Conference on Biomedical Engineering and Informatics, 2008. 9. Hadhoud, M., Amin, M., and Dabbour, W., Detection of breast cancer tumor algorithm using mathematical morphology and wavelet analysis. GVIP 05 Conference, CICC, Cairo, Egypt, 2005. 10. Giger, M. L., Nishikawa, R. M., Doi, K., Vyborny, C. J., and Schmidt, R. A., Computer-aided detection of clustered microcalcifications on digital mammograms. Med. Biol. Eng. Comput. 33:174–178, 1995. 11. Huo, Z., Giger, M. L., Vyborny, C. J., Wolverton, D. E., Schmidt, R. A., and Doi, K., Automated computerized classification of malignant and benign mass lesions on digitized mammograms. Acad. Radiol. 15:155–168, 1998. 12. de Oliveira Martins, L., Silva, A. C., de Paiva, A. C., and Gattass, M., Detection of breast masses in mammogram images using growing neural gas algorithm and Ripley's K function. J. Sign. Process. Syst. 55:77–90, 2009. 13. Yoshida, H., Doi, K., Nishikawa, R., Giger, M., and Schmidt, R., An improved computer-assisted diagnostic scheme using wavelet transform for detecting clustered microcalcifications in digital mammograms. Acad. Radiol. 3:621–627, 1996. 14. Bettahar, S., and Stambouli, A. B., Shock filter coupled to curvature diffusion for image denoising and sharpening. Image Vis. Comput. 26:1481–1489, 2008. 15. Suganthi, M., and Madheswaran, M., Mammogram image enhancement and denoising using shock filters. Proceedings of the International Conference on Advanced Communication and Informatics, TPGIT, Vellore, India, 2009. 16. Alvarez, L., and Mazorra, L., Signal and image restoration using shock filters and anisotropic diffusion. SIAM J. Numer. Anal. 31(2):590–605, 1994. 17. Karahaliou, A., Skiadopoulos, S., Boniatis, I., Sakellaropoulos, P., Likaki, E., Panayiotakis, G., and Costaridou, L., Texture analysis of tissue surrounding microcalcifications on mammograms for breast cancer diagnosis. Br. J. Radiol. 80:648–656, 2007. 18. Haralick, R. M., Statistical and structural approaches to texture. Proc. IEEE 67(5):786–804, 1979. 19. Kim, J. K., and Park, H. W., Statistical textural features for detection of microcalcifications in digitized mammograms. IEEE Trans. Med. Imaging 18(3):231–238, 1999. 20. Christodoulou, C. I., Pattichis, C. S., Pantziaris, M., and Nicolaides, A., Texture-based classification of atherosclerotic carotid plaques. IEEE Trans. Med. Imaging 22(7):902–911, 2003. 21. Raja, K. B., Madheswaran, M., and Thyagarajah, K., A hybrid fuzzy-neural system for computer-aided diagnosis of ultrasound kidney images using prominent features. J. Med. Syst. 32:65–83, 2008. 22. Shen, L., Rangayyan, R. M., and Desautels, J. E. L., Application of shape analysis to mammographic calcifications. IEEE Trans. Med. Imaging 13:263–274, 1994. 23. Rangayyan, R. M., Mudigonda, N. R., and Desautels, J. E. L., Boundary modelling and shape analysis methods for classification of mammographic masses. Med. Biol. Eng. Comput. 38:487–496, 2000. 24. Bandyopadhyay, S., Pal, S. K., and Aruna, B., Multiobjective GAs, quantitative indices, and pattern classification. IEEE Trans. Syst. Man Cybern. B 34(5):2088–2099, 2004. 25. Roy, K., and Bhattacharya, P., Optimal feature subset selection and classification for iris recognition. J. Image Video Process., Article ID 743103, 20 pp., doi:10.1155/2008/743103, 2008. 26. Rajendra Acharya, U., Ng, E. Y. K., Chang, Y. H., Yang, J., and Kaw, G. J. L., Computer-based identification of breast cancer using digitized mammograms. J. Med. Syst. 38:499–507, 2008. 27. Moein, S., Monadjemi, S. A., and Moallem, P., A novel fuzzy-neural based medical diagnosis system. Proceedings of World Academy of Science, Engineering and Technology, Vol. 27, ISSN 1307-6884, 2008. 28. Vlachogiannis, J. G., Hatziargyriou, N. D., and Lee, K. Y., Ant colony system-based algorithm for constrained load flow problem. IEEE Trans. Power Syst. 20(3):1249–1250, 2005. 29. Alarcon, T., Byrne, H. M., and Maini, P. K., A multiple scale model for tumor growth. Multiscale Model. Simul. 3(2):440–475, 2005.