© 2006 by the Société Internationale de Chirurgie. Published Online: 9 November 2006
World J Surg (2006) 30: 2266–2269 DOI: 10.1007/s00268-005-0675-8
Evaluation of P-POSSUM and CR-POSSUM Scores in Patients with Colorectal Cancer Undergoing Resection
Mesut Tez, MD,1 Ömer Yoldaş, MD,1 Erdal Gocmen, MD,1 Bahadır Külah, MD,2 Mahmut Koc, MD1
1 Fifth Department of Surgery, Ankara Numune Education and Research Hospital, Ankara 06100, Turkey
2 Third Department of Surgery, Ankara Numune Education and Research Hospital, Ankara 06100, Turkey
Abstract
Introduction: The aim of this study was to evaluate the predictive accuracy of the P-POSSUM and CR-POSSUM models in patients undergoing colorectal resection.
Methods: P-POSSUM and CR-POSSUM predictor equations for mortality were applied retrospectively to 321 patients who had undergone colorectal resection for cancer. P-POSSUM and CR-POSSUM scores were validated by assessing their calibration and discrimination. Calibration was assessed using the Hosmer-Lemeshow goodness-of-fit test and the corresponding calibration curves. The discriminative capability of both models was evaluated using receiver-operating characteristic (ROC) curve analysis.
Results: Overall, 22 deaths were observed. CR-POSSUM predicted 25 deaths (χ² = 12.20, P = 0.13), and P-POSSUM predicted 29 deaths (χ² = 18.85, P = 0.002). ROC curve analysis revealed that CR-POSSUM has reasonable discriminatory power for mortality.
Conclusions: These data suggest that CR-POSSUM may provide a better estimate of the risk of mortality for patients undergoing colorectal resection.
Correspondence to: Mesut Tez, MD, 37.sok 19/6, 06500, Ankara Turkey, e-mail: [email protected]

Worldwide, colorectal cancers rank third in frequency of all cancers in men and second in women. Colorectal cancer is the fourth leading cause of cancer mortality because it has a better prognosis than more common cancers.1 After colorectal carcinoma has been diagnosed, a pretreatment assessment is conducted to determine the most appropriate form of treatment. Whereas surgical management is required for most patients with a diagnosis of colorectal carcinoma, the appropriateness of surgical resection is determined by the extent of disease and co-morbidities.2 Therefore, prognostic scoring is important in these patients.

A simple scoring system known as POSSUM (Physiological and Operative Severity Score for the EnUmeration of Mortality and Morbidity) has been developed to allow risk-adjusted audit of surgical morbidity and mortality in general surgery.3 The POSSUM scoring system is a linear model derived by logistic regression analysis of a general surgical workload. It encompasses a 12-factor physiological score and a 6-factor operative severity score. It has been validated based on more than 15,000 patients. POSSUM has become a well-established scoring system and has been used to predict outcome following vascular, upper gastrointestinal, pulmonary, and colorectal surgeries.4 The POSSUM score was found to overpredict mortality for vascular and general surgical procedures; thus a new version, the Portsmouth Predictor Equation for Mortality (P-POSSUM), was developed.4 However, P-POSSUM has been reported to overpredict mortality and morbidity, particularly in young patients and for elective colorectal procedures. In a simulation study, both POSSUM systems were reported to underpredict outcome in the emergency setting and in the elderly colorectal population.5

Recently, a new scoring system (CR-POSSUM), specific for colorectal surgery, was designed based on the methods used by POSSUM and P-POSSUM.5 In the calculation of predicted mortality based on CR-POSSUM, six physiological factors and four operative severity factors were scored based on the data for the POSSUM system. CR-POSSUM is weighted on advancing age and surgery performed in an emergency setting5 (Table 1). The aim of the study was to evaluate the predictive accuracy of the P-POSSUM and CR-POSSUM models in patients undergoing colorectal resection.

Tez et al.: Evaluation of P-POSSUM and CR-POSSUM

Table 1. Variables used in P-POSSUM and CR-POSSUM models

Physiological score
  Age (years)
  Cardiac signs/chest radiograph
  Respiratory history/chest radiographᵃ
  Systolic blood pressure (mmHg)
  Pulse (beats/min)
  Glasgow Coma Scaleᵃ
  Hemoglobin (g/dl)
  White blood cell count (× 10^12/L)
  Urea (mmol/L)
  Sodium (mmol/L)ᵃ
  Potassium (mmol/L)ᵃ
  Electrocardiogramᵃ
Operative severity score
  Operative severity
  Multiple proceduresᵃ
  Total blood loss (ml)ᵃ
  Peritoneal soiling
  Presence of malignancy
  Mode of surgery

P-POSSUM: Portsmouth predictor equation for mortality.
ᵃ Risk factors not used in the scoring system specific for colorectal surgery (CR-POSSUM).
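The POSSUM-family models described above convert a physiological score (PS) and an operative severity score (OS) into a predicted risk of death R through a logistic equation of the form ln[R/(1 − R)] = b0 + b1 × PS + b2 × OS. The following is a minimal sketch of that conversion. The coefficients are not stated in this article; the values below are those commonly quoted for the published P-POSSUM and CR-POSSUM equations (refs. 5 and 10) and should be treated as assumptions to be checked against the original publications. The function name `predicted_mortality` is hypothetical.

```python
import math

# (intercept, physiological-score weight, operative-score weight).
# ASSUMPTION: commonly quoted coefficients, not given in this article;
# verify against the original publications before any use.
COEFFICIENTS = {
    "P-POSSUM": (-9.065, 0.1692, 0.1550),
    "CR-POSSUM": (-9.167, 0.338, 0.308),
}


def predicted_mortality(model, physiological_score, operative_score):
    """Return the predicted risk of death R (0..1) for one patient.

    Solves ln(R / (1 - R)) = b0 + b1 * PS + b2 * OS for R.
    """
    b0, b1, b2 = COEFFICIENTS[model]
    logit = b0 + b1 * physiological_score + b2 * operative_score
    return 1.0 / (1.0 + math.exp(-logit))


if __name__ == "__main__":
    # Illustrative scores only; not patient data from this study.
    for ps, os_ in [(12, 6), (20, 12), (30, 20)]:
        risk = predicted_mortality("P-POSSUM", ps, os_)
        print(f"PS={ps}, OS={os_}: predicted risk = {risk:.4f}")
```

Summing the individual predicted risks over all patients gives the expected number of deaths for a model, which is the quantity compared with the observed deaths in a validation study such as this one.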
METHODS
A total of 321 patients who underwent elective or emergency colorectal resection for colorectal cancer in Ankara Numune Education and Research Hospital between January 1998 and July 2004 were included (Table 2). Patients who underwent surgery without resection were excluded. Thus, 321 patients were scored retrospectively using the CR-POSSUM and P-POSSUM scoring systems. Mortality was defined as any death that occurred during the 30-day postoperative period. CR-POSSUM and P-POSSUM scores of the individual patients were calculated using the following web page: http://www.riskprediction.org.uk.

The P-POSSUM and CR-POSSUM scores were validated by assessing calibration and discrimination.6 Calibration (the ability of the model to assign the correct probabilities of outcome to individual patients) was assessed using the Hosmer-Lemeshow goodness-of-fit test and the corresponding calibration curves.7 Patients were stratified into risk groups depending on their predicted mortality. Smaller values of the χ² statistic represent better model calibration. Model discrimination (the ability of the model to assign higher probabilities of outcome to patients who actually die than to those who live) was measured by the area under the receiver-operator characteristic (ROC) curve (AUC) to evaluate how well the model distinguished patients who experienced the event (death) from those who did not.8 In general, the AUC is used as an index of model discrimination; it ranges from 0.5 for chance performance to 1.0 for perfect prediction. Values ranging from 0.7 to 0.8 represent reasonable discrimination, and values exceeding 0.8 represent good discrimination.

The following statistical software packages were used: STATA 8.0 for Windows (STATA, College Station, TX, USA) and Statistical Package for Social Sciences 10.0 (SPSS, Chicago, IL, USA). A value of P < 0.05 was considered significant.

Table 2. Demographic distribution of patients

Patient characteristic          No. of patients
Age, mean and range             58 (18–93) y
Sex (M/F)                       180/141
Location
  Right colon                   64
  Transverse colon              8
  Left colon                    41
  Sigmoid colon                 118
  Rectum                        88
  Multiple tumors               7
Dukes' staging
  A                             99
  B                             113
  C                             67
  D                             42
Procedure
  Right hemicolectomy           64
  Transverse colectomy          8
  Left hemicolectomy            41
  Anterior resection            106
  Low anterior resection        95
  Abdominoperineal excision     12

Table 3. Summary of model performance of the CR-POSSUM and P-POSSUM scoring systems according to ROC analysis

Score        AUC
CR-POSSUM    0.675
P-POSSUM     0.611

AUC: area under the curve; P-POSSUM: Portsmouth predictor equation for mortality; CR-POSSUM: scoring system specific for colorectal surgery.
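The calibration assessment described in the Methods can be illustrated with a short sketch of the Hosmer-Lemeshow χ² statistic. This is not the STATA or SPSS procedure used in the study. The function name `hosmer_lemeshow`, the parameter `n_groups`, and the choice of equal-sized risk groups are assumptions made for illustration, since the article does not state how the risk groups were formed.

```python
def hosmer_lemeshow(predicted, died, n_groups=5):
    """Return (chi2, rows) for predicted risks and 0/1 outcomes.

    Patients are sorted by predicted risk and split into n_groups
    groups of (nearly) equal size. For each group the statistic adds
    (O - E)^2 / (N * p * (1 - p)), where O is the observed number of
    deaths, E the sum of predicted risks, N the group size and p the
    mean predicted risk in the group.
    """
    if len(predicted) != len(died):
        raise ValueError("predicted and died must have the same length")
    pairs = sorted(zip(predicted, died))
    n = len(pairs)
    chi2 = 0.0
    rows = []
    for g in range(n_groups):
        group = pairs[g * n // n_groups:(g + 1) * n // n_groups]
        if not group:
            continue  # more groups requested than patients available
        size = len(group)
        expected = sum(p for p, _ in group)
        observed = sum(d for _, d in group)
        mean_p = expected / size
        variance = size * mean_p * (1.0 - mean_p)
        if variance > 0:
            chi2 += (observed - expected) ** 2 / variance
        rows.append((size, observed, expected))
    return chi2, rows


if __name__ == "__main__":
    # Toy data only; not patient data from this study.
    risks = [0.02, 0.03, 0.05, 0.08, 0.10, 0.15, 0.20, 0.30, 0.40, 0.60]
    deaths = [0, 0, 0, 0, 0, 0, 1, 0, 1, 1]
    chi2, rows = hosmer_lemeshow(risks, deaths, n_groups=5)
    print(f"Hosmer-Lemeshow chi2 = {chi2:.2f}")
    for size, obs, exp in rows:
        print(f"n={size}  observed={obs}  expected={exp:.2f}")
```

The per-group observed and expected counts returned in `rows` are the values that would be plotted against each other in a calibration curve.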
RESULTS
The observed postoperative mortality was 6.9% (22/321). ROC curve analysis revealed that CR-POSSUM (AUC 0.775) has reasonable discriminatory power for mortality, whereas P-POSSUM (AUC 0.611) does not (Table 3). The number of deaths predicted by the CR-POSSUM model was 25 compared to 22 observed deaths. There was no significant difference between the CR-POSSUM predicted and observed numbers of deaths (χ² = 14.61, 4 d.f., P = 0.13; Hosmer–Lemeshow χ² statistic) (Table 4). The number of deaths predicted by the P-POSSUM model was 29 compared to 22 observed deaths. The difference between the predicted and observed deaths was significant (χ² = 25.41, 4 d.f., P = 0.002; Hosmer–Lemeshow χ² statistic) (Table 4).
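The AUC used as the index of discrimination can be computed directly from its rank interpretation: it is the probability that a randomly chosen patient who died was assigned a higher predicted risk than a randomly chosen survivor, with ties counted as one half. The sketch below illustrates this under that definition. It is not the routine used by the statistical packages in this study, and the function name `roc_auc` is hypothetical.

```python
def roc_auc(predicted, died):
    """Return the area under the ROC curve for 0/1 outcomes.

    Compares every (death, survivor) pair: a pair scores 1 when the
    patient who died had the higher predicted risk, 0.5 on a tie and
    0 otherwise. The AUC is the mean pair score.
    """
    deaths = [p for p, d in zip(predicted, died) if d]
    survivors = [p for p, d in zip(predicted, died) if not d]
    if not deaths or not survivors:
        raise ValueError("need at least one death and one survivor")
    score = 0.0
    for p_dead in deaths:
        for p_alive in survivors:
            if p_dead > p_alive:
                score += 1.0
            elif p_dead == p_alive:
                score += 0.5
    return score / (len(deaths) * len(survivors))


if __name__ == "__main__":
    # Toy data only; not patient data from this study.
    risks = [0.10, 0.40, 0.35, 0.80]
    outcome = [0, 0, 1, 1]
    print(f"AUC = {roc_auc(risks, outcome):.3f}")
```

The pairwise loop is quadratic in the number of patients, which is acceptable for a series of a few hundred patients; a rank-based formula would be preferable for much larger data sets.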
DISCUSSION
Some scores are ideal for assessing the risk of mortality and, to a lesser extent, morbidity in particular groups of patients who undergo surgery, such as those with cardiovascular and gastrointestinal disease. Others are useful in particular surgical settings, such as intensive care units.9 Examples of these systems include the American Society of Anesthesiologists (ASA) grade, Goldman Cardiac Risk Index, Prognostic Nutritional Index (PNI), the Acute Physiology and Chronic Health Evaluation II (APACHE II), and the POSSUM and P-POSSUM scoring systems. ASA grade and Goldman Cardiac Risk Index provide an accurate assessment only of cardiac risk for patients undergoing noncardiac surgical procedures. The PNI requires performance of delayed-type hypersensitivity tests and skinfold measurements and is useful only when assessing patients who may need perioperative nutritional support. Probably the best known and most widely used scoring system is APACHE II, which is ideal for intensive care but requires 24 hours of observation and weighting tables for individual disease status.9

The POSSUM scoring system was developed from a multivariate discriminant analysis of factors measured in a broad group of general surgical patients. The POSSUM score was found to overpredict death among general surgical patients, especially low-risk patients.3,4 Therefore, P-POSSUM was developed in an attempt to improve accuracy using the same physiological and operative variables.10 However, P-POSSUM has been reported to overpredict mortality and morbidity, particularly in young patients and especially for elective colorectal procedures.5 To cope with this overprediction, Tekkis et al. designed a new scoring system for colorectal surgery and demonstrated that the model is an accurate predictor of operative mortality.5 From the original six factors comprising the POSSUM operative severity score, four were used in the CR-POSSUM model: operative severity, peritoneal soiling, mode of surgery, and clinicopathologic staging. The number of age group categories was expanded to include patients above the age of 80 as a separate risk group5,11 (Table 1). In addition, the "number of procedures" and "total blood loss" were excluded from the model because they were considered operator-dependent factors and indirectly described the structure and process of care.11 Thus, CR-POSSUM was found to have better calibration and discrimination than the existing POSSUM and P-POSSUM scoring systems.5

Experience with the CR-POSSUM scoring system outside UK practice is limited. In our small study group, we found that P-POSSUM overpredicted mortality by a factor of 1.31 (29 predicted vs. 22 observed deaths), and the predicted/observed mortality ratio was 1.13 for CR-POSSUM (25 predicted vs. 22 observed deaths). Similarly, in the United States, Senagore et al. demonstrated overprediction of operative mortality by all three variants of POSSUM, based on a comparison of the observed/expected (O/E) outcome ratios for each system (POSSUM 0.21, P-POSSUM 0.20, CR-POSSUM 0.45). They concluded that the differences between the UK and US health care systems are reflected by the relatively higher percentage of elderly patients, greater frequency of acute presentations, and greater share of higher Dukes' stage patients in the United Kingdom. This may reflect a limitation of applying these systems without modification in another health care system.11
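The degree of overprediction in this series follows directly from the counts reported in the Results (321 patients, 22 observed deaths, 29 deaths predicted by P-POSSUM and 25 by CR-POSSUM). The short sketch below reproduces that arithmetic, giving both the observed/expected (O/E) ratio and its reciprocal, the predicted/observed ratio. The function name `mortality_ratios` is hypothetical.

```python
def mortality_ratios(observed_deaths, predicted_deaths):
    """Return (observed/expected, predicted/observed) mortality ratios.

    An O/E ratio below 1 (predicted/observed above 1) means the model
    predicted more deaths than actually occurred (overprediction).
    """
    if observed_deaths <= 0 or predicted_deaths <= 0:
        raise ValueError("death counts must be positive")
    return (observed_deaths / predicted_deaths,
            predicted_deaths / observed_deaths)


# Counts taken from the Results of this article.
N_PATIENTS = 321
OBSERVED_DEATHS = 22
PREDICTED_DEATHS = {"P-POSSUM": 29, "CR-POSSUM": 25}

if __name__ == "__main__":
    print(f"Observed mortality: {OBSERVED_DEATHS / N_PATIENTS:.1%}")
    for model, predicted in PREDICTED_DEATHS.items():
        o_e, p_o = mortality_ratios(OBSERVED_DEATHS, predicted)
        print(f"{model}: O/E = {o_e:.2f}, predicted/observed = {p_o:.2f}")
```

Reporting both directions of the ratio avoids ambiguity when comparing with other series, because some reports quote O/E and others quote the factor by which a model overpredicts.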
Table 4. Comparison of model performance of P-POSSUM and CR-POSSUM scoring systems

Model performance (validation set)
Study        Discriminationᵃ (%)   Calibrationᵇ        O/E mortality (%)
P-POSSUM     61.1                  44.85, P = 0.002    1.62
CR-POSSUM    67.5                  42.20, P = 0.013    1.50

O/E: observed to expected mortality.
ᵃ Discrimination is measured by the area under the receiver–operator characteristic curve (standard error). Higher values represent better model discrimination.
ᵇ Calibration is measured by the Hosmer–Lemeshow χ² statistic (4 degrees of freedom). Smaller values represent better model calibration.
Finally, patient numbers were small, and this was a retrospective study in a single Turkish tertiary referral hospital. Hence, the results need confirmation by a prospective multicenter evaluation.
REFERENCES
1. Parkin DM, Pisani P, Ferlay J. Estimates of worldwide incidence of eighteen major cancers in 1985. Int J Cancer 1993;54:594.
2. Cancer: Principles and Practice of Oncology, 6th edition. Lippincott Williams & Wilkins, Philadelphia, 2001.
3. Copeland GP, Jones D, Walters M. POSSUM: a scoring system for surgical audit. Br J Surg 1991;78:355–360.
4. Midwinter MJ, Tytherleigh M, Ashley S. Estimation of mortality and morbidity risk in vascular surgery using POSSUM and the Portsmouth predictor equation. Br J Surg 1999;86:471–474.
5. Tekkis PP, Prytherch DR, Kocher HM, et al. Development of a dedicated risk-adjustment scoring system for colorectal surgery (colorectal POSSUM). Br J Surg 2004;91:1174–1182.
6. Lemeshow S, Le Gall JR. Modeling the severity of illness of ICU patients: a systems update. JAMA 1994;272:1049–1055.
7. Lemeshow S, Hosmer DW Jr. A review of goodness of fit statistics for use in the development of logistic regression models. Am J Epidemiol 1982;115:92–106.
8. Hanley JA, McNeil BJ. The meaning and the use of the area under a receiver operating characteristic (ROC) curve. Radiology 1982;143:29–36.
9. Gocmen E, Koc M, Tez M, et al. Evaluation of P-POSSUM and O-POSSUM scores in patients with gastric cancer undergoing resection. Hepatogastroenterology 2004;51:1864–1866.
10. Prytherch DR, Whiteley MS, Higgins B, et al. POSSUM and Portsmouth POSSUM for predicting mortality. Br J Surg 1998;85:1217–1220.
11. Senagore AJ, Warmuth AJ, Delaney CP, et al. POSSUM, p-POSSUM, and Cr-POSSUM: implementation issues in a United States health care system for prediction of outcome for colon cancer resection. Dis Colon Rectum 2004;47:1435–1441.