Pediatr Nephrol DOI 10.1007/s00467-015-3104-8
REVIEW
Biomarkers and surrogate endpoints in kidney disease
Erum A. Hartung 1,2
Received: 13 January 2015 / Revised: 17 March 2015 / Accepted: 19 March 2015 © IPNA 2015
Abstract Kidney disease and its related comorbidities impose a large public health burden. Despite this, the number of clinical trials in nephrology lags behind many other fields. An important factor contributing to the relatively slow pace of nephrology trials is that existing clinical endpoints have significant limitations. “Hard” endpoints for chronic kidney disease, such as progression to end-stage renal disease, may not be reached for decades. Traditional biomarkers, such as serum creatinine in acute kidney injury, may lack sensitivity and predictive value. Finding new biomarkers to serve as surrogate endpoints is therefore an important priority in kidney disease research and may help to accelerate nephrology clinical trials. In this paper, I first review key concepts related to the selection of clinical trial endpoints and discuss statistical and regulatory considerations related to the evaluation of biomarkers as surrogate endpoints. This is followed by a discussion of the challenges and opportunities in developing novel biomarkers and surrogate endpoints in three major areas of nephrology research: acute kidney injury, chronic kidney disease, and autosomal dominant polycystic kidney disease.
* Erum A. Hartung
[email protected]
1 Division of Nephrology, Children’s Hospital of Philadelphia, 34th and Civic Center Boulevard, Philadelphia, PA 19104, USA
2 Department of Pediatrics, Perelman School of Medicine at the University of Pennsylvania, 415 Curie Blvd, Philadelphia, PA 19104, USA
Keywords Surrogate endpoints · Biomarkers · Acute kidney injury · Chronic kidney disease · End-stage renal disease · Polycystic kidney disease
Introduction
Despite the large public healthcare burden imposed by kidney disease and its related comorbidities, the number of clinical trials in nephrology is relatively low compared to many other fields [1, 2]. An important contributor to the slow pace of nephrology trials is the difficulty posed by existing clinical endpoints. “Hard” endpoints for chronic kidney disease (CKD), such as progression to end-stage renal disease (ESRD), may not be reached for decades. In acute kidney injury (AKI), serum creatinine is a late and often insensitive marker of underlying injury. There is therefore growing interest in defining new biomarkers to serve as surrogate endpoints in kidney disease research.
Surrogate endpoints can offer a number of potential advantages over true clinical endpoints and could expand opportunities for nephrology clinical trials. If a surrogate endpoint can be measured earlier in the disease process, it could allow for shorter trial durations, which could improve patient compliance and cost-effectiveness [3]. In situations where the true clinical endpoint is severe morbidity or death, surrogate endpoints can allow the recruitment of patients with less severe illness, avoid the ethical dilemmas associated with waiting for a devastating clinical outcome [4], and avoid competing risks on the clinical endpoint from comorbid conditions [5].
In this paper, I first review key concepts related to the selection of clinical trial endpoints and discuss statistical and regulatory considerations related to the use of surrogate endpoints. This is followed by a discussion of the challenges and opportunities in developing novel biomarkers and surrogate endpoints in three major areas of nephrology research: AKI, CKD, and autosomal dominant polycystic kidney disease (ADPKD).
Definitions
To frame this discussion, I first review the accepted definitions of clinical endpoints, surrogate endpoints, and biomarkers. The following definitions were proposed by the Biomarkers Definitions Working Group of the National Institutes of Health (NIH) in 2001:
• Clinical endpoint: A characteristic or variable that reflects how a patient feels, functions, or survives [4]. An example of a clinical endpoint in CKD research is the onset of ESRD, defined by initiation of maintenance dialysis or kidney transplantation [6].
• Biomarker: A characteristic that is objectively measured and evaluated as an indicator of normal biological processes, pathogenic processes, or pharmacologic responses to a therapeutic intervention [4]. An example of a biomarker of acute kidney injury is neutrophil gelatinase-associated lipocalin (NGAL), a protein secreted by injured kidney tubule epithelial cells that can be measured in the plasma or urine [7].
• Surrogate endpoint: A biomarker that is intended to substitute for a clinical endpoint. A surrogate endpoint is expected to predict clinical benefit (or harm or lack of benefit or harm) based on epidemiologic, therapeutic, pathophysiologic, or other scientific evidence [4]. An example of an accepted surrogate endpoint for progression of CKD to ESRD is doubling of serum creatinine [8].
Some biomarkers can be considered intermediate endpoints, defined as biomarkers that are intermediate in the causal pathway between an intervention and a clinical endpoint [9]. Decline in glomerular filtration rate (GFR) can be considered an intermediate endpoint because it is on the causal pathway to ESRD [9].
Considerations in the evaluation of biomarkers as surrogate endpoints
It is important to remember that not all biomarkers are intermediate endpoints and that only a subset of biomarkers and intermediate endpoints will qualify as true surrogate endpoints. Before a biomarker can be used as a surrogate endpoint to replace a true clinical endpoint in clinical trials, it must undergo a formal evaluation process. An NIH workshop in 2001 reviewed considerations in the selection of surrogate endpoints in clinical trials [10]. Aside from ensuring biological plausibility of the surrogate, a statistical framework must be in place to evaluate candidate biomarkers from exploratory phases through confirmatory clinical trials. The first part of this section is a review of the ideal biologic characteristics of a surrogate endpoint and of the potential pitfalls in surrogate endpoint selection; the second part is a review of statistical concepts and regulatory issues in the evaluation of biomarkers as surrogate endpoints.
Biological plausibility
The ideal characteristics of a surrogate endpoint were first defined by Prentice in 1989 as follows: a surrogate should correlate with the true clinical outcome and capture the full relationship between the treatment and the true clinical endpoint [5]. In this ideal scenario, the surrogate endpoint lies in the only causative pathway of the disease process, and the effect of any intervention on the true clinical endpoint is captured entirely by the effect on the surrogate endpoint [11]. This scenario is illustrated in Fig. 1a.
In reality, however, the strict criteria proposed by Prentice can be difficult to meet, and many surrogate endpoints can have potential pitfalls. Fleming and DeMets [11] reviewed how surrogate endpoints in clinical trials can be misleading. Importantly, they emphasize that “a correlate does not a surrogate make” [11]. This means that a biomarker can be strongly correlated with clinical outcome, yet it can fail as a surrogate endpoint if it is not in the causal pathway of the disease process. Here, the framework outlined by Fleming and DeMets (Fig. 1b–e) is used to review reasons for failure of surrogate endpoints and provide some examples specific to kidney disease research.
First, if a surrogate endpoint is not in the causal pathway of disease progression, an intervention targeted at this surrogate endpoint may not affect the true clinical outcome (Fig. 1b). For example, CKD in children is often correlated with poor growth. However, poor growth is not part of the causative biologic process that leads to ESRD. Thus, an intervention targeted at improving the surrogate endpoint (growth) will have no effect on the true clinical outcome (ESRD).
Second, a surrogate may represent only one of several potential pathways influencing disease progression (Fig. 1c). Consider the example of using proteinuria as a surrogate endpoint for progression of autosomal dominant polycystic kidney disease (ADPKD) to ESRD. Although increasing proteinuria correlates with worse kidney function in ADPKD [12], interventions to reduce proteinuria [such as angiotensin converting enzyme (ACE) inhibitors] may not delay progression to ESRD due to the multiple other mechanisms causing cyst growth and kidney function decline.
Fig. 1 Characteristics of an ideal surrogate endpoint (a), and reasons for failure of surrogate endpoints (b–e). See text for examples of each scenario in kidney disease research. (Adapted from Fleming and DeMets [11]; used with permission)
Third, an intervention may alter the true clinical outcome in a manner independent of its effect on the surrogate endpoint (Fig. 1d). Consider the example of using hypertension as a surrogate endpoint in CKD caused by glomerulonephritis. An intervention such as steroids may have a beneficial effect on the true clinical outcome (ESRD), but could have a detrimental effect on the surrogate endpoint of hypertension.
Finally, an intervention can have effects on the clinical outcome through mechanisms that are independent of the primary disease process (Fig. 1e). A recent study showed that the lipid-lowering agent pravastatin slowed the increase in total kidney volume (TKV) in ADPKD [13], where TKV was considered to be a surrogate endpoint for decline in kidney function. Let us now consider what could happen if a trial of pravastatin in ADPKD used all-cause mortality as the primary clinical outcome. A decline in kidney function contributes to mortality risk in ADPKD; thus, in theory, the drug’s beneficial renal effects could translate to lower mortality risk. However, it is also known that pravastatin can affect mortality risk through non-renal effects such as lipid lowering, improvement in endothelial dysfunction [14], or lowering of left ventricular mass [15]. Therefore, the effect of the intervention on the clinical outcome can be mediated through multiple mechanisms that may not be fully captured by its effect on the surrogate endpoint.
Statistical considerations
A number of statistical methods have been described to validate surrogate endpoints, and there is ongoing debate regarding optimal methods. Although a complete account of these controversies is beyond the scope of this review, some of the key statistical methods that have been described to evaluate surrogate endpoints are worthy of discussion.
As mentioned, Prentice [5] published one of the key papers that first outlined a statistical framework to validate surrogate endpoints. However, the Prentice criteria as originally described have been perceived as too stringent and thus not achievable under many circumstances [16, 17]. Therefore, a number of authors have proposed alternative approaches for statistical validation. Freedman et al. [18] extended Prentice’s approach by introducing the concept of “proportion of treatment effect explained” (PTE), which is the proportion of the treatment effect on the true clinical endpoint that is mediated by the surrogate. Within this framework, the PTE of an ideal surrogate would equal 1. Although the PTE has been viewed by some authors as helpful in certain situations [19, 20], others have raised a number of concerns, such as a high level of variability [21, 22], potential for bias [20], and difficulty in separating intended drug effects from unintended adverse effects [23]. These problems make the PTE useful only in moderate to large studies with moderate to large treatment effects, situations in which there may be little need to establish a surrogate endpoint. Some authors have therefore discouraged the use of the PTE [24].
To overcome problems with the use of the PTE, Buyse and Molenberghs [25] proposed the use of two related quantities: the relative effect, which is the ratio of treatment effects on the clinical and surrogate endpoints, and the adjusted association, which assesses the treatment-adjusted association between the surrogate and clinical endpoints. The use of these quantities in meta-analysis of multiple trials allows both the trial-level and individual-level validity of a surrogate marker to be ascertained [17]. Further work by these authors and others has established a number of meta-analytic methods to assess trial- and individual-level validity of surrogates across a range of endpoint types (e.g., binary, ordinal, continuous, longitudinally measured, time to event) [23, 26–29].
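To make the three quantities discussed above concrete, the following is a minimal single-trial sketch in Python, assuming continuous surrogate and clinical endpoints and simple linear models. The function names, the simulated effect sizes, and the choice to summarize the adjusted association as a partial correlation are illustrative assumptions, not the procedures used in the cited papers, and the meta-analytic (multi-trial) setting is not covered.

```python
import numpy as np

def ols(y, *columns):
    """Least-squares coefficients of y on an intercept plus the given columns."""
    X = np.column_stack([np.ones(len(y)), *columns])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

def surrogate_summaries(z, s, t):
    """z: treatment indicator (0/1), s: surrogate, t: clinical endpoint."""
    fit_t = ols(t, z)              # clinical endpoint on treatment
    fit_s = ols(s, z)              # surrogate on treatment
    beta = fit_t[1]                # treatment effect on the clinical endpoint
    alpha = fit_s[1]               # treatment effect on the surrogate
    beta_s = ols(t, z, s)[1]       # treatment effect on t after adjusting for s
    # Residuals after removing the treatment effect from each endpoint.
    res_t = t - (fit_t[0] + fit_t[1] * z)
    res_s = s - (fit_s[0] + fit_s[1] * z)
    return {
        "pte": 1.0 - beta_s / beta,            # proportion of treatment effect explained
        "relative_effect": beta / alpha,       # effect on t relative to effect on s
        "adjusted_association": float(np.corrcoef(res_s, res_t)[0, 1]),
    }

def simulate_fully_mediated(n=5000, seed=0):
    """Hypothetical data in which treatment acts on t only through s."""
    rng = np.random.default_rng(seed)
    z = rng.integers(0, 2, size=n).astype(float)
    s = 1.0 * z + rng.normal(size=n)   # assumed treatment effect on surrogate: 1
    t = 2.0 * s + rng.normal(size=n)   # assumed effect of surrogate on outcome: 2
    return z, s, t
```

In this simulated scenario the surrogate lies on the only causal pathway, so the estimated PTE should be close to 1 and the relative effect close to 2; adding a direct effect of `z` on `t` in the simulation would pull the PTE below 1.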
Evaluating the added prognostic impact of a new biomarker
When a novel biomarker is being evaluated, an important consideration is how that biomarker adds to existing biomarkers and risk factors in predicting the clinical outcome of interest. A popular method to determine the predictive value of a biomarker (or set of biomarkers) is the area under the receiver-operating-characteristic curve (AUC) [30]. One criticism of AUC analysis, however, is that it can be relatively insensitive to new information, meaning a new biomarker would need to have a large independent association with the clinical outcome to result in a meaningful increase in AUC [31]. To overcome this limitation, Pencina et al. [32] proposed two novel approaches, namely, net reclassification improvement (NRI) and integrated discrimination improvement (IDI), and these methods have become increasingly popular. However, several authors have raised concerns about the reliability of the NRI and IDI [33–35] and urge caution when using these methods.
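The three metrics named above can be sketched in a few lines of Python. This is an illustrative sketch only: the function names are hypothetical, the NRI shown is the category-free variant (chosen here because it needs no risk thresholds, whereas the categorical form requires predefined risk categories), and the inputs are assumed to be predicted risks from a baseline model (`p_old`) and from a model that adds the new biomarker (`p_new`).

```python
import numpy as np

def auc(y, p):
    """Probability that a random event has a higher predicted risk than a
    random non-event, counting ties as one half."""
    y = np.asarray(y, dtype=bool)
    p = np.asarray(p, dtype=float)
    diff = p[y][:, None] - p[~y][None, :]     # every event/non-event pair
    return float(((diff > 0).sum() + 0.5 * (diff == 0).sum()) / diff.size)

def idi(y, p_old, p_new):
    """Integrated discrimination improvement: mean change in predicted risk
    among events minus mean change among non-events."""
    y = np.asarray(y, dtype=bool)
    change = np.asarray(p_new, dtype=float) - np.asarray(p_old, dtype=float)
    return float(change[y].mean() - change[~y].mean())

def category_free_nri(y, p_old, p_new):
    """Net proportion of events whose risk moves up plus net proportion of
    non-events whose risk moves down."""
    y = np.asarray(y, dtype=bool)
    p_old = np.asarray(p_old, dtype=float)
    p_new = np.asarray(p_new, dtype=float)
    up, down = p_new > p_old, p_new < p_old
    nri_events = up[y].mean() - down[y].mean()
    nri_nonevents = down[~y].mean() - up[~y].mean()
    return float(nri_events + nri_nonevents)
```

Comparing `auc(y, p_new) - auc(y, p_old)` with the NRI and IDI on the same data gives a feel for why the three summaries can disagree about how much a new biomarker adds.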
Regulatory considerations
Under the Federal Food, Drug, and Cosmetic Act [36], approval of a drug by the United States (US) Food and Drug Administration (FDA) requires “substantial evidence [consisting of adequate and well-controlled clinical investigations] that the drug will have the effect it … is represented to have under the conditions of use prescribed, recommended, or suggested in the proposed labeling” [36]. The law does not explicitly address what endpoints provide acceptable evidence of effectiveness. However, given that a drug must also show that it is “safe” (i.e., have a favorable risk–benefit profile), its effects must be of meaningful clinical value [37]. Thus, a drug with effects on a surrogate endpoint that does not correspond to a clinical benefit could not meet the safety standard.
The use of surrogate endpoints was addressed in the 1992 FDA “accelerated approval” regulation [38], which applies to new drugs “that have been studied for their safety and effectiveness in treating serious or life-threatening illnesses and that provide meaningful therapeutic benefit to patients over existing treatments” [38]. This regulation states that “the FDA may grant marketing approval for a new drug product on the basis of adequate and well-controlled clinical trials establishing that the drug product has an effect on a surrogate endpoint that is reasonably likely, based on epidemiologic, therapeutic, pathophysiologic, or other evidence, to predict clinical benefit or on the basis of an effect on a clinical endpoint other than survival or irreversible morbidity.” Recognizing that there could be “uncertainty as to the relation of the surrogate endpoint to clinical benefit,” the rule further requires post-marketing studies to “verify and describe the drug’s clinical benefit” [38]. The FDA regulations do not specifically address statistical considerations related to the evaluation of a biomarker as a surrogate endpoint.
The International Conference on Harmonisation (ICH) Harmonised Tripartite Guideline on Statistical Principles for Clinical Trials E9, adopted by regulatory agencies in the USA, Europe, and Japan, outlines general criteria for validating surrogate endpoints. These include ensuring the “biological plausibility of the relationship,” demonstrating the “prognostic value of the surrogate for the clinical outcome” in epidemiological studies, and ensuring that “treatment effects on the surrogate correspond to effects on the clinical outcome” in clinical trials [39]. Of note, the E9 Guideline does not advocate specific statistical methods and states that experience with statistical criteria is “relatively limited” [39].
Recognizing that biomarkers can play a critical role in accelerating drug development, the Center for Drug Evaluation and Research at the FDA introduced the Biomarker Qualification Program (BQP) [40] in 2009. Part of the FDA’s Critical Path Initiative [41] to spur innovation and facilitate drug development, the BQP provides a defined pathway by which an individual or organization can submit a proposal for a biomarker to be evaluated for use in the regulatory process. Biomarker qualification is defined by the FDA as the conclusion that “within the stated context of use, the results of assessment with a biomarker can be relied upon to have a specific interpretation and application in drug development and regulatory review” [42]. The “context of use” defines the specific circumstances under which the biomarker is qualified and helps to determine the type of evidence that is needed to support qualification [43]. Once a biomarker receives qualification for a specific context of use, it can be used by drug developers for other applications without needing repeat review [42].
Why do we need new surrogate endpoints in kidney disease research?
All existing clinical and surrogate endpoints in kidney disease trials have limitations. In this section, I review the challenges posed by the endpoints currently used in nephrology and discuss the development of new biomarkers and surrogate endpoints, with a focus on three major areas of nephrology research: AKI, CKD, and ADPKD. Table 1 provides an overview of the advantages and disadvantages of the various nephrology endpoints and biomarkers.
Acute kidney injury
A number of groups have developed consensus clinical criteria defining AKI, including the Risk, Injury, Failure, Loss of kidney function, and End-stage renal disease (RIFLE), Acute Kidney Injury Network (AKIN), and Kidney Disease: Improving Global Outcomes (KDIGO) criteria [44–46]. These AKI staging criteria rely on changes in serum creatinine and urine output, making them easy to apply in most clinical settings. However, serum creatinine has two significant drawbacks as a biomarker of AKI: (1) creatinine rise is delayed relative to the onset of actual injury and (2) creatinine level is influenced by a number of extra-renal factors (such as muscle mass, medications, and hydration status) [47]. Despite this, the RIFLE, AKIN, and KDIGO criteria for AKI have great value in ensuring uniform clinical reporting and in providing consistent endpoints for clinical trials.
To try to overcome the limitations of creatinine as an AKI biomarker, there has been tremendous interest in the last decade in developing new serum and urine biomarkers to detect renal injury. Examples of these include NGAL, kidney injury molecule-1, interleukin-18, and liver-type fatty acid binding protein [47]. Most notably, the FDA recently approved the first point-of-care device to use novel biomarkers to assess risk of AKI [48, 49]. Based on measurement of urinary tissue inhibitor of metalloproteinase-2 (TIMP-2) and insulin-like growth factor binding protein 7 (IGFBP-7), the device assesses the risk of developing moderate to severe AKI (KDIGO Stage 2 to 3) within 12 h of sample collection [50]. (Note, however, that FDA approval of the testing device does not imply that the measured biomarkers are qualified for use as surrogate endpoints.)
Although AKI criteria such as RIFLE, AKIN, and KDIGO and biomarkers such as TIMP-2 and IGFBP-7 can identify renal injury in the short term, it is important to recognize that they may not always translate into clinically meaningful outcomes, such as new dialysis dependence, development of CKD, or death [51]. Therefore, the inclusion of these “hard” clinical endpoints has been advocated for clinical trials in AKI. However, these hard endpoints also have their own limitations: timing of dialysis initiation is subject to clinical practice variation, and longer-term or more severe outcomes, such as development of CKD or death, do not provide an opportunity for earlier intervention. In addition, hard endpoints occur less commonly than AKI identified by staging criteria, reducing the statistical power of studies based on hard endpoints.
To increase statistical power and provide a more universal indicator of clinical outcome following AKI, Billings and Shaw [51] propose the use of a composite outcome of death, new dialysis, and worsened kidney function (defined as ≥25 % decline in GFR), termed “major adverse kidney events” (MAKE), for all effectiveness clinical trials in AKI. Clinical endpoints such as MAKE will also be important for any future validation studies of AKI biomarkers, such as TIMP-2 and IGFBP-7. Until novel biomarkers have been validated and have met criteria to qualify as true surrogate endpoints, it will be important for AKI trials to continue to use hard clinical endpoints to assess patient outcomes.
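The link between how common an endpoint is and the statistical power of a trial can be illustrated with a standard normal-approximation power calculation for comparing two proportions. The sketch below is illustrative only: the function names are hypothetical, and the event rates, relative risk, and sample size used in the comments are made-up numbers, not estimates for any real AKI endpoint or for MAKE.

```python
from math import erf, sqrt

def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def power_two_proportions(p_control, relative_risk, n_per_arm, z_alpha=1.959964):
    """Approximate power of a two-sided two-proportion z-test.

    p_control     : event rate in the control arm
    relative_risk : treated event rate divided by control event rate
    n_per_arm     : number of patients in each arm
    z_alpha       : critical value (default corresponds to two-sided alpha = 0.05)
    The negligible contribution of the opposite tail is ignored.
    """
    p_treat = p_control * relative_risk
    se = sqrt(p_control * (1 - p_control) / n_per_arm
              + p_treat * (1 - p_treat) / n_per_arm)
    return normal_cdf(abs(p_control - p_treat) / se - z_alpha)

# Hypothetical comparison: same relative risk (0.75) and sample size (500 per arm),
# but a rare endpoint (5 % of controls) versus a more common composite (20 %).
rare_endpoint_power = power_two_proportions(0.05, 0.75, 500)
composite_power = power_two_proportions(0.20, 0.75, 500)
```

Under these assumed numbers the more common composite endpoint yields substantially higher power than the rare endpoint for the same relative treatment effect, which is the motivation for composites described above. The calculation assumes that the treatment effect is the same for every component of the composite, which may not hold in practice.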
Chronic kidney disease
Progression of CKD is assessed clinically by a decline in estimated GFR (eGFR) using age-appropriate estimating equations that are most commonly based on serum creatinine. The ultimate “hard” clinical endpoint for CKD progression is development of ESRD, which is often defined as a new initiation of renal replacement therapy. However, as discussed earlier, defining ESRD in this manner can lead to inconsistencies due to practice variations in the timing of dialysis initiation and transplantation. A definition of ESRD that also incorporates eGFR (i.e., eGFR <15 mL/min/1.73 m² for >3 months) [52] can therefore provide greater uniformity. Recognizing that progression to ESRD may occur over years or even decades, the FDA has historically accepted the doubling of serum creatinine [which corresponds to an approximately 57 % decline in eGFR based on the CKD Epidemiology Collaboration (CKD-EPI) equation] as a surrogate endpoint for the development of kidney failure.
Table 1 Overview of the advantages and disadvantages of the various nephrology endpoints and biomarkers

| Biomarker or endpoint | Specific examples of biomarker or endpoint | Advantages | Disadvantages |
|---|---|---|---|
| “Hard” clinical endpoints (AKI, CKD, or ADPKD) | End-stage renal disease | Clinically meaningful outcome; definition based on eGFR <15 mL/min/1.73 m² is more uniform than one based on dialysis/transplant | Definition based on timing of dialysis or transplantation subject to practice variation; late/severe complication |
| | Death | Clinically meaningful outcome | Subject to competing risks; devastating outcome, no opportunity for earlier intervention; less common endpoint, reduces statistical power |
| Condition-specific biomarkers and surrogate endpoints | | | |
| Acute kidney injury: established surrogate endpoint | Serum creatinine and urine output (in KDIGO, RIFLE, AKIN criteria) | Widely available; easy to measure and apply | Influenced by extra-renal factors (e.g., muscle mass, hydration status); change in creatinine lags behind actual injury |
| Acute kidney injury: novel biomarkers | NGAL, KIM-1, IL-18, L-FABP, TIMP-2, IGFBP-7, etc. | Earlier marker of injury than creatinine; opportunity for earlier intervention; more common endpoint than ESRD/death, can increase statistical power; point-of-care device available to test TIMP-2 and IGFBP-7 | Less evidence of correlation with clinically meaningful outcomes |
| Chronic kidney disease: established surrogate endpoint | Doubling of serum creatinine (57 % eGFR decline based on CKD-EPI equation) | Easy to measure, on causal pathway to ESRD | Relatively late event in CKD progression |
| Chronic kidney disease: proposed surrogate endpoints | 40 % decline in eGFR | Earlier endpoint than doubling of creatinine; on causal pathway to ESRD; more common endpoint can increase statistical power of trials | May increase Type 1 error rate, especially if large acute effects of intervention |
| | 30 % decline in eGFR | Earlier endpoint than 40 % eGFR decline, allowing shorter trials; more common endpoint increases statistical power | Provides less direct assessment of ESRD risk; higher Type 1 error rate; shorter trials due to earlier endpoint could miss safety concerns |
| | Proteinuria | Easy to measure; useful for certain disease states (e.g., nephrotic syndrome) or specific drugs (e.g., ACE-I/ARBs) | Not part of causal pathway of many diseases; unclear if treatment effects on proteinuria predict effects on renal outcome |
| | Microalbuminuria | Early marker of renal injury; allows potential for earlier intervention | Unclear association with clinically relevant outcomes such as ESRD |
| Autosomal dominant polycystic kidney disease: existing surrogate endpoint | GFR decline (various definitions) | On causal pathway to ESRD; clinically meaningful | Early ADPKD characterized by hyperfiltration and normal GFR despite underlying cyst progression |
| Autosomal dominant polycystic kidney disease: proposed surrogate endpoint | Total kidney volume | More sensitive than GFR in detecting changes over short follow-up; correlates with GFR decline and predicts risk of Stage 3 CKD | Unclear if interventions to reduce TKV necessarily improve clinical outcome |
| Autosomal dominant polycystic kidney disease: novel biomarkers | Urinary biomarkers (e.g., NGAL, M-CSF, MCP-1); markers of endothelial dysfunction (e.g., pentraxin 3) or vasopressin signaling (e.g., copeptin); urinary proteomic analysis; renal blood flow | Potentially even earlier markers of progression than TKV or GFR | Unclear association with clinically relevant outcomes |

ACE-I, Angiotensin converting enzyme inhibitor; ADPKD, autosomal dominant polycystic kidney disease; AKI, acute kidney injury; AKIN, Acute Kidney Injury Network; ARB, angiotensin receptor blocker; CKD, chronic kidney disease; CKD-EPI, CKD Epidemiology Collaboration; eGFR, estimated glomerular filtration rate; ESRD, end-stage renal disease; IGFBP-7, insulin-like growth factor binding protein 7; IL-18, interleukin-18; KDIGO, Kidney Disease: Improving Global Outcomes; KIM-1, kidney injury molecule-1; L-FABP, liver-type fatty acid binding protein; M-CSF, macrophage colony stimulating factor; MCP-1, monocyte chemoattractant protein-1; NGAL, neutrophil gelatinase-associated lipocalin; RIFLE, Risk, Injury, Failure, Loss of kidney function, and End-stage renal disease criteria; TIMP-2, tissue inhibitor of metalloproteinase-2; TKV, total kidney volume
However, this surrogate endpoint is itself a relatively late event in the course of CKD progression and necessitates trials with large sample sizes and many years of follow-up [53]. Therefore, there has been growing interest in using lower levels of eGFR decline as surrogate endpoints. A number of studies have evaluated eGFR declines of 30 and 40 % as alternative surrogate endpoints [6, 54–56]. In December 2012, the National Kidney Foundation (NKF) and the FDA cosponsored a workshop to evaluate these lower thresholds as potential surrogate endpoints [53]. Using a combination of data from observational cohorts, clinical trials, and simulations, participants in this workshop evaluated how the selection of eGFR declines of 30 or 40 % as surrogate endpoints would affect the Type 1 error rate and power of studies compared to the established eGFR decline endpoint of 57 % [53]. Because an endpoint of 40 % eGFR decline improved statistical power without an excessive increase in Type 1 error, the participants concluded that it would be an acceptable surrogate endpoint for CKD trials in a range of circumstances. The 30 % decline endpoint also performed well in various conditions, but led to an increased Type 1 error rate in situations in which the intervention had acute effects on eGFR [53]. The results of this workshop provide an important illustration of the considerations involved in selecting surrogate endpoints and demonstrate how the performance of a particular surrogate endpoint will depend on the context. The 30 and 40 % eGFR decline endpoints can offer the advantage of being earlier and more common markers of deteriorating kidney function, potentially allowing smaller and shorter clinical trials. The relevance of these findings to pediatric nephrology was recently discussed in an editorial commentary by Schnaper et al. [57].
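The correspondence between a doubling of serum creatinine and an eGFR decline of roughly 57 % can be checked with a short calculation. The sketch below assumes the 2009 CKD-EPI creatinine equation, in which eGFR is proportional to serum creatinine raised to the power −1.209 once creatinine exceeds the sex-specific threshold; it further assumes that creatinine is above that threshold at both time points and that age and the other terms are unchanged. The function names are illustrative.

```python
# Creatinine exponent of the 2009 CKD-EPI equation above the sex-specific threshold.
CKD_EPI_EXPONENT = -1.209

def egfr_decline_for_creatinine_fold(fold):
    """Fractional eGFR decline implied by a fold increase in serum creatinine."""
    return 1.0 - fold ** CKD_EPI_EXPONENT

def creatinine_fold_for_egfr_decline(decline):
    """Fold increase in serum creatinine implied by a fractional eGFR decline."""
    return (1.0 - decline) ** (1.0 / CKD_EPI_EXPONENT)

doubling = egfr_decline_for_creatinine_fold(2.0)      # about 0.57
fold_for_40 = creatinine_fold_for_egfr_decline(0.40)  # about 1.5-fold
fold_for_30 = creatinine_fold_for_egfr_decline(0.30)  # about 1.3-fold
```

Under these assumptions, the 40 and 30 % eGFR decline endpoints correspond to serum creatinine increases of roughly 1.5-fold and 1.3-fold, respectively, which illustrates how much earlier in the course of progression these endpoints would be reached than a doubling.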
However, the FDA cautions that using these smaller eGFR declines as endpoints, particularly in diabetic nephropathy trials, will provide less direct information on how therapies affect the risk of ESRD [8]. In addition, shorter trials may reduce the power to detect rare safety events and provide less information on the long-term safety and efficacy of interventions [8, 58].
Given the inherent limitations of serum creatinine, there is also greater interest in using other biomarkers of renal damage to assess CKD progression. Proteinuria and microalbuminuria have been the subject of most of the discussion on alternative endpoints. In 2009, the NKF and FDA cosponsored a workshop to examine proteinuria as a surrogate outcome in CKD [9]. Given the heterogeneity of kidney diseases that can result in proteinuria, a major concern is that proteinuria is not necessarily part of the causal biologic pathway in many diseases. Therefore, the participants of this workgroup concluded that proteinuria was only acceptable as a surrogate endpoint for a limited number of disease states (e.g., complete remission of proteinuria in nephrotic syndrome) or for the evaluation of specific drugs (e.g., reduction in mild to moderate proteinuria to assess effects of ACE inhibitors or angiotensin receptor blockers) [9, 59].
The use of microalbuminuria as an endpoint has also been controversial. Microalbuminuria has been associated with cardiovascular and renal risk in diabetes and hypertension [60]. By virtue of being an early marker of end-organ damage, microalbuminuria may offer the advantage of being able to assess the effects of early interventions to delay CKD progression [61]. However, it remains unclear whether therapies to decrease microalbuminuria necessarily improve important clinical outcomes, such as the development of ESRD [62]. Further studies will therefore be needed before microalbuminuria can be established as a surrogate endpoint in CKD.
Autosomal dominant polycystic kidney disease
Autosomal dominant polycystic kidney disease causes bilateral, progressively enlarging kidney cysts, as well as a number of extrarenal manifestations, including liver cysts and intracranial aneurysms. Kidney function declines over the course of decades and ultimately leads to ESRD in a significant proportion of patients. Despite progressive growth of the kidney cysts over patients’ lifetimes, the early course of ADPKD is actually characterized by hyperfiltration and relatively normal GFR for many decades [63, 64]. This makes GFR an insensitive marker of underlying renal parenchymal damage in ADPKD.
To address the limitations of GFR as an endpoint in ADPKD trials, the National Institute of Diabetes and Digestive and Kidney Diseases sponsored the Consortium for Radiologic Imaging Studies of Polycystic Kidney Disease (CRISP). Using magnetic resonance imaging (MRI), the CRISP studies established that TKV increases as eGFR declines in ADPKD [65, 66], and that baseline TKV predicts risk of developing Stage 3 CKD (GFR <60 mL/min/1.73 m²) [64].
These correlations with GFR decline, and the ability to detect changes in TKV over relatively short follow-up periods, have led some authors to propose TKV as a surrogate endpoint for renal disease progression in ADPKD [64]. Acceptance of this endpoint would allow trials to assess potential therapies earlier in the disease course, before extensive renal parenchymal damage has occurred. However, from a regulatory perspective, the FDA has expressed concern about the use of TKV as a surrogate endpoint. When change in TKV was proposed as the primary endpoint for the TEMPO 3/4 trial of tolvaptan for ADPKD [67, 68], the FDA cautioned that Bthere is no intervention to alter renal volume that is known to affect renal function, so it is hard to accept renal volume as a surrogate^ [69]. Indeed, the FDA determined that the key efficacy endpoint it would consider for approval would be the composite secondary endpoint
consisting of clinical factors (hypertension, renal pain, albuminuria, and renal function), rather than change in TKV [67]. It is therefore important that ongoing trials in ADPKD continue to assess treatment efficacy based on key clinical endpoints in addition to TKV.

A number of other biomarkers have also been evaluated in ADPKD. These include urinary biomarkers such as NGAL, macrophage colony stimulating factor, and monocyte chemoattractant protein-1 [70]; markers of endothelial dysfunction such as pentraxin 3 [71]; markers of vasopressin signaling such as copeptin [72]; urinary proteomic biomarkers [73]; and renal blood flow measured by MRI [74]. There is hope that some of these biomarkers will provide more sensitive indicators of ADPKD progression than TKV or eGFR, particularly if they measure factors related to underlying pathophysiology, such as tubular damage, inflammation, or cyst growth [75]. However, further studies are needed to determine whether any of these biomarkers predict clinically meaningful endpoints, and TKV remains the best-characterized candidate surrogate endpoint for ADPKD progression.
Summary

Improving the pace of clinical trials in nephrology will require the development of new biomarkers and surrogate endpoints to overcome the limitations posed by existing clinical endpoints. Validating a new biomarker as a surrogate endpoint requires a comprehensive biological, statistical, and regulatory framework for evaluation. Recent developments in nephrology research include novel biomarkers of renal parenchymal damage in AKI (e.g., TIMP-2 and IGFBP-7), acceptance of lower levels of eGFR decline in CKD, and use of TKV to assess progression of ADPKD. Although many of these biomarkers appear to be promising candidates to serve as surrogate endpoints, further studies will be needed to validate them against key clinical outcomes.

Acknowledgments Dr. Hartung is supported by the National Center for Advancing Translational Sciences (NCATS) of the National Institutes of Health (NIH) under Award Number KL2TR000139. The content is solely the responsibility of the author and does not necessarily represent the official view of NCATS or the NIH.

Conflict of interest There are no conflicts of interest.
References

1. Himmelfarb J (2007) Chronic kidney disease and the public health: gaps in evidence from interventional trials. JAMA 297:2630–2633
2. Strippoli GFM, Craig JC, Schena FP (2004) The number, quality, and coverage of randomized controlled trials in nephrology. J Am Soc Nephrol 15:411–419
3. Boissel JP, Collet JP, Moleur P, Haugh M (1992) Surrogate endpoints: a basis for a rational approach. Eur J Clin Pharmacol 43:235–244
4. Biomarkers Definitions Working Group (2001) Biomarkers and surrogate endpoints: preferred definitions and conceptual framework. Clin Pharmacol Ther 69:89–95
5. Prentice RL (1989) Surrogate endpoints in clinical trials: definition and operational criteria. Stat Med 8:431–440
6. Inker LA, Lambers Heerspink HJ, Mondal H, Schmid CH, Tighiouart H, Noubary F, Coresh J, Greene T, Levey AS (2014) GFR decline as an alternative end point to kidney failure in clinical trials: a meta-analysis of treatment effects from 37 randomized trials. Am J Kidney Dis 64:848–859
7. Haase-Fielitz A, Haase M, Devarajan P (2014) Neutrophil gelatinase-associated lipocalin as a biomarker of acute kidney injury: a critical evaluation of current status. Ann Clin Biochem 51:335–351
8. Thompson A, Lawrence J, Stockbridge N (2014) GFR decline as an end point in trials of CKD: a viewpoint from the FDA. Am J Kidney Dis 64:836–837
9. Levey AS, Cattran D, Friedman A, Miller WG, Sedor J, Tuttle K, Kasiske B, Hostetter T (2009) Proteinuria as a surrogate outcome in CKD: report of a scientific workshop sponsored by the National Kidney Foundation and the US Food and Drug Administration. Am J Kidney Dis 54:205–226
10. De Gruttola VG, Clax P, DeMets DL, Downing GJ, Ellenberg SS, Friedman L, Gail MH, Prentice R, Wittes J, Zeger SL (2001) Considerations in the evaluation of surrogate endpoints in clinical trials. Summary of a National Institutes of Health workshop. Control Clin Trials 22:485–502
11. Fleming TR, DeMets DL (1996) Surrogate end points in clinical trials: are we being misled? Ann Intern Med 125:605–613
12. Schrier RW, Brosnahan G, Cadnapaphornchai MA, Chonchol M, Friend K, Gitomer B, Rossetti S (2014) Predictors of autosomal dominant polycystic kidney disease progression. J Am Soc Nephrol 25:2399–2418
13. Cadnapaphornchai MA, George DM, McFann K, Wang W, Gitomer B, Strain JD, Schrier RW (2014) Effect of pravastatin on total kidney volume, left ventricular mass index, and microalbuminuria in pediatric autosomal dominant polycystic kidney disease. Clin J Am Soc Nephrol 9:889–896
14. Laufs U (2003) Beyond lipid-lowering: effects of statins on endothelial nitric oxide. Eur J Clin Pharmacol 58:719–731
15. Nishikawa H, Miura S, Zhang B, Shimomura H, Arai H, Tsuchiya Y, Matsuo K, Saku K (2004) Statins induce the regression of left ventricular mass in patients with angina. Circ J 68:121–125
16. Fleming TR, Prentice RL, Pepe MS, Glidden D (1994) Surrogate and auxiliary endpoints in clinical trials, with potential applications in cancer and AIDS research. Stat Med 13:955–968
17. Buyse M, Molenberghs G, Burzykowski T, Renard D, Geys H (2000) The validation of surrogate endpoints in meta-analyses of randomized experiments. Biostatistics 1:49–67
18. Freedman LS, Graubard BI, Schatzkin A (1992) Statistical validation of intermediate endpoints for chronic diseases. Stat Med 11:167–178
19. Mildvan D, Landay A, De Gruttola V, Machado SG, Kagan J (1997) An approach to the validation of markers for use in AIDS clinical trials. Clin Infect Dis 24:764–774
20. Bycott PW, Taylor JM (1998) An evaluation of a measure of the proportion of the treatment effect explained by a surrogate marker. Control Clin Trials 19:555–568
21. Lin DY, Fleming TR, De Gruttola V (1997) Estimating the proportion of treatment effect explained by a surrogate marker. Stat Med 16:1515–1527
22. De Gruttola V, Fleming T, Lin DY, Coombs R (1997) Perspective: validating surrogate markers—are we being naive? J Infect Dis 175:237–246
23. Weir CJ, Walley RJ (2006) Statistical evaluation of biomarkers as surrogate endpoints: a literature review. Stat Med 25:183–203
24. Flandre P, Saidi Y (1999) Estimating the proportion of treatment effect explained by a surrogate marker. Stat Med 18:107–109
25. Buyse M, Molenberghs G (1998) Criteria for the validation of surrogate endpoints in randomized experiments. Biometrics 54:1014–1029
26. Alonso A, Molenberghs G, Burzykowski T, Renard D, Geys H, Shkedy Z, Tibaldi F, Abrahantes JC, Buyse M (2004) Prentice’s approach and the meta-analytic paradigm: a reflection on the role of statistics in the evaluation of surrogate endpoints. Biometrics 60:724–728
27. Alonso A, Van der Elst W, Molenberghs G, Buyse M, Burzykowski T (2014) On the relationship between the causal-inference and meta-analytic paradigms for the validation of surrogate endpoints. Biometrics. doi:10.1111/biom.12245
28. Buyse M, Molenberghs G, Paoletti X, Oba K, Alonso A, Van der Elst W, Burzykowski T (2015) Statistical evaluation of surrogate endpoints with examples from cancer clinical trials. Biom J. doi:10.1002/bimj.201400049
29. Burzykowski T, Buyse M (2006) Surrogate threshold effect: an alternative measure for meta-analytic surrogate endpoint validation. Pharm Stat 5:173–186
30. Hanley JA, McNeil BJ (1982) The meaning and use of the area under a receiver operating characteristic (ROC) curve. Radiology 143:29–36
31. Greenland P, O’Malley PG (2005) When is a new prediction marker useful? A consideration of lipoprotein-associated phospholipase A2 and C-reactive protein for stroke risk. Arch Intern Med 165:2454–2456
32. Pencina MJ, D’Agostino RB, Vasan RS (2008) Evaluating the added predictive ability of a new marker: from area under the ROC curve to reclassification and beyond. Stat Med 27:157–172, discussion 207–212
33. Hilden J, Gerds TA (2014) A note on the evaluation of novel biomarkers: do not rely on integrated discrimination improvement and net reclassification index. Stat Med 33:3405–3414
34. Kerr KF, McClelland RL, Brown ER, Lumley T (2011) Evaluating the incremental value of new biomarkers with integrated discrimination improvement. Am J Epidemiol 174:364–374
35. Kerr KF, Wang Z, Janes H, McClelland RL, Psaty BM, Pepe MS (2014) Net reclassification indices for evaluating risk prediction instruments: a critical review. Epidemiology 25:114–121
36. Federal Food, Drug, and Cosmetic Act. 2013, U.S.C. Sec. 355. http://www.gpo.gov/fdsys/pkg/USCODE-2013-title21/pdf/USCODE-2013-title21-chap9-subchapV-partA-sec355.pdf
37. Temple R (1999) Are surrogate markers adequate to assess cardiovascular disease drugs? JAMA 282:600–604
38. [No authors listed] (1992) New drug, antibiotic, and biological drug product regulations; accelerated approval—FDA. Final rule. Fed Regist 57:58942–58960
39. International Conference on Harmonisation of Technical Requirements for Registration of Pharmaceuticals for Human Use (1998) ICH harmonised tripartite guideline: statistical principles for clinical trials E9. Fed Regist 63:49583
40. U.S. Food and Drug Administration (2009) FDA/CDER Biomarker Qualification Program. Available at: http://www.fda.gov/Drugs/DevelopmentApprovalProcess/DrugDevelopmentToolsQualificationProgram/ucm284076.htm. Accessed 10 Mar 2015
41. U.S. Food and Drug Administration (2004) FDA Critical Path Initiative. Available at: http://www.fda.gov/scienceresearch/specialtopics/criticalpathinitiative/default.htm. Accessed 10 Mar 2015
42. Amur S (2013) Biomarker qualification at CDER/FDA. In: EMA-FDA Webinar. Available at: http://www.imi.europa.eu/sites/default/files/uploads/documents/Webinar/IMIwebinaronregulatoryacceptance/5_-_Shashi_Amur[1].pdf. Accessed 9 Mar 2015
43. U.S. Food and Drug Administration (2014) Guidance for industry and FDA staff: qualification process for drug development tools. Available at: http://www.fda.gov/downloads/Drugs/GuidanceComplianceRegulatoryInformation/Guidances/UCM230597.pdf. Accessed 10 Mar 2015
44. Bellomo R, Ronco C, Kellum JA, Mehta RL, Palevsky P; Acute Dialysis Quality Initiative workgroup (2004) Acute renal failure—definition, outcome measures, animal models, fluid therapy and information technology needs: the Second International Consensus Conference of the Acute Dialysis Quality Initiative (ADQI) Group. Crit Care 8:R204–R212
45. Mehta RL, Kellum JA, Shah SV, Molitoris BA, Ronco C, Warnock DG, Levin A, Acute Kidney Injury Network (2007) Acute Kidney Injury Network: report of an initiative to improve outcomes in acute kidney injury. Crit Care 11:R31
46. Kidney Disease: Improving Global Outcomes (KDIGO) Acute Kidney Injury Work Group (2012) KDIGO clinical practice guideline for acute kidney injury. Kidney Int Suppl 2:1–138
47. Devarajan P, Murray P (2014) Biomarkers in acute kidney injury: are we ready for prime time? Nephron Clin Pract 127:176–179
48. Endre ZH, Pickering JW (2014) Cell cycle arrest biomarkers win race for AKI diagnosis. Nat Rev Nephrol 10:683–685
49. U.S. Food and Drug Administration (2014) FDA news release. FDA allows marketing of the first test to assess risk of developing acute kidney injury. Available at: http://www.fda.gov/NewsEvents/Newsroom/PressAnnouncements/ucm412910.htm. Accessed 11 Mar 2015
50. Kashani K, Al-Khafaji A, Ardiles T, Artigas A, Bagshaw SM, Bell M, Bihorac A, Birkhahn R, Cely CM, Chawla LS, Davison DL, Feldkamp T, Forni LG, Gong MN, Gunnerson KJ, Haase M, Hackett J, Honore PM, Hoste EA, Joannes-Boyau O, Joannidis M, Kim P, Koyner JL, Laskowitz DT, Lissauer ME, Marx G, McCullough PA, Mullaney S, Ostermann M, Rimmelé T, Shapiro NI, Shaw AD, Shi J, Sprague AM, Vincent JL, Vinsonneau C, Wagner L, Walker MG, Wilkerson RG, Zacharowski K, Kellum JA (2013) Discovery and validation of cell cycle arrest biomarkers in human acute kidney injury. Crit Care 17:R25
51. Billings FT IV, Shaw AD (2014) Clinical trial endpoints in acute kidney injury. Nephron Clin Pract 127:89–93
52. Kidney Disease: Improving Global Outcomes (KDIGO) (2013) KDIGO 2012 clinical practice guideline for the evaluation and management of chronic kidney disease. Kidney Int Suppl 3:1–150
53. Levey AS, Inker LA, Matsushita K, Greene T, Willis K, Lewis E, de Zeeuw D, Cheung AK, Coresh J (2014) GFR decline as an end point for clinical trials in CKD: a scientific workshop sponsored by the National Kidney Foundation and the US Food and Drug Administration. Am J Kidney Dis 64:821–835
54. Lambers Heerspink HJ, Weldegiorgis M, Inker LA, Gansevoort R, Parving HH, Dwyer JP, Mondal H, Coresh J, Greene T, Levey AS, de Zeeuw D (2014) Estimated GFR decline as a surrogate end point for kidney failure: a post hoc analysis from the Reduction of End Points in Non-Insulin-Dependent Diabetes With the Angiotensin II Antagonist Losartan (RENAAL) study and Irbesartan Diabetic Nephropathy Trial. Am J Kidney Dis 63:244–250
55. Coresh J, Turin TC, Matsushita K, Sang Y, Ballew SH, Appel LJ, Arima H, Chadban SJ, Cirillo M, Djurdjev O, Green JA, Heine GH, Inker LA, Irie F, Ishani A, Ix JH, Kovesdy CP, Marks A, Ohkubo T, Shalev V, Shankar A, Wen CP, de Jong PE, Iseki K, Stengel B, Gansevoort RT, Levey AS, CKD Prognosis Consortium (2014) Decline in estimated glomerular filtration rate and subsequent risk of end-stage renal disease and mortality. JAMA 311:2518–2531
56. Lambers Heerspink HJ, Tighiouart H, Sang Y, Ballew S, Mondal H, Matsushita K, Coresh J, Levey AS, Inker LA (2014) GFR decline and subsequent risk of established kidney outcomes: a meta-analysis of 37 randomized controlled trials. Am J Kidney Dis 64:860–866
57. Schnaper HW, Furth SL, Yao LP (2015) Defining new surrogate markers for CKD progression. Pediatr Nephrol 30:193–198
58. Fleming TR, Powers JH (2012) Biomarkers and surrogate endpoints in clinical trials. Stat Med 31:2973–2984
59. Thompson A (2012) Proteinuria as a surrogate end point–more data are needed. Nat Rev Nephrol 8:306–309
60. Redon J, Martinez F (2012) Microalbuminuria as surrogate endpoint in therapeutic trials. Curr Hypertens Rep 14:345–349
61. Lambers Heerspink HJ, de Zeeuw D (2010) Debate: PRO position. Should microalbuminuria ever be considered as a renal endpoint in any clinical trial? Am J Nephrol 31:458–461, discussion 468
62. Glassock RJ (2010) Debate: CON position. Should microalbuminuria ever be considered as a renal endpoint in any clinical trial? Am J Nephrol 31:462–465, discussion 466–467
63. Wong H, Vivian L, Weiler G, Filler G (2004) Patients with autosomal dominant polycystic kidney disease hyperfiltrate early in their disease. Am J Kidney Dis 43:624–628
64. Chapman AB, Bost JE, Torres VE, Guay-Woodford L, Bae KT, Landsittel D, Li J, King BF, Martin D, Wetzel LH, Lockhart ME, Harris PC, Moxey-Mims M, Flessner M, Bennett WM, Grantham JJ (2012) Kidney volume and functional outcomes in autosomal dominant polycystic kidney disease. Clin J Am Soc Nephrol 7:479–486
65. Grantham JJ, Torres VE, Chapman AB, Guay-Woodford LM, Bae KT, King BF Jr, Wetzel LH, Baumgarten DA, Kenney PJ, Harris PC, Klahr S, Bennett WM, Hirschman GN, Meyers CM, Zhang X, Zhu F, Miller JP, CRISP Investigators (2006) Volume progression in polycystic kidney disease. N Engl J Med 354:2122–2130
66. Chapman AB, Guay-Woodford LM, Grantham JJ et al (2003) Renal structure in early autosomal-dominant polycystic kidney disease (ADPKD): the Consortium for Radiologic Imaging Studies of Polycystic Kidney Disease (CRISP) cohort. Kidney Int 64:1035–1045
67. Otsuka Pharmaceutical Development & Commercialization, Inc. (2013) Tolvaptan phase 3 efficacy and safety study in ADPKD (TEMPO3/4). Available at: https://clinicaltrials.gov/ct2/show/NCT00428948. Accessed 10 Mar 2015
68. Torres VE, Chapman AB, Devuyst O, Gansevoort RT, Grantham JJ, Higashihara E, Perrone RD, Krasa HB, Ouyang J, Czerwiec FS, TEMPO 3:4 Trial Investigators (2012) Tolvaptan in patients with autosomal dominant polycystic kidney disease. N Engl J Med 367:2407–2418
69. Lawrence J, Thompson A (2013) NDA 204441 Tolvaptan clinical and statistical findings, cardiovascular and renal drugs advisory committee meeting, August 5, 2013. Available at: http://www.fda.gov/downloads/AdvisoryCommittees/CommitteesMeetingMaterials/Drugs/CardiovascularandRenalDrugsAdvisoryCommittee/UCM364582.pdf. Accessed 10 Mar 2015
70. Kawano H, Muto S, Ohmoto Y, Iwata F, Fujiki H, Mori T, Yan L, Horie S (2014) Exploring urinary biomarkers in autosomal dominant polycystic kidney disease. Clin Exp Nephrol. doi:10.1007/s10157-014-1078-7
71. Kocyigit I, Eroglu E, Orscelik O, Unal A, Gungor O, Ozturk F, Karakukcu C, Imamoglu H, Sipahioglu MH, Tokgoz B, Oymak O (2014) Pentraxin 3 as a novel bio-marker of inflammation and endothelial dysfunction in autosomal dominant polycystic kidney disease. J Nephrol 27:181–186
72. Boertien WE, Meijer E, Li J, Bost JE, Struck J, Flessner MF, Gansevoort RT, Torres VE, Consortium for Radiologic Imaging Studies of Polycystic Kidney Disease (CRISP) (2013) Relationship of copeptin, a surrogate marker for arginine vasopressin, with change in total kidney volume and GFR decline in autosomal dominant polycystic kidney disease: results from the CRISP cohort. Am J Kidney Dis 61:420–429
73. Kistler AD, Serra AL, Siwy J, Poster D, Krauer F, Torres VE, Mrug M, Grantham JJ, Bae KT, Bost JE, Mullen W, Wüthrich RP, Mischak H, Chapman AB (2013) Urinary proteomic biomarkers for diagnosis and risk stratification of autosomal dominant polycystic kidney disease: a multicentric study. PLoS ONE 8:e53016
74. Torres VE, King BF, Chapman AB, Brummer ME, Bae KT, Glockner JF, Arya K, Risk D, Felmlee JP, Grantham JJ, Guay-Woodford LM, Bennett WM, Klahr S, Meyers CM, Zhang X, Thompson PA, Miller JP, Consortium for Radiologic Imaging Studies of Polycystic Kidney Disease (CRISP) (2007) Magnetic resonance measurements of renal blood flow and disease progression in autosomal dominant polycystic kidney disease. Clin J Am Soc Nephrol 2:112–120
75. Helal I, Reed B, Schrier RW (2012) Emergent early markers of renal progression in autosomal-dominant polycystic kidney disease patients: implications for prevention and treatment. Am J Nephrol 36:162–167