Clinical practice guidelines (CPGs) are “statements that include recommendations intended to optimise patient care. They are informed by a systematic review of evidence and an assessment of the benefits and harms of alternative care options.”1
In surgery, clinical practice structured according to the principles of evidence-based medicine (EBM) is becoming increasingly important as new technologies evolve, requiring a better understanding of the associated benefits and risks to assist surgeons and patients in their decision-making. In addition, we are living in an era of limited healthcare resources, which requires interventions to be not only safe and effective, but also cost-effective.2
The proliferation of CPGs sometimes adds to the confusion, because different authors have published conflicting recommendations for the same clinical question. For example, guidelines differ markedly in their recommendations on the effectiveness of mechanical colon preparation for preventing surgical site infection in colorectal resection surgery.3
In an attempt to reduce the great variability in guideline development,4 a working group was formed in 2000 with the aim of developing a standardised method for grading evidence and making recommendations on specific clinical questions. This group developed the Grading of Recommendations, Assessment, Development, and Evaluation (GRADE) methodology, a process for grading the evidence for a particular question and making evidence-based recommendations (www.gradeworkingroup.org). Below, we briefly describe the process of developing a CPG following the GRADE methodology.
How to ask the clinical question. The first step in using the GRADE methodology is to turn an "informal question" into a specific question that can be adequately answered. For example, an informal question might be "how do I treat a patient with blunt splenic trauma?" or "should I use angioembolisation to treat splenic trauma with active bleeding?" The question should then be phrased in "PICO" format. When formulated correctly, the question clearly identifies the patient population (P), the intervention (I), the comparator (C), and the outcome (O). A question in this format for our example might read: "In patients with blunt splenic trauma treated with conservative management (P), should angioembolisation (I) be performed compared to no angioembolisation (C) to improve splenic preservation (O)?" PICO questions guide both the systematic literature review and guideline development.
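As a minimal sketch of the PICO structure described above, the four components can be held as separate fields and assembled into one question. The class name `PicoQuestion` and the method `as_sentence` are hypothetical names chosen for illustration; they are not part of GRADE or of any library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PicoQuestion:
    """A clinical question split into its four PICO components."""

    population: str
    intervention: str
    comparator: str
    outcome: str

    def as_sentence(self) -> str:
        # Assemble the four components into a single answerable question.
        return (
            f"In {self.population}, should {self.intervention} be performed "
            f"compared to {self.comparator} to improve {self.outcome}?"
        )


# The splenic trauma example from the text.
question = PicoQuestion(
    population="patients with blunt splenic trauma",
    intervention="angioembolisation",
    comparator="no angioembolisation",
    outcome="splenic preservation",
)
print(question.as_sentence())
```

Keeping the components separate makes it easy to reuse the population and outcome as search terms for the literature review.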
Defining outcomes. Predefining which outcomes are important is relevant to both the literature search and the guideline development process. To use GRADE, each outcome of a PICO question is classified as "critical", "important but not critical" or of "limited importance" for decision making. Outcomes can be rated on a numerical scale from 1 to 9 to describe their importance: a rating of 7–9 is given to critical outcomes (mortality, reoperation, etc.), 4–6 to important outcomes, and 1–3 to outcomes of limited importance.
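The rating scale above can be sketched as a simple lookup. The function name `classify_outcome` is hypothetical, and the ratings in the example are invented for illustration only; in practice they come from the guideline panel.

```python
def classify_outcome(rating: int) -> str:
    """Map a 1-9 importance rating to its GRADE importance category."""
    if not 1 <= rating <= 9:
        raise ValueError("rating must be an integer from 1 to 9")
    if rating >= 7:
        return "critical"
    if rating >= 4:
        return "important but not critical"
    return "limited importance"


# Hypothetical panel ratings, not taken from any published guideline.
ratings = {
    "mortality": 9,
    "reoperation": 7,
    "length of hospital stay": 5,
    "cosmetic result": 2,
}
for outcome, rating in ratings.items():
    print(f"{outcome}: {rating} -> {classify_outcome(rating)}")
```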
Systematic review of publications in the literature. Although this chapter cannot cover the specific details of how to conduct a systematic review, it is essential to reliably identify all relevant published (and unpublished) data. Where possible, a meta-analysis should be used to combine data from different studies and obtain an overall point estimate and confidence interval for the effect of the intervention on the outcome of interest.
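To illustrate how a pooled point estimate and confidence interval can be obtained, the sketch below uses inverse-variance fixed-effect pooling. This is only one of several pooling methods and is an assumption made here for simplicity; a random-effects model is often preferred when studies are heterogeneous. The function name `pool_fixed_effect` and the study data are hypothetical.

```python
import math


def pool_fixed_effect(estimates, std_errors, z=1.96):
    """Inverse-variance fixed-effect pooling.

    estimates:  per-study effects on an additive scale (e.g. log risk ratios)
    std_errors: the standard error of each estimate
    Returns (pooled estimate, lower CI bound, upper CI bound).
    """
    if not estimates or len(estimates) != len(std_errors):
        raise ValueError("need one standard error per estimate")
    # Each study is weighted by the inverse of its variance.
    weights = [1.0 / se ** 2 for se in std_errors]
    total_weight = sum(weights)
    pooled = sum(w * y for w, y in zip(weights, estimates)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return pooled, pooled - z * pooled_se, pooled + z * pooled_se


# Invented log risk ratios and standard errors for three studies.
log_rr = [-0.40, -0.10, -0.25]
se = [0.20, 0.15, 0.30]
pooled, low, high = pool_fixed_effect(log_rr, se)
print(
    f"pooled RR = {math.exp(pooled):.2f} "
    f"(95% CI {math.exp(low):.2f} to {math.exp(high):.2f})"
)
```

Ratio measures are pooled on the log scale and exponentiated afterwards, which is why the example converts the result back with `math.exp`.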
Classification of the evidence. The next step is to grade the evidence for each outcome of each PICO question. GRADE describes four levels of evidence quality: "high" (A), "moderate" (B), "low" (C) and "very low" (D), which can be applied to randomised trials or observational studies.5 When grading the quality of evidence, randomised controlled trials (RCTs) are initially considered high quality evidence (but may be graded downwards), while observational studies start as low quality evidence (but may be graded upwards). Thus, GRADE does not rely on study design alone when grading the quality of evidence, as can be seen in Table 1. In practice, it is more common to downgrade the quality of evidence than to upgrade it.
Table 1. Factors that alter the quality of evidence according to GRADE.
Factors that may lower the quality of the evidence | |
---|---|
Limitations in study design or execution (risk of bias) | ↓ 1 or 2 grades |
Inconsistency between the results of different studies | ↓ 1 or 2 grades |
Indirectness of the evidence | ↓ 1 or 2 grades |
Imprecision of effect estimates | ↓ 1 or 2 grades |
Suspicion of publication bias | ↓ 1 grade |
Factors that may increase the quality of the evidence | |
Large magnitude of effect | ↑ 1 or 2 grades |
Dose-response gradient | ↑ 1 grade |
Plausible confounding that would have reduced the observed effect | ↑ 1 grade |
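The starting levels and the grade changes in Table 1 can be sketched as simple arithmetic on the four-level scale. This is a deliberate simplification: in GRADE each step reflects a judgement by the panel, not an automatic calculation. The names `LEVELS` and `grade_quality` are hypothetical.

```python
# Quality levels from lowest to highest (D, C, B, A in the text).
LEVELS = ["very low", "low", "moderate", "high"]


def grade_quality(study_design: str, downgrades: int = 0, upgrades: int = 0) -> str:
    """Return the quality level after applying downgrades and upgrades.

    study_design: "rct" starts at high, "observational" starts at low.
    downgrades / upgrades: total number of grades moved down / up,
    summed over the factors judged to apply.
    """
    start = {"rct": 3, "observational": 1}
    if study_design not in start:
        raise ValueError("study_design must be 'rct' or 'observational'")
    if downgrades < 0 or upgrades < 0:
        raise ValueError("downgrades and upgrades must not be negative")
    index = start[study_design] - downgrades + upgrades
    # The scale is bounded: nothing falls below very low or rises above high.
    index = max(0, min(index, len(LEVELS) - 1))
    return LEVELS[index]


# RCTs with serious risk of bias (1 grade) and serious imprecision (1 grade).
print(grade_quality("rct", downgrades=2))
# Observational studies showing a large effect (1 grade up).
print(grade_quality("observational", upgrades=1))
```

Under these assumptions, the first call moves the evidence from high to low and the second from low to moderate.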
Making recommendations. Once all the evidence has been classified and summarised, the second phase of the process, making recommendations, begins. The strength of a recommendation is classified as strong or weak. In strong recommendations, the benefits of an intervention clearly outweigh its risks, so that all or almost all well-informed patients would choose the treatment and the physician can safely recommend it. Weak recommendations reflect therapies in which the benefits and risks are uncertain or more evenly balanced. For such interventions, the physician should assess the evidence underlying the recommendation, and the patient should weigh the treatment options according to his or her preferences.
Future perspectives. Despite the gradual improvement in the quality of CPGs, many experts have suggested further reforms. Their proposals include changing guideline leadership from one edition to the next, setting an expiry date after which a recommendation should be revised, better representation of alternative interpretations and viewpoints, independent scientific review of guidelines, and a rigorous process for managing conflicts of interest.6 In addition, important aspects remain to be improved, such as capturing patients' values and preferences, developing tools that facilitate implementation as an aid to decision-making7 and incorporating costs and resource use into recommendations.8
Conclusion. The GRADE methodology is becoming the most widely used methodological framework for grading the quality of evidence and the strength of recommendations.9 It provides transparent, explicit and comprehensive criteria for downgrading or upgrading the quality of evidence, as well as clear definitions of strong and weak recommendations. Finally, it takes into account the importance of patient outcomes and considers the balance between benefit and risk when formulating recommendations. To maintain their leadership role in guideline development, learned societies should adopt the GRADE methodology for future CPG development.