(1)
Swanson Center, Leawood, KS, USA
Abstract
Cosmetic breast surgery is popularly perceived as artistic. Unfortunately, this notion has allowed nonscientific concepts to persist, without proper scrutiny to establish validity. Without measurements, there is no means to test the effectiveness of surgical methods.
Existing level of evidence scales benefit from modification to include important methodological considerations. Randomization is impractical for elective surgery. However, well-done observational studies can be just as useful. Consecutive patients are needed to avoid selection bias. Prospective studies are initiated before the data are collected, not after. A prospective study among consecutive patients meeting eligibility criteria, with a reported inclusion rate, the use of contemporaneous controls when indicated, and consideration of confounders, is a realistic goal. Such measures are likely to improve study quality. Commercial bias is an endemic problem in medicine. A plastic surgeon may function as a highly paid consultant or as an impartial investigator, but not both.
Patient-reported outcomes are essential in plastic surgery because patient satisfaction is the most important determinant of surgical success. Unfortunately, plastic surgeons are not in the habit of soliciting their patients’ opinion regarding the result. A proprietary psychometric test, known as the BREAST-Q, has limited clinical usefulness. Ad hoc surveys provide useful clinical information that can be used to compare operations. There is no better education than performing outcome studies on one’s patients.
Keywords
Evidence-based medicine · Measurements · Commercial bias · Patient-reported outcome studies · Consecutive patients · Randomization · Observational studies · Methodology · Study design

It is almost taken for granted today that plastic surgeons are artists [1]. Our textbooks are often titled “The Art of Plastic Surgery.” Plastic surgery offices may resemble fine art galleries. With some hubris, plastic surgeons cultivate the public perception that we are artists [2]. Goldwyn [2], longtime former editor of Plastic and Reconstructive Surgery, joked about wishing he were wearing a beret and a paint-spotted frock when asked by a patient if he paints in his spare time.
A recent editorial asks plastic surgeons: which type of artist are you, Michelangelo or Da Vinci? [3]. In reality, the talents of these Renaissance artists might not have been well suited for surgery, which is an empirically based discipline with little use for Neo-Platonism. Being one with a universal force is of limited practical use when it comes to deciding how far to undermine a flap or how much fat to inject. No doubt these legends would have lacked humility, a quality bestowed by the hard experience of surgery, which imposes its own set of limitations and unpredictability on the outcome.
Importantly, neither Michelangelo nor Da Vinci was trained in the scientific method. Michelangelo rejected schooling [4]. Guided by a mystical Neo-Platonic philosophy that was in vogue in Florence at the time, Michelangelo famously claimed that he was releasing the beings captured within the stone [5]. Great as he was, few surgeons would want Michelangelo to be their surgeon, chipping away and trying to liberate a human form in their body, believing he was uniquely touched by genius and divinely inspired [1]. It is not reassuring that Michelangelo had no use for measurements, which perhaps explains why David’s hands, particularly the right hand, are disproportionately large; or perhaps that was intentional (at least that is the contemporary spin) [6]. Unfortunately, by considering themselves artists, plastic surgeons may think that evidence-based medicine does not apply to aesthetic surgery. They may reason: if Michelangelo did not measure his results, why should I? [7]
Galileo, a century later, would finally decouple religion and science, famously saying that God would not have given him the capacity for reason if not for him to use it. In doing so, he helped create the scientific method. Remarkably, Galileo had the insight to reject institutional authority, the humility to subject his ideas to experiments, the diligence to see them through, and the courage to risk his life defending unorthodox findings [1]. Galileo revealed the limitations of intuition. For example, it seemed clear to everyone that a heavier object would fall to the ground faster than a light one. Galileo’s experiments disproved that popular notion [8].
Artists rely on their intuition as a guide. Scientists are trained to question it, aware that the road to ruin is paved with good intuitions. The famed seventeenth-century mathematician and philosopher René Descartes famously commented that doubt is the origin of wisdom [9]. For example, it may be intuitive that manipulating breast tissue can improve upper pole fullness. Only measurements can prove otherwise [10]. Clinical decision-making based on intuition and first principles remains common today, and the need for scientific validation is no less than it was four centuries ago. Ultimately, intuition must give way to the facts.
Turning to one’s inner psyche for guidance in surgery is dangerous and in fact bound to fail, humans being inherently imperfect. We need the scientific method to guide the way. Just as we want our pilots to have good instincts, we also want them to have an altimeter. It is sobering to review our literature and consider how many surgical techniques that were conceived in creative bursts remain grounded because of a lack of scientific validation. In mammaplasty, the number exceeds 100 [10]. Apathy toward science, or a willingness to let the science be outsourced, has real consequences for patients.
Art and science may not be mutually exclusive, but there is an essential difference. An artist uses a medium as a form of self-expression. A scientist seeks to uncover knowledge (and arguably beauty) that already exists, while imparting none of his or her own prejudices regarding what that should be [1]. Plastic surgeons are not really sculptors; we do not fashion marble into an artistic rendering. Our job is to model tissues to improve upon an existing template (cosmetic surgery) or to reconstruct one that has been made deficient through birth, disease, or trauma (reconstructive surgery). We are renovators, not creators. Plastic surgeons may have more in common with the restorers of the Sistine Chapel ceiling than with its creator [1]. Most of us would prefer our surgeon to be respectful of the innate beauty of the human form and not to be inspired to stamp his or her signature on it. Few people would like their nose to be recognized as the work of a particular surgeon [1].
As a product of creativity and imagination, innovation is celebrated [11]. New or repopularized techniques find an audience at meetings. So what is missing? Measurements. Without measurements, no rejuvenation concept is ever proved and none is disproved either, a sort of therapeutic purgatory.
Saying that numerous techniques can deliver the same result is a familiar throwaway line at meetings. As scientists, we do not really believe that, do we? Perhaps it is more accurate to say that without measurements there is no way to ever know. Often the less scientific merit for a claim, the more passionate the proponent. Such claims often follow the lead-in, “I’m a firm believer that…” [1]
Some plastic surgeons suggest that our specialty is too subjective to permit scientific evaluation [12]. In truth, there is always a way of measuring if one puts one’s mind to it. The claim that evidence-based medicine does not apply because plastic surgery is an art, or because it is aesthetic, is no excuse for not measuring. The old axiom applies: what we measure, we improve (and the opposite is true too) [1]. Fortunately, computer imaging has made photographic standardization and measurements easy to perform. Gillies, who reportedly said that the camera was the most important advance in the history of plastic surgery [13], might feel the same way about the computer if he were with us today. Examining one’s consecutive, standardized photographs is an educational experience for which there is no substitute. After doing so, plastic surgeons might be less inclined to promote a “natural breast implant” or an “internal bra.”
Plastic surgeons attend medical school, and not a fine arts academy, for a reason [6, 7]. We need to rededicate ourselves to the scientific method. We need to use a ruler (or its computerized analog) along with a scalpel [6].
Certainly, innovation gives us a competitive advantage [11]. However, so does our professionalism. A commitment to the truth and a resistance to marketing pressures help distinguish plastic surgeons from the wannabes. If we insist on being artists, we risk separating ourselves further from the medical mainstream [14]. No, it is not time to reconsider plastic surgery as a fine art. Cross-training is fine; the importance of an appreciation for aesthetics is unquestioned. But let us not forget our medical foundation.
Evidence-Based Cosmetic Breast Surgery
Until now, no publication has appeared with the words “evidence based” and “cosmetic breast surgery” in the same title. The problem is, “evidence based” has become a cliché. “Evidence-based medicine” is a phrase coined by Guyatt in 1991 [15]. Sackett et al. [16] defined evidence-based medicine as “the conscientious, explicit, and judicious use of current best evidence in making decisions about the care of individual patients through integrating clinical expertise with the best available external clinical evidence from systematic research and the patient’s values.” This definition is open to interpretation as to what exactly constitutes the best available clinical evidence.
In reviewing the plastic surgery literature, it would appear that evidence-based medicine was introduced to plastic surgery in about 2009 [17, 18]. However, physicians have known about the importance of rigorous methodology and study design for decades. These are not new concepts. They have simply been neglected. For example, Brody and Latts [19], in 1981, wrote “established techniques for the conduct of drug trials are well-described in the literature, but none of our plastic surgery writing on this subject betrays any familiarity with a controlled study.” In discussing the etiology of capsular contracture, the authors [19] called for prospective studies, concurrent controls, and reproducible diagnostic criteria. They emphasized the need for “well-established, scientifically valid analysis rather than artistic ‘impressionism’” (i.e., conclusions based on clinical impressions) [19].
Scientific study of cosmetic breast surgery has suffered from a lack of accepted definitions and terms relating to breast shape, and from the absence of a practical measurement system. There has been a noticeable reluctance to use measurements, or even to standardize photographs [20]. As I took my seat after a presentation at the 2016 meeting of the American Society of Plastic Surgeons, a senior co-panelist leaned to me and whispered, “too scientific.” The irony is that I do not make detailed measurements that I follow precisely in surgery. I rarely adhere exactly to my preoperative markings. My final decision regarding nipple placement is made in surgery. I do not use tissue measurements to determine implant size. The time I spend making markings is just a few minutes. The system I developed is used after surgery, for comparison of before-and-after photographs using the same reference plane [21]. It is a means to evaluate and quantify surgical changes later, when I have an opportunity in my office to match photographs and make the measurements. This analysis is the foundation of my work to scientifically evaluate cosmetic breast surgery (Fig. 1.1). In many ways, evidence-based cosmetic breast surgery is measurement-based cosmetic breast surgery.
Fig. 1.1
Studies published by the author in cosmetic breast surgery patients
Levels of Evidence
The lack of science in plastic surgery is well recognized [17, 22–25]. Efforts to incorporate evidence-based medicine [15, 26] in plastic surgery are justified. Both the Level of Evidence [27] and Grade [23] concepts originated in a seminal Canadian Task Force Report published in 1979 [28]. Evidence-based medicine challenges traditional clinical practice based on unsystematic clinical observations, basic principles, common sense, experience, and expert opinions [16, 26, 29, 30]. Ironically, the Level of Evidence classification [27] is itself a product of experience and expert opinion. Evidence-based medicine is not intended to be static, but rather a dynamic, lifelong process [30, 31] that recognizes the need to evolve [16]. There is no grandfather clause that shields it from scientific scrutiny [32]. When analyzed, medical practice guidelines often fall short of methodological standards [32]. About half of all guidelines are outdated within 6 years [33].
Evaluating Evidence-Based Medicine in Plastic Surgery
In 2013, the author used the components of evidence-based medicine [24, 30], including “tracking down the best evidence” and “critically appraising that evidence,” to investigate evidence-based medicine in plastic surgery [34]. A 2-year period of cosmetic surgery publications in Plastic and Reconstructive Surgery, July 2011 through June 2013, was retrospectively evaluated. All articles with a Level of Evidence rating published in the Cosmetic Section were included. Each paper was assigned a quality rating by the author using a new Cosmetic Level of Evidence And Recommendation (CLEAR) scale (Table 1.1). This classification modifies the traditional Level of Evidence ranking [27] and grade of recommendation (Table 1.2) [17, 23–25]. Table 1.3 and Fig. 1.2 compare the classifications. Table 1.4 summarizes the findings.
Table 1.1
Cosmetic Level of Evidence and Recommendation (CLEAR): description of levels and recommendations
Level | Description | Recommendation |
---|---|---|
1. | Randomized trial with a power analysis supporting sample sizes. | A |
2. | Prospective study, high inclusion rate (≥80%), and description of eligibility criteria. Objective measuring device (i.e., not surgeon’s opinion) or patient-derived outcome data. Power analysis if treatment effect is compared. No control or comparative cohort is needed if effect is profound. | A |
3. | Retrospective case-control study using a contemporaneous control group. Prospective clinical study with an inclusion rate <80%. Prospective study without controls or comparison group and a treatment effect that is not dramatic. | B |
4. | Retrospective case series of consecutive patients. Case-control study using historical controls or controls from other publications. Important confounder that might explain treatment effect. | C |
5. | Case report, expert opinion, nonconsecutive case series. | D |
Table 1.2
Grade of recommendation
Grade | Description |
---|---|
A | Conclusion strongly supported by the evidence, likely to be conclusive |
B | Conclusion strongly supported by the evidence |
C | Moderate support based on the evidence |
D | Inconclusive based on the evidence presented |
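Read together, Tables 1.1 and 1.2 amount to a small decision procedure for rating a study. The sketch below is purely illustrative: the field names, the helper function, and the simplifications (it omits, for example, the dramatic-effect and confounder qualifiers) are the editor's assumptions, not part of the published CLEAR scale.

```python
# Illustrative sketch of the CLEAR rating logic in Tables 1.1 and 1.2.
# Field names and decision rules are a paraphrase, not the official scale.

def clear_level(study):
    """Assign an approximate CLEAR level (1-5) from study-design attributes."""
    if study.get("randomized") and study.get("power_analysis"):
        return 1  # randomized trial with power analysis supporting sample sizes
    if (study.get("prospective")
            and study.get("inclusion_rate", 0) >= 0.80
            and study.get("eligibility_criteria")
            and (study.get("objective_measure") or study.get("patient_reported"))):
        return 2  # prospective, high inclusion rate, objective or patient-derived data
    if study.get("prospective") or study.get("contemporaneous_controls"):
        return 3  # e.g., prospective with <80% inclusion, or case-control with controls
    if study.get("consecutive") or study.get("historical_controls"):
        return 4  # retrospective consecutive series, historical controls
    return 5      # case report, expert opinion, nonconsecutive series

# Table 1.2: levels map to recommendation grades; Levels 1 and 2 both earn an A.
GRADES = {1: "A", 2: "A", 3: "B", 4: "C", 5: "D"}

example = {"prospective": True, "inclusion_rate": 0.92,
           "eligibility_criteria": True, "patient_reported": True}
level = clear_level(example)
print(level, GRADES[level])  # -> 2 A
```

Because methodology drives the numeric level, the grade follows directly from it, which is why the CLEAR level and grade pairings (2A, 3B, 4C, 5D) always match in the study described below.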
Table 1.3
Comparison of Level of Evidence (LOE) and Cosmetic Level of Evidence and Recommendation (CLEAR) criteria
Parameter | PRS LOE | CLEAR |
---|---|---|
Study design | ||
Randomization | ✓ | ✓ |
Prospective vs. retrospective | ✓ | ✓ |
Control or comparative cohort | ✓ | ✓ |
Methodology | ||
Consecutive patients | ✓ | |
Power analysis | ✓ | |
Eligibility criteria | ✓ | |
Inclusion rate | ✓ | |
Important confounder | ✓ | |
Dramatic effect | ✓ |
Fig. 1.2
Comparison of the assigned Level of Evidence (LOE) and CLEAR Grade for 87 consecutive studies published in the Cosmetic Section of Plastic and Reconstructive Surgery from July 2011 to June 2013. Two studies were unratable because of study error (Reprinted from Swanson [34]. With permission from Wolters Kluwer Health)
Table 1.4
Study characteristics by CLEAR rating
Study parameter | 2A (%) | 3B (%) | 4C (%) | 5D (%) | All studies (%) |
---|---|---|---|---|---|
No. of studies | 3 | 8 | 30 | 44 | 85 |
Design | |||||
Randomized | 0 (0) | 0 (0) | 1 (3.3) | 3 (6.8) | 4 (4.7) |
Prospective | 3 (100) | 5 (62.5) | 2 (6.7) | 17 (38.6) | 27 (31.8) |
Comparative cohort | 1 (33.3) | 5 (62.5) | 5 (16.7) | 10 (22.7) | 21 (24.7) |
Control | 1 (33.3) | 2 (25.0) | 1 (3.3) | 9 (20.5) | 13 (15.3) |
Methodology | |||||
Consecutive patients | 3 (100) | 8 (100) | 30 (100) | 0 (0) | 41 (48.2) |
Power analysis | 1 (33.3) | 1 (12.5) | 0 (0) | 1 (2.3) | 3 (3.5) |
Description of inclusion criteria | 3 (100) | 8 (100) | 29 (96.7) | 19 (43.2) | 59 (69.4) |
Inclusion rate provided | 3 (100) | 7 (87.5) | 21 (70.0) | 11 (25.0) | 42 (49.4) |
Confounders | 1 (33.3) | 7 (87.5) | 24 (80.0) | 33 (75.0) | 65 (76.5) |
Inclusion rate, % | |||||
Mean | 89.4 | 78.9 | 81.9 | 54.5 | 75.1 |
SD | 10.0 | 14.9 | 26.4 | 42.3 | 30.9 |
Range | 80–100 | 65.3–100 | 23.6–100 | 1.5–100 | 1.5–100 |
Sample sizes | |||||
Mean | 150.3 | 612.8 | 371.1 | 332.1 | 361.8 |
SD | 105.6 | 962.0 | 761.4 | 759.8 | 754.2 |
Range | 30–225 | 20–2971 | 9–3636 | 5–3800 | 5–3800 |
Other | |||||
Discussion of limitations | 3 (100) | 3 (37.5) | 16 (53.3) | 19 (43.2) | 41 (48.2) |
Commercial bias | 0 (0) | 0 (0) | 4 (13.3) | 8 (18.2) | 12 (14.1) |
Discussion accompanying article | 0 (0) | 5 (62.5) | 9 (30.0) | 8 (18.2) | 22 (25.9) |
Forty-eight studies (55%) were designated Level 4 using Plastic and Reconstructive Surgery’s Level of Evidence rating. Three articles were assigned Level 1. Forty-one articles (48%) evaluated consecutive patients or consecutive patients subject to inclusion criteria. Thirty-five studies (40%) consisted of chart reviews and a recording of complication and reoperation rates. Twenty-five studies (29%) reported physical measurements on patients or images. An equal number of studies (29%) featured subjective evaluations of the result by the investigators. Patient-derived data were collected in 18 studies (21%).
Levels of Evidence Hierarchy
A Level 1 study is often considered the “gold standard” of evidence [16, 29, 35, 36]. A Grade A recommendation is usually assigned to such studies [24, 31]. A Level 5 study, on the other hand, constitutes expert opinion that is often open to question. A Level 2 study is a prospective comparison of treatment cohorts, a Level 3 study is a retrospective case-control study, and a Level 4 study is a case series [24].
Grade (A–D) Recommendation
The present grade classification used by the Journal [24] provides recommendations based on current knowledge irrespective of the study. A deficient study could receive an “A” grade if there are existing high-level studies that support its conclusion. The CLEAR grade rates the overall quality of the study itself, regardless of conventional wisdom [34]. A low-quality study that concludes, for example, that smoking increases the complication rate may receive a low grade of recommendation, despite support in the literature. Because methodology is considered in the CLEAR numerical rating (1–5), the grade tends to be closely linked. In this study, the CLEAR Level and Grade always matched (2A, 3B, 4C, and 5D). The traditional Level of Evidence rating does not correlate well with the recommendation grade (ρ = 0.11, not significant) because it does not consider several important quality parameters (Table 1.3).
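The correlation statistic quoted above is Spearman's rho, which is computed on the ranks of paired ratings rather than their raw values. As a hedged, dependency-free illustration (the data below are invented; only the statistic itself comes from the text), the calculation can be sketched as:

```python
# Sketch: Spearman rank correlation (rho), the statistic cited for the
# LOE-vs-grade comparison. Data here are invented for illustration only.

def ranks(values):
    """1-based ranks with ties assigned their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1  # extend over a run of tied values
        avg = (i + j) / 2 + 1  # average of the tied positions, 1-based
        for k in range(i, j + 1):
            r[order[k]] = avg
        i = j + 1
    return r

def spearman_rho(x, y):
    """Pearson correlation applied to the ranks of x and y."""
    rx, ry = ranks(x), ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)

# Perfectly matched ratings (like CLEAR's 2A/3B/4C/5D pairing) give rho near 1,
# whereas a rho of 0.11 indicates almost no association between the pairings.
levels = [2, 3, 4, 5, 4, 3]
grades = [2, 3, 4, 5, 4, 3]  # grades encoded numerically (A=2 ... D=5)
print(round(spearman_rho(levels, grades), 2))  # -> 1.0
```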
Level 1 Studies
Only three studies were designated Level 1. Paradoxically, all three Level 1 studies arrive at unreliable conclusions that encourage the reader to needlessly (1) purchase a six-figure instrument [37, 38], (2) compromise the aesthetic result of an abdominoplasty [39, 40], and (3) deny surgery to one-third of prospective cosmetic rhinoplasty patients [41, 42]. These three Level 1 studies represent just 3% of the total number of publications, equal to the percentage of Level 1 studies published in three major plastic surgery journals from 1998 to 2007 [29]. The frequency of highest-level studies does not appear to be increasing as hoped [29, 36]. It is reasonable to ask whether a randomized trial (the additional descriptors, “controlled” and “prospective,” are redundant) is the ideal model [34].
Randomized Trials and Cosmetic Surgery
Randomized trials balance known and unknown confounders and avoid selection bias [17, 43]. In drug testing, the need to identify a true benefit from a medication, without the influence of other factors, is well known. However, surgery is a much different discipline [29, 44–47].
Unlike a pill, a procedure is not identical from patient to patient [29, 48], placebos and blinding are usually not possible, and randomization is not well accepted by patients [29, 35, 43], surgeons [35, 43, 47], or referral sources [45]. Patients are particularly averse to randomization when the choice involves an operation with irreversible consequences [35, 36, 49]. Solomon and McLeod [50] report that most surgical questions would not be suitable for randomized trials, citing patient resistance, uncommon conditions, and lack of clinical equipoise as the most common reasons. Other shortcomings include a lack of external validity (generalizability) [17, 18, 43, 49], the fact that surgeons are rarely equally proficient in and enthusiastic about two different techniques [46, 49], and cost [18, 43]. Funding is an issue for cosmetic surgeons in practice [35]. Such studies need to be cost-effective [50]. Lack of funding can lead to methodological compromises [51]. Randomized trials suffer from low inclusion rates and recruitment biases, and may be underpowered [18, 49]. In surgery, by the time a randomized trial is conducted, the novel procedure has often been improved [45]. Techniques evolve quickly, particularly in plastic surgery [46].
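The underpowering problem can be made concrete with a standard sample-size calculation for comparing two proportions (normal approximation). The complication rates below are the editor's illustrative numbers, not data from the text; the formula itself is the conventional two-proportion one.

```python
# Sketch: why small surgical trials are underpowered. Standard two-proportion
# sample-size formula (normal approximation); rates chosen for illustration.
from math import ceil

def n_per_group(p1, p2):
    """Approximate patients per arm to detect p1 vs p2,
    two-sided alpha = 0.05, power = 0.80."""
    z_alpha = 1.96   # z for two-sided alpha = 0.05
    z_beta = 0.84    # z for power = 0.80
    p_bar = (p1 + p2) / 2
    num = (z_alpha * (2 * p_bar * (1 - p_bar)) ** 0.5
           + z_beta * (p1 * (1 - p1) + p2 * (1 - p2)) ** 0.5) ** 2
    return ceil(num / (p1 - p2) ** 2)

# Detecting a drop in a complication rate from 10% to 5% takes hundreds
# of patients per arm - beyond most single-surgeon cosmetic practices.
print(n_per_group(0.10, 0.05))  # -> 434
```

The smaller the true difference between techniques, the larger the trial must be, which is one reason so many published comparisons cannot support their conclusions.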
In recent years, the presumed supremacy of the randomized controlled trial has been challenged [34, 49]. Two review articles published in the New England Journal of Medicine showed that observational studies usually produce results similar to randomized trials, and may be more consistent and less prone to reporting contradictory results [53, 54]. Their greater homogeneity provides a broader representation of the general population [53].
Randomized trials are inflexible and do not allow modifications that might better suit individual patients [34]. Inadequate concealment of randomization and treatment assignments can cause serious bias that may exceed the magnitude of the treatment effect [56–58]. Bhandari et al. [52] report that two-thirds of randomized orthopedic trials did not use proper techniques of randomization or concealment. Reviews of randomized trials in plastic surgery uniformly report low quality [36, 59–63].
Randomized trials are beyond the capability of most plastic surgeons [34]. Fortunately, well-done observational studies can work as well or better [34]. Important considerations include a prospective study design, controls, and sound methodology, including consecutive patients, high inclusion rates, clear eligibility criteria, and consideration of confounders [34]. Because observational studies are less expensive than randomized trials, there is less need for outside funding, which avoids commercial bias, a major problem in plastic surgery today [55].
The CLEAR (Cosmetic Level of Evidence And Recommendation) classification includes important methodological criteria that are left out of the existing Level of Evidence classification and a grade system that rates the reliability of a study based on its merits rather than whether the conclusions are supported by the literature [34]. A Grade A recommendation is now shared by randomized and high-level observational studies. The CLEAR classification preserves the same categories from Level 1 to Level 5, but adds overdue modifications [34]. This process is simply the application of the principles of evidence-based medicine to actual evidence-based medicine [34, 49].
Equipoise
Ethical considerations prohibit randomizing patients into two groups when one group receives a known inferior treatment [43]. Cognitive dissonance may prevent a surgeon from finding that one half of his or her randomized patients received an inferior treatment [34, 44]. The investigator may be confronted by a catch-22 [34]. If the surgeon does not believe there is an advantage for the newer method, why is he or she conducting the study in the first place? For example, two studies compared different facelift techniques on each side of the face in the same patient [64, 65]. If the investigators had found that one facelift method was superior, they would also be conceding that one side was treated inferiorly. Not surprisingly, the authors found the techniques to be similarly effective, avoiding this ethical dilemma. If the difference is so slight that there is no consistent evidence one way or the other, the study is probably pointless.