Introduction
Outcome measures play an important role in the field of dermatology, both in routine clinical practice and in clinical research trials. In clinical settings, robust outcome measures serve to classify disease severity, which can direct treatment choice. Outcome measures can also assess changes in disease severity over time. Furthermore, these measures provide a common language between providers that can allow for effective communication of a patient’s status, reducing the subjectivity and ambiguity that arise from the use of clinical descriptors alone. In research settings, outcome measures allow for the systematic assessment of the efficacy of novel therapeutic interventions. Ideally, disease-specific outcome measures designed for use in clinical trials and applied uniformly between studies also make it possible to compare different treatment modalities across studies.
Characteristics of Well-Designed Outcome Measures
Optimal outcome measures for clinical settings must allow for proper discrimination of disease severity. Measures that do not meet this specification may not detect small—but important—changes in disease. Ideally, the outcome measure should demonstrate ease-of-use and have no significant learning curve. Outcome measures for clinical trials may be impractical for busy clinical settings. Outcome measures can be clinician-reported, which assess signs of disease activity, or can be patient-reported, which assess pain or quality of life. Both can support individualized care. Composite measures also combine clinician-reported and patient-reported items into one scale. Poorly designed outcome measures have negative consequences on both routine patient care and the results of clinical trials/drug discovery processes ( Table 13.1 ).
Outcome Measure Characteristic | Description | Consequence(s) of Poorly Designed Measure |
---|---|---|
Discrimination of disease severity | Ability to validly and reliably classify disease magnitude | |
Responsiveness to change | Ability to detect meaningful small and large changes in disease severity over time | |
Reproducibility | Adequate interrater and intrarater reliability | |
The Hidradenitis Suppurativa Core Outcomes Set International Collaboration (HISTORIC): Addressing Current Challenges in Outcome Measure Development and Utilization
Hidradenitis suppurativa (HS) is a complex inflammatory disease which causes significant morbidity in affected patients. Well-designed outcome measures are needed to support an efficient drug discovery and approval process. Currently, there is a wide variety of outcome measures for both clinical and trial use. Many of these instruments are plagued by various shortcomings, including lack of precision, responsiveness to change, or patient-centeredness. The use of various instruments across clinical trials has led to the systematic introduction of bias and makes comparison of results across studies difficult or impossible.
In an effort to alleviate this problem, an international group of researchers, clinicians, and patient research partners is collaborating to develop an agreed-upon Core Outcome Set to be used in all clinical trials examining possible treatment options for HS. This ongoing multimodal effort, the Hidradenitis Suppurativa Core Outcomes Set International Collaboration (HISTORIC), is working to develop a set of core outcome measures that pertains to all stakeholders, including patients, dermatologists, surgeons, nurses, industry, and regulatory authorities. The core domains include (1) disease course; (2) physical signs; (3) HS-specific quality of life; (4) symptoms; (5) pain; and (6) global assessments. This ongoing international effort holds the promise of homogenizing outcome measures between trials to improve clinical trial design and ensure proper assessment of pertinent outcomes.
Current Outcome Measures for Hidradenitis Suppurativa
Clinician-Reported Outcome Measures
Hurley Staging System
First described in 1989, the Hurley Staging System (HSS) was originally developed to determine the best surgical approach for individual patients with HS based on the type of lesions in a given anatomic location. However, despite not being the original intent of the scale, the HSS is now the most widespread outcome measure used for routine clinical assessment in HS. Furthermore, the HSS is recommended by the United States and Canadian Hidradenitis Suppurativa Foundations for routine use in clinical settings due to its ease of application and the existence of therapeutic ladder recommendations based on the scale.
The HSS categorizes patients into one of three groups based on their most severe area of involvement ( Table 13.2 ). Hurley Stage I disease represents the majority of patients with HS and is defined as the presence of one or more isolated abscesses in the absence of scar or sinus tract formation ( Fig. 13.1 ). Hurley Stage II HS is characterized by the presence of recurrent abscesses with sinus tract formation or cicatrization ( Fig. 13.2 ). Hurley Stage III disease is defined by diffuse or near-diffuse involvement across a body site with multiple abscesses and interconnected sinus tracts ( Fig. 13.3 ).
Hurley stage | Defining features |
---|---|
Stage I | ≥ 1 abscesses; no sinus tract or scar formation |
Stage II | ≥ 1 recurrent abscesses with associated sinus tract/scar formation |
Stage III | Diffuse or near-diffuse involvement of affected region; multiple abscesses and interconnected sinus tracts; extensive scarring |
The advantages of the HSS include its ease of use and the minimal training required. It is quick to perform and can feasibly be integrated into busy clinical settings without significant difficulty. Furthermore, the HSS has moderate-to-good interrater reliability. However, the HSS is a static measure initially designed to describe the severity of disease in individual regions in order to determine appropriateness of surgical therapy. As such, it provides a measure of the most severely involved region, rather than an accurate assessment of the overall burden of disease. Additionally, with only three stages of disease, the HSS provides little granularity and demonstrates insufficient responsiveness to change, making it poorly suited for use as an outcome measure in research settings.
Modifications to the Hurley Staging System
A modification of the HSS was developed by the Dutch Hidradenitis Suppurativa Expert Group to further stratify HS into seven overall categories ( Table 13.3 ). Unlike the traditional HSS, the refined HSS incorporates the degree of active inflammation as well as the magnitude of body surface involvement, based on the number of body regions involved. The modifications introduced in this scale expand the stratification of patients traditionally defined as having Hurley Stage I or II disease. The changes may better classify patients who would benefit from systemic therapy and improve detection of small meaningful changes. The refined HSS remains a relatively simple, time-efficient measure that can easily be incorporated into daily clinical practice. However, a disadvantage of this revised scale is that its interrater reliability is only fair, so raters may benefit from training or experience.
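Sinus tracts? | Further criteria | Refined Hurley stage |
---|---|---|
No | ≤ 2 regions AND < 5 abscesses/nodules | Stage IA |
No | > 2 regions OR ≥ 5 abscesses/nodules; fixed lesions | Stage IB |
No | > 2 regions OR ≥ 5 abscesses/nodules; migratory lesions | Stage IC |
Yes | Interconnected sinus tracts involving ≥ 1% BSA: no; inflammation: no | Stage IIA |
Yes | Interconnected sinus tracts involving ≥ 1% BSA: no; inflammation: yes, ≤ 2 regions | Stage IIB |
Yes | Interconnected sinus tracts involving ≥ 1% BSA: no; inflammation: yes, > 2 regions | Stage IIC |
Yes | Interconnected sinus tracts involving ≥ 1% BSA: yes | Stage III |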
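The decision pathway in Table 13.3 can be illustrated with a short sketch. The Python function below is only one possible encoding of the table as read here; the function name and parameter names are hypothetical, and it is not a validated clinical tool.

```python
def refined_hurley_stage(
    sinus_tracts: bool,
    regions_involved: int,
    abscess_nodule_count: int,
    migratory_lesions: bool = False,
    interconnected_tracts_ge_1pct_bsa: bool = False,
    inflammation: bool = False,
) -> str:
    """Return the refined Hurley stage following the branches of Table 13.3.

    All parameter names are illustrative assumptions, not part of the scale.
    """
    if not sinus_tracts:
        # Branch without sinus tracts: Stage IA, IB, or IC.
        if regions_involved <= 2 and abscess_nodule_count < 5:
            return "IA"
        # > 2 regions OR >= 5 abscesses/nodules: fixed vs migratory lesions.
        return "IC" if migratory_lesions else "IB"

    # Branch with sinus tracts: Stage IIA, IIB, IIC, or III.
    if interconnected_tracts_ge_1pct_bsa:
        return "III"
    if not inflammation:
        return "IIA"
    return "IIB" if regions_involved <= 2 else "IIC"
```

For example, a patient with no sinus tracts, three involved regions, and fixed lesions would fall into Stage IB under this reading of the table.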
Inflammatory Lesion Counts
Inflammatory lesion counts, which involve separately determining the number of inflammatory nodules, abscesses, and draining fistulae/sinus tracts in each patient, are a frequent method of assessing HS severity. Although inflammatory lesion counts form the basis of many other HS-specific outcome scales, they can also function in isolation and have been used in this manner in clinical trials. In contrast to the HSS, which is a discrete measure ranging from 1 to 3, inflammatory lesion counts are open-ended with no upper limit. This allows for the detection of small changes in disease activity, as determined by either the resolution of existing lesions or the development of new lesions. However, the interrater reliability of lesion counts was found to be poor to moderate. Reliability matters because the lower it is, the higher the chance that a change in score reflects measurement error rather than a real change in disease. In addition, lesion counts are purely clinician-reported and do not take into account important patient-reported aspects of HS, including pain; therefore, lesion counts (when used alone) may underestimate the severity of disease from the patient’s perspective. Furthermore, clinician assessment of lesion counts at predetermined time points (such as routine office visits) will not capture changes in HS severity that occur between evaluations. This section describes several of the lesion count-based outcome measures.
Sartorius Score (Hidradenitis Suppurativa Score)
The Sartorius score (also known as the Hidradenitis Suppurativa Score) was originally proposed as an alternative staging system for HS by Karin Sartorius in 2003 (Sartorius et al., 2003). The original purpose of the instrument was to serve as a uniform outcome measure that could be applied to cohort studies investigating surgical treatment of HS while providing a more dynamic and precise scale compared to the HSS. The Sartorius score is an open-ended scoring system that takes into account four clinician-reported variables ( Table 13.4 ). These include the number of anatomic regions involved, the number and types of lesions, the distance between lesions, and the absence or presence of normal intervening skin between lesions. Each characteristic is assigned a point value, and the points are summed to calculate an overall score that can track disease severity over time. Because the scale is dynamic and open-ended, it is better suited to determining response to treatment than the traditional HSS. For this reason, it has been used as an outcome measurement instrument in clinical trials. Like the HSS, the Sartorius score is relatively easy to combine with patient-reported outcome measures like the Dermatology Life Quality Index (DLQI) and was designed with this purpose in mind. Although the Sartorius scale provides a granular and dynamic measure of disease activity, it can be time consuming. Therefore, additional training may be required, and although it has been used successfully in research settings, implementation into ordinary clinical workflow is difficult.
Variable | Original Sartorius Score | Modified Sartorius Score |
---|---|---|
Anatomic regions considered: axilla (left, right), groin (left, right), gluteal (left, right), inframammary (left, right), other | 3 points per region involved | 3 points per region involved |
Lesion type (points per lesion) | Abscess/nodules: 2; fistulae: 4; scars: 1; other: 1 | Nodules: 1; fistulae: 6 |
Longest distance between two relevant lesions (points) | < 5 cm: 2; 5–10 cm: 4; > 10 cm: 8 | < 5 cm: 1; 5–10 cm: 3; > 10 cm: 9 |
Lesions clearly separated by normal skin? (points) | Yes: 0; no: 6 | Yes: 0; no: 9 |
Modified Sartorius Score
The revised Sartorius score, like the original, considers the number and types of lesions present, the distance between lesions, and the absence or presence of normal skin between lesions. However, the revised score involves a simplified assessment scale meant to make the instrument more practical to use by clinicians. For example, the number of lesion types in the revised score has been dichotomized into nodules and fistulae for simplicity (see Table 13.4 ). Furthermore, the number of points assigned to each variable has been altered to allow for more granular comparison of a patient’s clinical status between time points. Although the modified Sartorius scale is still time consuming to perform compared to more traditional methods such as the HSS, after training, it can reliably be performed in about 5 minutes, depending on disease severity.
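To make the arithmetic of Table 13.4 concrete, the following Python sketch sums the four components for either variant. It rests on several assumptions that the table does not settle: the distance and normal-skin items are scored once per assessment here (published versions of the score may apply them per region), and the function name, dictionary keys, and parameter names are hypothetical.

```python
# Point values copied from Table 13.4; the dictionary layout is illustrative.
SARTORIUS_POINTS = {
    "original": {
        "per_region": 3,
        "per_lesion": {"abscess_nodule": 2, "fistula": 4, "scar": 1, "other": 1},
        "distance": (2, 4, 8),   # < 5 cm, 5-10 cm, > 10 cm
        "not_separated": 6,
    },
    "modified": {
        "per_region": 3,
        "per_lesion": {"nodule": 1, "fistula": 6},
        "distance": (1, 3, 9),   # < 5 cm, 5-10 cm, > 10 cm
        "not_separated": 9,
    },
}


def sartorius_score(
    variant: str,
    regions_involved: int,
    lesion_counts: dict[str, int],
    longest_distance_cm: float,
    separated_by_normal_skin: bool,
) -> int:
    """Sum the four Table 13.4 components (assumed scored once per assessment)."""
    points = SARTORIUS_POINTS[variant]
    score = points["per_region"] * regions_involved

    # An unknown lesion type raises KeyError rather than being silently ignored.
    for lesion_type, count in lesion_counts.items():
        score += points["per_lesion"][lesion_type] * count

    if longest_distance_cm < 5:
        score += points["distance"][0]
    elif longest_distance_cm <= 10:
        score += points["distance"][1]
    else:
        score += points["distance"][2]

    if not separated_by_normal_skin:
        score += points["not_separated"]
    return score
```

Under these assumptions, a patient with two involved regions, three nodules, one fistula, a longest inter-lesion distance of 7 cm, and lesions separated by normal skin would score 6 + 3 + 6 + 3 + 0 = 18 on the modified scale.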
Hidradenitis Suppurativa Clinical Response
The Hidradenitis Suppurativa Clinical Response (HiSCR) is a validated outcome measure which was specifically developed for the evaluation of anti-inflammatory treatments for HS. The HiSCR was designed retrospectively using data from a Phase II trial. Like the Sartorius score, the HiSCR is based on inflammatory lesion counts; however, the HiSCR is a dichotomous outcome scale that divides patients into two groups: HiSCR achievers and non-achievers. HiSCR achievers are defined as those who demonstrate (1) a reduction of 50% or more in the number of abscesses and inflammatory nodules, (2) no increase in the number of abscesses, and (3) no increase in the number of draining fistulae, compared to the baseline counts. The HiSCR was validated against existing HS-specific outcome measures and is relatively simple to use. HiSCR achievement has been shown to correlate with both clinically meaningful changes in status and improvement in patient-reported outcomes (e.g., reduction in DLQI and visual analogue pain scales). In a Phase II trial of adalimumab in the treatment of HS, the HiSCR demonstrated excellent responsiveness to change in clinical status. However, because the HiSCR represents a dichotomous outcome, it has limited granularity compared to continuous outcome scales and therefore is not useful in further sub-stratifying patients with severe disease in clinical settings, although this was not its original intent. Further, the interrater reliability for abscess and inflammatory nodule counts that serve as the basis of the HiSCR was demonstrated to be relatively low (intraclass correlation coefficient 0.44) in a comparative study examining multiple outcome measures.
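The three HiSCR criteria can be expressed as a simple check against baseline counts. The sketch below is a minimal illustration in Python; the `LesionCounts` class and function name are hypothetical, and raising an error when the baseline abscess and nodule count is zero is an assumption made here, since a percentage reduction is undefined in that case.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LesionCounts:
    """Clinician-reported counts at one visit (field names are illustrative)."""
    abscesses: int
    inflammatory_nodules: int
    draining_fistulae: int


def is_hiscr_achiever(baseline: LesionCounts, follow_up: LesionCounts) -> bool:
    """Return True if all three HiSCR criteria are met relative to baseline."""
    baseline_an = baseline.abscesses + baseline.inflammatory_nodules
    follow_up_an = follow_up.abscesses + follow_up.inflammatory_nodules
    if baseline_an == 0:
        # Assumption: with no baseline lesions, a 50% reduction cannot be assessed.
        raise ValueError("baseline abscess and nodule count must be positive")

    # (1) >= 50% reduction in abscess + inflammatory nodule count,
    #     written with integers to avoid floating-point rounding.
    reduced_by_half = follow_up_an * 2 <= baseline_an
    # (2) no increase in abscesses; (3) no increase in draining fistulae.
    no_abscess_increase = follow_up.abscesses <= baseline.abscesses
    no_fistula_increase = follow_up.draining_fistulae <= baseline.draining_fistulae
    return reduced_by_half and no_abscess_increase and no_fistula_increase
```

Because the result is a single true/false value, two patients with very different residual disease can both be classified as achievers, which illustrates the limited granularity of a dichotomous outcome.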
International Hidradenitis Suppurativa Severity Scoring System
The International Hidradenitis Suppurativa Severity Scoring System (IHS4) is an expert consensus-based scale developed by members of the European Hidradenitis Suppurativa Foundation using the Delphi method. After development of a preliminary scale, a multicenter, prospective validation study of 210 patients was performed in an effort to correlate the results of the IHS4 with other existing outcome measures and to determine strengths and weaknesses of the novel instrument. Following this, a second Delphi voting procedure was undertaken in order to optimize the scale based on the findings from the validation study; from this, the finalized version of the IHS4 was developed. The clinician determines the number of inflammatory nodules, abscesses, and draining tunnels present. Each lesion type carries a different weight (nodules ×1, abscesses ×2, and fistulae/sinuses ×4); the count of each lesion type is multiplied by its weight, and the sum of these products is the total score ( Table 13.5 ). The total score can be categorized as mild (≤ 3 points), moderate (4 to 10 points), or severe (> 10 points) disease. One important feature of the IHS4 is that the presence of a single draining tunnel (fistula/sinus tract) automatically designates a patient as having at least moderate disease, because one tunnel alone contributes 4 points.
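The IHS4 calculation described above is a weighted sum followed by a threshold lookup. The Python sketch below shows that arithmetic using the weights and cut-offs given in the text; the function names are hypothetical, and the code is an illustration rather than a clinical implementation.

```python
def ihs4_score(nodules: int, abscesses: int, draining_tunnels: int) -> int:
    """Weighted lesion sum: nodules x1, abscesses x2, draining tunnels x4."""
    return nodules * 1 + abscesses * 2 + draining_tunnels * 4


def ihs4_category(score: int) -> str:
    """Map a total IHS4 score to the severity categories given in the text."""
    if score <= 3:
        return "mild"
    if score <= 10:
        return "moderate"
    return "severe"


# Example: 2 nodules, 1 abscess, and 1 draining tunnel -> 2 + 2 + 4 = 8 points.
example_score = ihs4_score(nodules=2, abscesses=1, draining_tunnels=1)
example_category = ihs4_category(example_score)
```

Note that `ihs4_score(0, 0, 1)` already yields 4 points, which is why a single draining tunnel places a patient in at least the moderate category.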