Objective Scoring Systems for Disease Activity in Autoimmune Bullous Disease

Objective evaluation of disease activity in autoimmune bullous disease (AIBD) is important both for the clinical assessment of patients and as an outcome measure in clinical trials. Measures need to be general enough to capture the features specific to each of the bullous dermatoses but specific enough to capture any change in an individual patient’s disease status. Different tools have been put forward over the last 15 years, but at present the Autoimmune Bullous Skin Disorder Intensity Score and the Pemphigus Disease Area Index seem to be the most promising tools for assessing disease activity in AIBD.

Autoimmune bullous disease (AIBD) encompasses a range of rare dermatoses, such as pemphigus and bullous pemphigoid, characterized by the development of autoantibodies directed against keratinocyte epitopes. The clinical presentation of AIBD is heterogeneous, but all forms manifest with the formation of bullae and erosions. Due to the rarity of these diseases, there is a paucity of studies in the literature guiding clinicians in the optimal evidence-based management of AIBD. A recent systematic review of outcome measures in pemphigus over the past 25 years identified 116 different measures across 96 articles. Most studies employ outcome measures based on nonvalidated subjective or nonspecific ratings of disease activity. Because of the variety of measures employed (including lesion counts, complete healing of lesions, duration of remission, and number of recurrences), the absence of uniform outcome measures across studies is a significant obstacle to comparing therapeutic modalities. This is particularly relevant because the rarity of AIBD means that studies often have small patient numbers, and meta-analysis combining studies is possible only if uniform outcome measures are available. A number of scoring systems have been developed in recent years in an attempt to provide an objective measure of disease activity in AIBD. Such measures function alongside the clinical assessment of patients, allowing objective assessment of disease activity in a patient and the monitoring of disease trajectory over time. They can also be used as outcome measures in clinical trials to quantify the effectiveness of clinical intervention.

One of the earliest disease activity scores employed was the Pemphigus Area and Activity Score (PAAS). The PAAS divides the body into four divisions (head, trunk, upper limbs, and lower limbs). Each division is assigned a score based on the number of new blisters, the extension of existing blisters, and the presence of the Nikolsky sign; this score is then multiplied by the area involved and an index, and the four resulting scores are totaled. Patients with mucosal involvement are also assigned a mucous membrane score, obtained by adding the number of mucosal sites involved to a severity score. The PAAS was one of the first objective tools to be employed in the setting of AIBD. It has the advantage of scoring cutaneous and mucosal lesions separately. Additionally, it employed the Nikolsky sign, which is a sensitive but not necessarily specific marker of disease activity. However, the system was limited by the fact that scores were weighted heavily by the area of skin involvement and by its inability to detect small changes in disease activity. Furthermore, the tool was not truly objective, as severity was assessed with variables graded “mild,” “moderate,” and “severe” at the user’s discretion. Last, lesion counting is notoriously inaccurate when there are a high number of lesions, as assessed by intra-rater and inter-rater validity studies, and no such studies were performed on this system.

Another scoring system, employed in a study reviewing the incidence of remission in pemphigus patients, graded disease activity on a scale from 0 to 10. Extent was scored from 0 to 4 depending on whether predefined areas of the body were affected, and therapy was scored from 0 to 6 depending on the dose of oral corticosteroid and the need for adjuvant immunosuppression. Although this system had the advantage of considering therapeutic data, therapy was overrepresented as a proxy for disease activity, clinical information was reviewed only superficially, and the scoring system was never validated.

Another study investigating the correlation between antibody titers and clinical severity in pemphigus used a simple arbitrary scale of 0 to 3 to assess cutaneous and oral lesions. Skin lesions were graded as quiescent (no lesions), mild (<5 discrete lesions), moderate (6–19 discrete lesions), or severe (>20 lesions or extensive confluent erosions). Oral lesions were graded as quiescent (no lesions), mild (≤3 erosions), moderate (4–9 erosions or generalized desquamative gingivitis), or severe (≥10 lesions or extensive, confluent erosions or generalized desquamative gingivitis with discrete erosions at other sites). This simple tool allows an expedient and objective assessment of disease activity but lacks the sensitivity required to detect changes in disease status. More importantly, it relies on the number of blisters present rather than the size of each blister, the area involved, or the severity of the lesions. Again, no independent validity assessments of this score were performed.
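The grading rules above can be written as a short Python sketch, offered only as an illustration of how coarse the scale is. The function names and parameters are hypothetical. Two points are assumptions rather than part of the published scale: the skin thresholds as quoted leave counts of exactly 5 and 20 unassigned, and they are placed in the moderate band here; and the erosion count is taken to mean discrete erosions, so that generalized desquamative gingivitis accompanied by any such erosion is graded severe.

```python
def grade_skin(lesion_count: int, extensive_confluent: bool = False) -> int:
    """Return 0 (quiescent), 1 (mild), 2 (moderate), or 3 (severe)."""
    if lesion_count < 0:
        raise ValueError("lesion_count must be non-negative")
    if extensive_confluent or lesion_count > 20:
        return 3
    if lesion_count == 0:
        return 0
    if lesion_count < 5:
        return 1
    # The text gives 6-19 for moderate; counts of exactly 5 and 20
    # are placed here by assumption, as the text does not assign them.
    return 2


def grade_oral(erosion_count: int,
               desquamative_gingivitis: bool = False,
               extensive_confluent: bool = False) -> int:
    """Return 0 (quiescent), 1 (mild), 2 (moderate), or 3 (severe)."""
    if erosion_count < 0:
        raise ValueError("erosion_count must be non-negative")
    if extensive_confluent or erosion_count >= 10:
        return 3
    if desquamative_gingivitis:
        # Assumption: any discrete erosion alongside generalized
        # desquamative gingivitis counts as "erosions at other sites".
        return 3 if erosion_count > 0 else 2
    if erosion_count == 0:
        return 0
    if erosion_count <= 3:
        return 1
    return 2  # 4-9 erosions
```

Under these rules a patient who improves from 19 to 6 skin lesions remains in the moderate grade, which illustrates the limited sensitivity to change noted above.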

A retrospective case series of pemphigus patients stratified disease severity into four categories based on body surface area (BSA) involvement and functional impairment. Patients were classified as having mild disease (≤10% BSA involvement or disease limited to oral mucosa, ability to carry out activities of daily living [ADL] without discomfort), moderate disease (10%–25% BSA involvement, able to carry out ADL with discomfort), severe disease (25%–50% BSA involvement and oral involvement, unable to carry out ADL), or extensive disease (>50% BSA involvement with mucosal involvement, bedridden or with complications). Although this was useful for a gross assessment of disease severity, the categories, much like those of earlier scoring systems, were too broad to detect changes in pemphigus activity accurately and were not constructed with this aim. Furthermore, disease extent was combined with a functional scale, and the two variables did not necessarily correlate.
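The difficulty of combining extent with function can be illustrated with a minimal Python sketch that classifies each axis separately and reports whether the two agree. The names are hypothetical, the mucosal criteria are omitted for brevity, and the handling of the shared boundaries (10%, 25%, and 50% are assigned to the lower category) is an assumption, since the quoted ranges overlap at those values.

```python
CATEGORIES = ("mild", "moderate", "severe", "extensive")

# Functional (ADL) descriptions mapped to the category they define.
ADL_CATEGORY = {
    "without_discomfort": "mild",
    "with_discomfort": "moderate",
    "unable": "severe",
    "bedridden_or_complications": "extensive",
}


def category_from_bsa(bsa_percent: float) -> str:
    """Classify by body surface area involvement alone."""
    if not 0 <= bsa_percent <= 100:
        raise ValueError("bsa_percent must be between 0 and 100")
    # Assumption: boundary values fall into the lower category.
    if bsa_percent <= 10:
        return "mild"
    if bsa_percent <= 25:
        return "moderate"
    if bsa_percent <= 50:
        return "severe"
    return "extensive"


def classify(bsa_percent: float, adl: str) -> tuple:
    """Return (BSA category, ADL category, whether the two agree)."""
    bsa_cat = category_from_bsa(bsa_percent)
    adl_cat = ADL_CATEGORY[adl]
    return bsa_cat, adl_cat, bsa_cat == adl_cat
```

A patient with 8% BSA involvement who is unable to carry out ADL would be returned as ("mild", "severe", False), a discordant case that a single combined category cannot represent.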

A novel approach to assessing disease activity in oral pemphigus was devised by Saraswat and Kumar. The scoring system is divided into two parts, assessing the extent of disease and the consequent functional impairment. Eleven sites in the oral cavity (the upper and lower labial mucosa, upper and lower gingival mucosa, left and right buccal mucosa, dorsal and ventral lingual mucosa, hard and soft palate, and uvula) are each assigned a score of 1 or 0 according to the presence or absence of lesions, regardless of severity. Functional impairment is assessed using a modified version of a validated grading system employed in gastroenterology, which uses information on the frequency of pain and bleeding with nine different food groups. The investigators argued that, because oral lesions tend to coalesce, lesion counts are not a valid measure in this setting. Additionally, patients can show an improvement in odynophagia and in their ability to eat solids without gross changes in lesion appearance, so a consideration of function is warranted. This tool is constrained by the fact that it is limited to oral lesions and is of little utility in patients without oral involvement; nevertheless, it paved the way for future validated scoring systems.
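The extent component described above amounts to counting affected sites, as the following Python sketch shows. The site identifiers and the function name are hypothetical labels chosen for illustration. The functional component is not sketched because its item-level scoring is not given here.

```python
# The eleven oral sites listed in the text (identifiers are illustrative).
ORAL_SITES = (
    "upper_labial_mucosa", "lower_labial_mucosa",
    "upper_gingival_mucosa", "lower_gingival_mucosa",
    "left_buccal_mucosa", "right_buccal_mucosa",
    "dorsal_lingual_mucosa", "ventral_lingual_mucosa",
    "hard_palate", "soft_palate", "uvula",
)


def oral_extent_score(affected_sites) -> int:
    """Score 1 for each site with any lesion, regardless of severity (0-11)."""
    affected = set(affected_sites)  # a site counts once, however many lesions
    unknown = affected - set(ORAL_SITES)
    if unknown:
        raise ValueError(f"unknown oral site(s): {sorted(unknown)}")
    return len(affected)
```

Because each site contributes at most one point, a single small erosion and extensive confluent erosions at the same site score identically, which is why the functional assessment is needed alongside it.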

Part of the difficulty in devising an objective scoring system for AIBD is the clinical heterogeneity amongst the blistering dermatoses. A tool needs to be general enough to capture the areas predominating in each blistering disease but specific enough to identify changes in a given patient’s disease activity. The Autoimmune Bullous Skin Disorder Intensity Score (ABSIS) was devised by Eming and Hertl in Germany, with the aim of developing a scoring system sensitive enough to capture such changes in disease activity in AIBD. Scoring is based on the extent of BSA involved and the quality of the skin lesions. The extent of BSA affected is estimated using the “rule of nines,” in which defined areas of the body are equivalent to 9% or multiples of 9%. This value is then multiplied by an index reflecting the most dominant lesion present: 1.5 (erosive, exudative lesions, bullae, or Nikolsky sign positivity), 1.0 (erosive, dry lesions), or 0.5 (reepithelialized lesions) (Fig. 1). Oral involvement is evaluated using a modified version of the grading system employed by Saraswat and Kumar. These components are summed to give a total score ranging from 0 to 206: 150 for skin involvement, 11 for oral involvement, and 45 for oral functional impairment. Two validity studies have been performed on the ABSIS.
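The arithmetic of the ABSIS as described above can be sketched in Python as follows. This is a minimal illustration rather than the published scoring sheet: the function and key names are hypothetical, a single weighting index is applied to the total affected BSA as the description states, and the oral extent (0 to 11) and oral functional (0 to 45) components are taken as already-scored inputs.

```python
# Weighting index for the most dominant lesion type, as given in the text.
LESION_INDEX = {
    "erosive_exudative": 1.5,  # erosive exudative lesions, bullae, Nikolsky positive
    "erosive_dry": 1.0,
    "reepithelialized": 0.5,
}


def absis_skin(bsa_percent: float, dominant_lesion: str) -> float:
    """Skin component: affected BSA (%) times the lesion index (0-150)."""
    if not 0 <= bsa_percent <= 100:
        raise ValueError("bsa_percent must be between 0 and 100")
    return bsa_percent * LESION_INDEX[dominant_lesion]


def absis_total(bsa_percent: float, dominant_lesion: str,
                oral_extent: int, oral_function: float) -> float:
    """Sum of skin, oral extent, and oral functional components (0-206)."""
    if not 0 <= oral_extent <= 11:
        raise ValueError("oral_extent must be between 0 and 11")
    if not 0 <= oral_function <= 45:
        raise ValueError("oral_function must be between 0 and 45")
    return absis_skin(bsa_percent, dominant_lesion) + oral_extent + oral_function
```

For example, 18% BSA involvement with predominantly erosive, exudative lesions gives a skin component of 27, and the maximum inputs (100% BSA at index 1.5, 11 oral sites, 45 functional points) reproduce the stated ceiling of 206.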