International Journal of Radiation Oncology Biology Physics
www.redjournal.org
Clinical Investigation
Voice Quality After Treatment of Early Vocal Cord Cancer: A Randomized Trial Comparing Laser Surgery With Radiation Therapy
Leena-Maija Aaltonen, MD, PhD,* Noora Rautiainen, MA,† Jaana Sellman, PhD,† Kauko Saarilahti, MD, PhD,‡ Antti Mäkitie, MD,* Heikki Rihkanen, MD, PhD,* Jussi Laranne, MD, PhD,§ Leenamaija Kleemola, MD, PhD,§ Tuija Wigren, MD, PhD,‖ Eeva Sala, MD, PhD,¶ Paula Lindholm, MD, PhD,# Reidar Grenman, MD,¶ and Heikki Joensuu, MD‡
*Department of Otorhinolaryngology–Head and Neck Surgery, Helsinki University Central Hospital, and University of Helsinki; †Institute of Behavioural Sciences, University of Helsinki; ‡Department of Oncology, Helsinki University Central Hospital, and University of Helsinki, Helsinki, Finland; §Department of Otorhinolaryngology–Head and Neck Surgery, Tampere University Hospital, and University of Tampere; ‖Department of Oncology, Tampere University Hospital, and University of Tampere, Tampere, Finland; ¶Department of Otorhinolaryngology–Head and Neck Surgery, Turku University Hospital, and University of Turku; and #Department of Oncology, Turku University Hospital, and University of Turku, Turku, Finland
Received Apr 23, 2014, and in revised form Jun 5, 2014. Accepted for publication Jun 10, 2014.
Summary This first randomized study concerning early laryngeal cancer patients’ voice quality showed that patients treated with radiation therapy or with transoral laser surgery had similar overall voice quality, but radiation therapy resulted in a less breathy voice. Those treated with radiation therapy reported less voice-related
Objective: Early laryngeal cancer is usually treated with either transoral laser surgery or radiation therapy. The quality of voice achieved with these treatments has not been compared in a randomized trial.
Methods and Materials: Male patients with carcinoma limited to 1 mobile vocal cord (T1aN0M0) were randomly assigned to receive either laser surgery (n = 32) or external beam radiation therapy (n = 28). Surgery consisted of tumor excision with a CO2 laser with the patient under general anaesthesia. External beam radiation therapy to the larynx was delivered to a cumulative dose of 66 Gy in 2-Gy daily fractions over 6.5 weeks. Voice quality was assessed at baseline and 6 and 24 months after treatment. The main outcome measures were expert-rated voice quality on a grade, roughness, breathiness, asthenia, and strain (GRBAS) scale, videolaryngostroboscopic findings, and the patients’ self-rated voice quality and its impact on activities of daily living.
Results: Overall voice quality between the groups was rated similar, but voice was more breathy and the glottal gap was wider in patients treated with laser surgery than
Reprint requests to: Leena-Maija Aaltonen, MD, PhD, Department of Otorhinolaryngology–Head and Neck Surgery, Helsinki University Central Hospital, PO Box 220, FI-00029 HUS, Finland. Tel: (+358) 504271493; E-mail: leena-maija.aaltonen@hus.fi
Int J Radiation Oncol Biol Phys, Vol. 90, No. 2, pp. 255-260, 2014. 0360-3016/$ - see front matter © 2014 Elsevier Inc. All rights reserved. http://dx.doi.org/10.1016/j.ijrobp.2014.06.032
Conflict of interest: none.
Acknowledgments: The authors thank the members of the Finnish Head and Neck Oncology Working Group and Timo Muhonen, MD, PhD for collaboration and advice, and Carol Norris, PhD for language editing.
in those who received radiation therapy. Patients treated with radiation therapy reported less hoarseness-related inconvenience in daily living 2 years after treatment. Three patients in each group had local cancer recurrence within 2 years from randomization.
Conclusions: Radiation therapy may be the treatment of choice for patients whose requirements for voice quality are demanding. Overall voice quality was similar in both treatment groups, however, indicating a need for careful consideration of patient-related factors in the choice of a treatment option. © 2014 Elsevier Inc.
handicap. These results suggest that radiation therapy may be a treatment of choice for patients whose requirements for voice quality are demanding. However, careful consideration of patient-related factors is essential when choosing the treatment option.
Introduction
Patients with early glottal carcinoma are usually treated with either radiation therapy or transoral laser surgery (TLS). The choice between these treatments is controversial, and treatment may vary by geographic region (1). The current evidence is insufficient to guide management decisions (2), and a recent systematic review and meta-analysis failed to identify a single randomized study comparing TLS with radiation therapy (1). The meta-analysis suggested that a higher larynx preservation rate can be achieved with TLS, but owing to the generally poor quality of the evidence, this result was considered inconclusive. The relative merits of TLS and radiation therapy for local cancer control, the functional outcomes including voice quality, and laryngeal preservation remain unknown. Inasmuch as most (85%-95%) laryngeal carcinomas limited to 1 vocal cord are cured either by TLS or by radiation therapy (3, 4), a key objective of treatment beyond recovery is maintaining good voice quality. We compared in the present randomized study the effects of TLS and external beam radiation therapy on quality of voice in a patient population with early glottal carcinoma limited to 1 vocal cord.
Methods and Materials
Design
The purpose of this randomized, multicenter, parallel-group study was to compare TLS with external beam radiation therapy as the primary treatment for early glottal carcinoma limited to 1 vocal cord. The primary endpoint was voice quality 2 years after treatment.
Setting
Given that the outcome of laser surgery, in particular, may depend on the operator’s skills, the study took place in large referral hospitals. The 3 largest university hospitals in Finland participated.
Patients
Eligible patients had previously untreated, histologically confirmed squamous cell carcinoma limited to 1 mobile vocal cord, staged as T1aN0M0 (5), and were capable of collaboration in voice evaluation tests. To achieve homogeneity in voice quality assessments, and because glottal cancer is infrequent in women (6), female patients were not included. The study protocol was approved by an ethics committee of the Helsinki and Uusimaa Hospital District. The patients provided written consent before entry into the study. The study was not registered (trial registration was not the norm in 1998).
Randomization
Study participants were randomized to the treatments at a 1:1 ratio by means of a computer program with random digits weighted according to the proportions of past randomizations to yield a roughly balanced number of randomizations between the groups in a concealed fashion. Randomization was carried out by a hospital staff member who was independent of the study. Tumor site in the vocal cord was used as a stratification factor at randomization. The result of randomization was communicated to the centers by phone.
Treatments
Patients allocated to radiation therapy had their treatments started within 6 weeks after randomization. The larynx was irradiated with 6-MeV photons from 2 opposing 4.5 × 4.5 to 5 × 5 cm wedge fields to a total cumulative dose of 66 Gy in 2-Gy daily fractions over 6.5 weeks with a linear accelerator. When necessary an anterior bolus was added to achieve the desired dose at the anterior commissure. The uniformity criteria within the planned target volume were defined according to the International Commission on Radiation Units and Measurements Report 50 (7). The clinical target volume encompassed the larynx with no attempt to irradiate the regional lymphatics. Patients assigned to TLS had the tumor excised under general anesthesia within 6 weeks from randomization by
use of a CO2 laser. The operations were performed as described elsewhere (8, 9) by only 7 experienced senior surgeons. In short, the tumors were first split, and tumor tissue was removed down to a macroscopically healthy muscle layer. After tumor excision, small biopsy specimens were taken to ensure complete (R0) removal.
that a sample size of 30 patients per group could yield a significant difference in the primary endpoint between the groups. The statistical analyses were done with a PASW Statistics 18 computer program (SPSS Inc, Chicago, IL). Repeated measurements between the groups over time were tested with repeated-measures analysis of variance. When an assumption for sphericity was violated, the degrees of freedom were corrected with Greenhouse-Geisser estimates of sphericity. Frequency tables were compared with the χ² test or Fisher exact test. Relationships between the aspects of voice quality (the GRBAS components), self-rated quality of voice, impact of voice quality on activities of daily living, and videolaryngostroboscopic findings were analyzed by computing the Pearson correlation coefficient (r). To assess interrater and intrarater reliability in judging voice quality, Cronbach α was calculated on each of the 5 dimensions of the GRBAS scale. The P values are 2-sided.
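As an illustration of two of the computations named above, the following Python sketch computes Cronbach α for a set of raters and a 2-sided Fisher exact P value for a 2 × 2 frequency table. It is not the authors' PASW procedure: the function names are hypothetical, and the 2-sided rule used here (summing the probabilities of all tables that are no more probable than the observed one) is one common convention, assumed for illustration.

```python
from math import comb
from statistics import variance


def cronbach_alpha(ratings):
    """Cronbach's alpha for k raters; ratings[i][j] is rater i's score for sample j."""
    k = len(ratings)
    if k < 2:
        raise ValueError("need at least two raters")
    n = len(ratings[0])
    if n < 2 or any(len(r) != n for r in ratings):
        raise ValueError("each rater must score the same two or more samples")
    # Sample variance of each rater's scores, and of the per-sample summed scores.
    rater_variances = [variance(r) for r in ratings]
    totals = [sum(scores) for scores in zip(*ratings)]
    total_variance = variance(totals)
    if total_variance == 0:
        raise ValueError("summed scores do not vary; alpha is undefined")
    return k / (k - 1) * (1 - sum(rater_variances) / total_variance)


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact P value for the 2 x 2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum every table at least as extreme (no more probable) than the observed one.
    total = sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)  # small tolerance for float ties
    )
    return min(1.0, total)
```

For the reliability analysis, `ratings` would hold one list of 0-3 scores per rater for a single GRBAS dimension; for a frequency comparison, the four arguments are the cell counts of the 2 × 2 table.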
Evaluation of voice quality
The patients had prescheduled follow-up visits at 2-month to 3-month intervals at the hospital otorhinolaryngology outpatient clinics. Voice samples were collected at baseline and 6 and 24 months after treatment. The larynx was examined with videolaryngostroboscopy, and the patients filled in questionnaires.
Expert-rated evaluation of voice was based on 3 sentences read aloud at a comfortable speech loudness level. The sentences included 17 words and lasted about 14.5 seconds. Voice samples were uploaded onto a compact disk in a random order. A total of 150 voice samples were analyzed by 3 trained speech and language therapists with expertise in voice quality evaluation who were blinded to the treatment groups. Voice quality was assessed on the GRBAS scale, consisting of grade (G), reflecting overall voice quality; roughness (R); breathiness (B); asthenia (A); and strain (S). Ratings of these 5 aspects of voice quality varied from 0 (normal) to 3 (extremely abnormal) (10). The higher the score, the more dysphonic the voice.
Interrater consistency was assessed by comparing the original ratings of all voice samples (150 items) between the raters. To assess intrarater consistency, the voice samples were randomized, after which the samples of 10 patients were randomly duplicated on a compact disk and reassessed by the raters.
The patients rated their own voice quality (hoarseness) and assessed its impact on daily life using a 100-mm long visual analogue scale (VAS) with end anchors “no degree of” and “high degree of” hoarseness and inconvenience.
The videolaryngostroboscopic findings were evaluated by a panel of 3 phoniatricians, who were blinded to the treatment groups and uninvolved with the treatments. Adduction and abduction movements of the vocal cords, the amplitude and the phase symmetry of the mucosal wave, glottal closure, and signs of vocal cord hyperfunction were each assessed on a scale from 0 (no pathology) to 3 (major pathology).
Results
Patients
Sixty patients entered the study between June 1998 and October 2008. Of these, 28 were randomly assigned to radiation therapy and 32 to TLS. Four patients were excluded from the analysis (1 was female; 3 withdrew consent), leaving 25 evaluable patients in the radiation therapy group and 31 in the TLS group (Fig. 1). The median age at study entry was 65 years (Table 1).
Expert-rated quality of voice
Before beginning the study treatments, the patients had generally more breathy and rough voices compared with the normal voice, but there was no significant difference between the groups. The mean scores in expert-rated overall voice quality (G), voice roughness (R), and strain (S) remained similar between the groups during follow-up, but patients treated with TLS had a more breathy voice than those who received radiation therapy (score 1.52 vs 0.28 2 years after treatment, P < .001) (Table 2). A statistically significant difference emerged also in asthenia (0.74 vs 0.11; P = .003), but in both groups the absolute value was under 1, suggesting limited clinical relevance of this finding. No significant change in overall voice quality (G) occurred during follow-up, but voice breathiness and asthenia improved significantly with time in the radiation therapy group (from 1.17 at baseline to 0.28 2 years after treatment, P < .001; and from 0.56 to 0.11, P = .001, respectively) but not in the TLS group. The degree of voice breathiness varied substantially. In the TLS group, 20 (74%) of the 27 evaluable patients had mildly or moderately breathy voice (score 1 or 2) 2 years after treatment, 2 (7%) patients had an extremely breathy
Statistical analysis
The primary analysis was intention-to-treat and involved all eligible patients who provided informed consent, were male, and had laryngeal cancer.
We found the voice quality data available in the literature to be in part conflicting and to offer little guidance for estimation of the study sample size. Therefore, we were not able to perform formal power calculations. We estimated
Fig. 1. The CONSORT (Consolidated Standards of Reporting Trials) flow diagram. Sixty patients with squamous cell carcinoma limited to 1 vocal cord (T1a) were randomized: 28 were assigned to radiation therapy (excluded: 2 withdrew consent, 1 was female; 25 treated) and 32 to laser surgery (excluded: 1 withdrew consent; 31 treated).
voice (score 3), and only 5 (19%) had no voice breathiness (score 0). These numbers contrast with the radiation therapy group, in which 6 (30%) of the 20 evaluable patients had a mildly or moderately breathy voice 2 years after radiation therapy, none had an extremely breathy voice, and 14 (70%) had no voice breathiness. Patients with tumor in the anterior part of the vocal cord had a more breathy voice when treated with TLS than did those treated with radiation therapy; in the TLS group the scores were 1.55 at baseline and 1.63 2 years after treatment, and in the radiation therapy group the scores were 1.38 and 0.66, respectively (P = .039). No significant difference emerged when cancer was located in the posterior part of the vocal cord.
Self-rated quality of voice and impact of hoarseness on activities of daily living
When the patients themselves rated hoarseness, voice quality was judged as similar between the groups (P = .144) (Table 3). The self-reported quality of voice improved significantly from the baseline quality during follow-up in each group (in the TLS group, the VAS score decreased from 59.0 to 43.1, P = .040; in the radiation therapy group, from 53.1 to 35.4, P = .026). Patients assigned to radiation therapy reported less impact of hoarseness on their daily living activities than did patients assigned to TLS (P = .007).
Videolaryngostroboscopic findings
In comparison with the radiation therapy group, patients assigned to TLS had less sufficient glottal function at videolaryngostroboscopy performed 2 years after study entry. They had higher scores for irregular glottal closure (P = .025), oval closure (P = .005), and incomplete glottal closure (P = .018).
Concordance of findings
Interrater consistency was good when voice grade, roughness, or breathiness was assessed (Cronbach α 0.88, 0.88, and 0.84, respectively) but weak for voice asthenia and strain (<0.70 for each). Intrarater consistency was excellent or good when voice grade, breathiness, and asthenia were rated (ranges, 0.90-0.95, 0.85-0.89, and 0.87-0.96), moderate for roughness (0.70-0.79), and weak for strain (<0.70). In general, expert-rated voice quality assessments, self-rated voice quality, and the stroboscopic findings showed good concordance. High scores for expert-rated breathiness were strongly associated with the degree of self-reported handicap in daily living (at 6-month assessment, r = 0.568, P = .001; at 24-month assessment, r = 0.623, P < .001) and self-rated hoarseness (at 6 months, r = 0.503, P = .003; at 24 months, r = 0.482, P = .005). Voice breathiness
Table 1. Characteristics of patients and tumors
| Characteristic | Laser surgery group (n = 31) | Radiation therapy group (n = 25) |
| --- | --- | --- |
| Median age, y (range) | 69.0 (46-83) | 61.0 (46-75) |
| Tumor histology (n, %): squamous cell carcinoma | 31 (100) | 25 (100) |
| Clinical stage (n, %): T1a | 31 (100) | 25 (100) |
| Tumor site on the vocal cord (n, %) | | |
| Anterior | 6 (19) | 7 (28) |
| Anterior-middle | 9 (29) | 6 (24) |
| Middle or posterior | 7 (22) | 4 (16) |
| Entire cord | 9 (29) | 8 (32) |
Table 2. Expert-rated voice quality
| Measurement | Transoral laser surgery: baseline | 6 months | 24 months | P* | Radiation therapy: baseline | 6 months | 24 months | P* | P† |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Grade | 1.61 (0.94) | 1.78 (0.16) | 1.61 (0.17) | .537 | 1.44 (0.92) | 1.56 (0.18) | 1.39 (0.19) | .614 | .967 |
| Roughness | 1.30 (0.88) | 1.13 (0.92) | 1.26 (0.86) | .699 | 1.22 (0.73) | 1.56 (0.92) | 1.39 (0.61) | .284 | .248 |
| Breathiness | 1.35 (0.89) | 1.48 (0.90) | 1.52 (0.95) | .617 | 1.17 (0.79) | 0.44 (0.62) | 0.28 (0.58) | < .001 | < .001 |
| Asthenia | 0.61 (0.66) | 0.75 (0.68) | 0.74 (0.69) | .599 | 0.56 (0.51) | 0.06 (0.24) | 0.11 (0.32) | .001 | .003 |
| Strain | 0.96 (0.71) | 0.83 (0.72) | 0.78 (0.80) | .532 | 0.83 (0.92) | 0.89 (0.68) | 1.06 (0.80) | .498 | .288 |
The data are mean values (standard deviations) of the grade, roughness, breathiness, asthenia, and strain (GRBAS) scale used to assess perceptual voice quality. Each aspect of voice quality was scored from 0 to 3, the higher scores indicating worse voice quality. The baseline voice sample was not available from 1 patient in the radiation therapy group, the 6-month sample from 7 patients in the laser surgery group and from 2 patients assigned to radiation therapy, and the 24-month sample from 4 patients treated with laser surgery and from 5 treated with radiation therapy.
* Change during the follow-up as compared with the baseline value within the treatment group (repeated-measures analysis of variance).
† Difference between the treatment groups (repeated-measures analysis of variance).
correlated with the presence and extent of an irregular vocal cord chink at stroboscopy (r = 0.511, P = .001), with an oval chink (r = 0.571, P < .001), and with otherwise incomplete glottal closure of the vocal cords (r = 0.565, P < .001).
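The coefficients reported in this section are Pearson r values (see Statistical analysis). A minimal Python sketch of that computation follows; the function name and the paired scores in the usage lines are hypothetical and are not study data.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two lists of the same length (at least 2 values)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations around the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("a constant list has no defined correlation")
    return sxy / math.sqrt(sxx * syy)


if __name__ == "__main__":
    # Hypothetical pairs: expert-rated breathiness (0-3) and self-rated VAS (0-100).
    breathiness = [0, 1, 1, 2, 3]
    vas_handicap = [5, 20, 35, 50, 80]
    print(round(pearson_r(breathiness, vas_handicap), 3))
```

Because the GRBAS scores are ordinal with only four levels, a rank-based coefficient would be a reasonable alternative; the sketch follows the Pearson choice stated in the Statistical analysis section.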
Cancer recurrence and survival
Fifteen (48%) of the 31 patients assigned to TLS underwent 1 or more unscheduled laryngomicroscopies after treatment as compared with 6 (24%) of the 25 patients assigned to radiation therapy (P = .077; the data were missing for 1 patient assigned to radiation therapy). Three (10%) patients in the TLS group and 3 (12%) in the radiation therapy group had cancer recurrence during the 2-year follow-up period. One patient in the TLS group received a diagnosis of metastatic laryngeal cancer and died, and 1 had a second primary tumor in the contralateral vocal cord 1.5 years after entry into the study. None of the patients allocated to radiation therapy experienced metastases.
Discussion
TLS and radiation therapy for early glottal carcinoma have been investigated in numerous prospective and retrospective cohort studies, but no randomized studies have been reported (1, 2). We found in this randomized study that the overall voice quality achieved was roughly similar after the treatments, but patients treated with radiation therapy had less breathy voice, maintained better glottal closure, and experienced less inconvenience in their daily lives from their voice quality than those treated with TLS.
In line with the absence of randomized trials in the literature, we found conducting the current trial challenging, with accrual lasting for 10 years. We estimate that 80% of the eligible patients did not enter the study in the participating centers, creating a possibility for selection bias (6). We did our best to improve accrual, but the accrual rate remained low. We persisted with the study because we believe that a randomized trial, even a slowly accruing trial, allows more reliable evaluation of the treatments than nonrandomized cohort studies.
A recent Cochrane review concluded that reliable conclusions regarding the quality of voice achieved with TLS and radiation therapy cannot be made in the absence of randomized trials (2). Several cohort studies conclude that radiation therapy leads to better voice quality, whereas others suggest that the overall voice quality is similar (2). Voice quality after TLS depends on several factors such as the surgeon’s experience and skill, tumor size, and site. We attempted to account for such factors by limiting the study to large referral hospitals and to small tumors.
Voice breathiness (B) differed most clearly between the groups. Notably, breathiness improved after radiation therapy over the 2-year observation period, whereas no
Table 3. Self-rated voice quality
| Measurement | Transoral laser surgery: baseline | 6 months | 24 months | P* | Radiation therapy: baseline | 6 months | 24 months | P* | P† |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Hoarseness | 59.0 (19.0) | 50.7 (28.9) | 43.1 (27.1) | .040 | 53.1 (22.0) | 34.1 (24.3) | 35.4 (26.7) | .026 | .144 |
| Impact on everyday life | 44.6 (26.5) | 31.4 (25.9) | 32.4 (25.3) | .089 | 32.1 (25.7) | 14.4 (18.8) | 8.40 (9.3) | .001 | .007 |
The data are mean values (standard deviations) evaluated on a visual analogue scale (VAS) scored from 0 to 100. The higher scores indicate worse subjective impression of the quality of voice. The baseline VAS was missing from 5 patients in the laser surgery group and from 1 patient in the radiation therapy group, and the 6-month sample from 7 and 3 patients, and the 24-month sample from 6 and 8 patients, respectively.
* Change during the follow-up as compared with the baseline value within the treatment group (repeated-measures analysis of variance).
† Difference between the treatment groups (repeated-measures analysis of variance).
improvement in any of the 5 voice quality measures of the GRBAS scale occurred in the TLS group, suggesting that the vocal cord defect caused by carcinoma and TLS frequently causes long-lasting voice impairment. Yet, individual compensation is an important factor contributing to final voice quality, and it may sometimes lead to an excellent voice (11). Rehabilitative speech therapy facilitating functional compensation should therefore be considered.
Self-rated voice quality and the level of perceived inconvenience are important to evaluate (12, 13) because they relate to occupational and social demands. A meta-analysis of 6 studies in patients with T1 glottal cancer concluded that TLS with CO2 laser and external beam radiation therapy resulted in similar levels of voice handicap (14). The voice handicap has been considered mild (15) and comparable to that caused by benign glottal tumors (14, 16). Only 16 (29%) of the patients were employed at the time of entry into the study, and one could argue that for most patients a mildly or moderately breathy voice may not cause a substantial handicap, but the patient is likely the best judge.
This trial has some limitations. A larger series could have identified further differences between the treatments. Yet, statistically significant differences emerged, suggesting that the trial was not underpowered with regard to the main endpoint despite its limited size. The study was not powered to compare recurrence-free or overall survival between the treatment groups. Such endpoints require a much larger study because of the generally good prognosis of small glottal carcinomas. Voice quality might change after the first 2 years of follow-up, and a longer follow-up time may be warranted. When the study began, the Voice Handicap Index (VHI) had not yet been validated in the Finnish language, and thus it could not serve as a subjective voice evaluation instrument. Female patients were excluded because of the infrequency of laryngeal cancer in women and to achieve a more homogenous study population, and therefore the results may not be applicable to female patients with laryngeal cancer. Smoking habits may influence the quality of voice, but these data were insufficient for analysis.
In conclusion, radiation therapy results in less breathy voice than TLS, but the overall voice quality was similar. Radiation therapy may be the treatment of choice when the requirements for the voice quality are demanding. An anterior tumor location in the vocal cord is associated with a breathy voice when cancer is treated with TLS. TLS has the advantage of being completed within 1 day, which may also influence patient preference. A larger study is warranted to compare the effects on survival.
References
1. Abdurehim Y, Hua Z, Yasin Y, et al. Transoral laser surgery versus radiotherapy: Systematic review and meta-analysis for treatment options of T1a glottic cancer. Head Neck 2012;34:23-33.
2. Dey P, Arnold D, Wight R, et al. Radiotherapy versus open surgery versus endolaryngeal surgery (with or without laser) for early laryngeal squamous cell cancer. Cochrane Database Syst Rev 2010;2:CD002027.
3. Kanonier G, Fritsch E, Rainer T, et al. Radiotherapy in early glottic carcinoma. Ann Otol Rhinol Laryngol 1996;105:759-763.
4. Steiner W. Radiotherapy in early glottic carcinoma. Am J Otolaryngol 1993;14:116-121.
5. International Union Against Cancer on Cancer. In: Sobin L, Gospodarowicz M, Wittekind C, editors. TNM classification of malignant tumours. 7th ed. Chichester: Wiley-Blackwell; 2010.
6. Jemal A, Bray F, Center MM, et al. Global cancer statistics. CA Cancer J Clin 2011;61:69-90.
7. International Commission on Radiation Units and Measurements Report 50: Prescribing, recording and reporting photon beam therapy. Washington, DC: International Commission on Radiation Units and Measurements; 1993.
8. Strong MS. Laser excision of carcinoma of the larynx. Laryngoscope 1975;85:1286-1289.
9. Remacle M, Eckel HE, Antonelli A, et al. Endoscopic cordectomy. A proposal for a classification by the Working Committee, European Laryngological Society. Eur Arch Otorhinolaryngol 2000;257:227-231.
10. Hirano M. Clinical examination of voice. Wien, Austria: Springer Verlag; 1981.
11. Galletti B, Freni F, Cammaroto G, et al. Vocal outcome after CO2 laser cordectomy performed on patients affected by early glottic carcinoma. J Voice 2012;26:801-805.
12. Hogikyan ND, Rosen CA. A review of outcome measurements for voice disorders. Arch Otolaryngol Head Neck Surg 2002;126:562-572.
13. Carding P. Evaluating voice therapy: Measuring the effectiveness of treatment. London: Whurr Publishers; 2000.
14. Cohen SM, Garrett G, Dupont WD, et al. Voice-related quality of life in T1 glottic cancer: Irradiation versus endoscopic excision. Ann Otol Rhinol Laryngol 2006;115:581-586.
15. Núñez Batalla F, Caminero Cueva MJ, Señaris González B, et al. Voice quality after endoscopic laser surgery and radiotherapy for early glottic cancer: Objective measurements emphasizing the Voice Handicap Index. Eur Arch Otorhinolaryngol 2008;265:543-548.
16. Van Gogh CD, Mahieu HF, Kuik DJ, et al. Voice in early glottic cancer compared to benign voice pathology. Eur Arch Otorhinolaryngol 2007;264:1033-1038.
The Laryngoscope © 2016 The American Laryngological, Rhinological and Otological Society, Inc.
Does Narrow Band Imaging Improve Preoperative Detection of Glottic Malignancy? A Matched Comparison Study
Hagit Shoffel-Havakuk, MD; Yonatan Lahav, MD; Barak Meidan, MD; Yaara Haimovich, BSc; Meir Warman, MD; Moshe Hain, MD; Yaniv Hamzany, MD; Alexander Brodsky, MD; Tali Landau-Zemer, MD; Doron Halperin, MD
Objectives/Hypothesis: The primary suspicion for glottic malignancy during office laryngoendoscopy is based on lesion appearance. Previous studies investigating laryngeal use of narrow band imaging (NBI) are mostly descriptive. The additive value of NBI relative to white light (WL) requires further investigation.
Study Design: Observational matched study.
Methods: NBI was compared with WL images of 45 vocal fold lesions suspected for malignancy (21 carcinoma, 22 dysplasia, two benign). All images were presented randomly and evaluated by six independent otolaryngology specialists. The observers were asked to estimate lesion size, location, and pathology. The results for the two imaging modalities were compared with each other and with the final pathology.
Results: The observers estimated lesion size to be larger in the NBI images by an average of 9% (2.4 mm²; P = .04) compared to WL. In 64.6% of cases, the observers estimated similar pathology for NBI and WL. When there was a discrepancy, the estimated pathology was “malignant” in 24.3% by NBI, compared with 11.1% by WL. Overall, 44.7% of the lesions were estimated to be malignant by NBI, compared with 33.8% by WL (P = .001). The sensitivity and specificity rates for malignancy detection by NBI were 58.6% and 61.2%, respectively, compared to 48.7% and 76.1% by WL.
Conclusions: Observers tend to estimate vocal fold lesions to be larger and more frequently suspect malignancy while assessing NBI images. Compared with WL, NBI demonstrates increased sensitivity and decreased specificity for detection of malignancy. Nevertheless, the specificity and sensitivity of NBI alone are considerably low.
Key Words: Laryngoscopy, narrow band imaging, early glottic cancer, vascularization, larynx.
Level of Evidence: 4
Laryngoscope, 127:894–899, 2017
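The sensitivity and specificity quoted in the abstract are proportions taken against final pathology. As a sketch of the standard definitions (not the authors' analysis code), the Python function below computes both from the four cells of a 2 × 2 table; the function name and the counts in the usage lines are hypothetical and do not come from the study.

```python
def sensitivity_specificity(true_pos, false_neg, true_neg, false_pos):
    """Return (sensitivity, specificity) from the four cells of a 2 x 2 table.

    Sensitivity = malignant lesions rated malignant / all malignant lesions.
    Specificity = nonmalignant lesions rated nonmalignant / all nonmalignant lesions.
    """
    malignant = true_pos + false_neg
    nonmalignant = true_neg + false_pos
    if malignant == 0 or nonmalignant == 0:
        raise ValueError("both pathology classes must be present")
    return true_pos / malignant, true_neg / nonmalignant


if __name__ == "__main__":
    # Hypothetical counts for illustration only (not data from this study).
    sens, spec = sensitivity_specificity(true_pos=30, false_neg=20,
                                         true_neg=45, false_pos=5)
    print(f"sensitivity={sens:.1%}, specificity={spec:.1%}")
```

The two measures trade off against each other: an observer who calls more lesions malignant raises sensitivity and lowers specificity, which is the pattern the abstract reports for NBI relative to WL.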
INTRODUCTION Malignant glottic lesions often express typical fea tures on initial office laryngoscopy, suggesting further workup and treatment. However, histopathology of a formalin-fixed tissue sample remains the gold standard for final diagnosis. Management of glottic lesions suspected of From the Department of Otolaryngology–Head and Neck Surgery, Kaplan Medical Center, Rehovot ( H . S .- H ., Y . L ., Y . HAIMOVICH , M . W ., D . H .); Hadassah Medical School, Hebrew University, Jerusalem( H . S .- H ., Y . L ., B . M ., M . W ., T . L .- Z ., D . H .); Schneider Children’s Medical Center, Petah Tikva ( M . H .); Sackler Faculty of Medicine, Tel Aviv University, Tel Aviv ( M . H ., Y . H ); Department of Otolaryngology–Head and Neck Surgery, Rabin Med ical Center, Petah Tikva ( Y . H ); Department of Otolaryngology–Head and Neck Surgery, Bnai Zion Medical Center, Haifa ( A . B .); Rappaport Faculty of Medicine, Technion–Israel Institute of Technology, Haifa ( A . B .); and Department of Otolaryngology, Head and Neck Surgery, Hadassah Medi cal Center, Jerusalem ( T . L .- Z .), Israel. Editor’s Note: This Manuscript was accepted for publication on August 1, 2016. Presented as a podium presentation at the American Laryngologi cal Association annual meeting at the Combined Otolaryngological Spring Meetings, Chicago, Illinois, U.S.A., May 18–21, 2016. H . S .- H . and Y . L . contributed equally to this work. The authors have no funding, financial relationships, or conflicts of interest to disclose. Send correspondence to Yonatan Lahav, MD, Department of Oto laryngology–Head and Neck Surgery, Kaplan Medical Center, Pasternak St. P.O.B 1, Rehovot, 76100 Israel. E-mail: yonatan.lahav@gmail.com
malignancy should balance voice preservation and func tionality, along with the need for sufficient tissue sampling. This dilemma results in clinicians’ continuous pursuit of a reliable, noninvasive tool to detect true malignancies and avoid unnecessary biopsies. For that reason, biologic endoscopy techniques attract clinicians’ attention. The term biologic endoscopy encompasses an array of diagnos tic tools including toluidine blue staining, autofluores cence, and confocal microendoscopy. 1 In the past decade, new biologic endoscopy techni ques using optical filters and amplifications, such as nar row band imaging (NBI), have become more common and broadly investigated. As opposed to other biologic endosco py techniques, NBI does not focus on biological properties of the neoplasm itself, but highlights its vascularization. 2 NBI was primarily introduced in gastrointestinal (GI) endoscopy, 3–8 and its use was later extended to other med ical specialties as otolaryngology. 2 NBI applies filters that narrow the frequency range of light into bands of blue ( 415 nm) and green ( 540 nm). The blue and green lights enhance visualization of mucosal and submucosal microvascularization, based on their absorption by hemo globin and the different depth of penetration of different light wavelengths. 2 Studies of the GI system established the notion that morphological changes of intraepithelial
DOI: 10.1002/lary.26263
Laryngoscope 127: April 2017
Shoffel-Havakuk et al.: NBI Preoperative Detection of Malignancy
15314995, 2017, 4, Downloaded from https://onlinelibrary.wiley.com/doi/10.1002/lary.26263 by Karuna Dewan - Ochsner Medical Foundation - New Orleans - USA , Wiley Online Library on [21/08/2023]. See the Terms and Conditions (https://onlinelibrary.wiley.com/terms-and-conditions) on Wiley Online Library for rules of use; OA articles are governed by the applicable Creative Commons License
papillary capillary loops (IPCL) are well demonstrated by NBI and may correlate with depth of invasion of superficial mucosal carcinoma. 7,8 Gradually, NBI became a valuable tool in the early detection of esophageal 3,4 and colon 5,6 cancer.

NBI has only recently been introduced to laryngeal endoscopy, and has gained popularity in the identification and follow-up of papillomatosis 9,10 as well as malignant and premalignant lesions. 11–22 Early studies investigating the use of NBI in the larynx emerged at the beginning of the present decade. 11–14 These studies were mostly descriptive, trying to define the characteristics that differentiate malignant from benign laryngeal lesions. In 2011, Ni et al. applied IPCL classification to 104 suspected precancerous or cancerous laryngeal lesions. The authors concluded that the IPCL classification using NBI was closely associated with the histological findings: whereas type I to IV lesions were considered nonmalignant, type V lesions correlated with malignancy. IPCL types Vb and Vc, which represent irregular, tortuous, linelike IPCLs, had higher sensitivity and specificity for invasive carcinoma. 11

The validation of laryngeal NBI is better established in the operating room (OR); Piazza et al. examined a series of patients with preoperative flexible NBI and showed 61% sensitivity and 87% specificity for detection of malignancy. However, intraoperative high-definition television (HDTV) NBI in the same patients resulted in higher rates of both sensitivity and specificity: 98% and 90%, respectively. 12 Consequently, the use of intraoperative HDTV NBI was later suggested to reduce the incidence of positive superficial surgical margins during transoral laser microsurgery (TLM). 15 The OR setting facilitates the use of HDTV with rigid close-up endoscopy, and therefore maximizes NBI’s benefits.
Nevertheless, the additive value of NBI in the more challenging setting of an office examination in awake patients has not yet been confirmed. In our study, we aimed to investigate and further delineate the role of preoperative NBI assessment in the office and in the decision-making process for suspected lesions. The study attempts to evaluate differences in the estimation of lesions’ size and pathology using NBI, compared to white light (WL).

MATERIALS AND METHODS
This was an observational matched study that compared the interpretation of NBI and WL images of vocal fold (VF) lesions suspected of malignancy by otolaryngology specialists. The study was approved by the institutional ethics committee.

Forty-five VF lesions from 36 patients were included in the study. All patients were examined with both WL and NBI imaging modalities using flexible endoscopy with an ENF-V2 digital video rhinolaryngoscope (Olympus Medical System Corporation, Tokyo, Japan) in our referral center’s voice clinic between 2013 and 2015. The examination revealed unilateral or bilateral VF lesions suspected of malignancy. Exclusion criteria were: 1) the patient was not examined by both WL and NBI; 2) the WL or NBI images were of degraded quality; and 3) WL and NBI images of similar quality, distance, and angle could not be obtained. All lesions included in the study were
biopsied; 21 were invasive carcinoma, 22 were mild or moderate dysplasia, and two lesions were benign.

Two images, one WL and the other NBI, were selected for each lesion. The selected images demonstrated the lesion clearly along with specific anatomical landmarks such as the anterior commissure and the vocal process. Moreover, the two images were comparable regarding their quality and their distance and angle from the VF.

All images were presented randomly and evaluated by six otolaryngology specialists (the observers). Each image was presented on a personal computer monitor using PowerPoint 2007 (Microsoft Corporation, Redmond, WA). Each slide contained one image of the lesion (WL or NBI) along with three tasks for the observers: 1) to draw the lesion’s margins with a pen on a schematic VF image, 2) to estimate the lesion’s size by a multiple choice question, and 3) to estimate the lesion’s pathology by a multiple choice question.

The analysis of the first task (drawing the lesion on the schematic image) was done using a dense grid system and manual measurements to calculate the area of the drawn lesion and the distance of its margins from the vocal process and the anterior commissure. Assuming the length of the VF in the schematic image was 18 mm, each square of the grid system in the schematic image represented 0.46 × 0.46 mm². Samples for completion of this task can be found in Figure 1.

In the second task, the observers answered the following multiple choice question: “What is the lesion’s size: less than one-third of the VF area; between one-third and two-thirds; or over two-thirds?” To reach the answer, the observers were instructed to consider the traditional segmentation of the VF into thirds, assuming the anterior two-thirds are the membranous portion of the VF, and the posterior third is the cartilaginous portion.
In the third task, the observers answered the following multiple choice question: “What is the lesion’s probable pathology diagnosis: nodule/cyst/polyp, papillomatosis, benign keratosis, dysplasia, invasive carcinoma, other, or unknown?” The observers were directed to avoid the answer “unknown” when possible. This question was also used to calculate the sensitivity, specificity, and positive and negative predictive values for detection of malignancy.

The images were presented to the observers in three separate sessions. The first session included a simulation of the questionnaire. The objective of this session was to familiarize the observers with NBI images and with the study’s methods. Data from this session were not included in the analysis. In the second session, the 45 lesions were presented to the observers randomly, either by WL or by NBI. In the third session, the 45 lesions were presented randomly again, this time by the alternate imaging modality (if a lesion was presented by WL in the second session, it was presented by NBI in the third session and vice versa). Altogether, each of the six observers viewed 90 images of 45 lesions (one WL and one NBI) in two sessions; in each session only one image of each lesion was presented. The observers were given unlimited time to complete the tasks. However, a break of at least 30 minutes was required between sessions.

The results of the two imaging modalities were compared with each other and with the final pathology. To test the association between two categorical variables, the χ² test was used. A paired t test was applied to test differences for quantitative variables. All tests applied were two-tailed, and a probability value of 5% or less was considered statistically significant. The kappa coefficient was calculated to assess agreement of categorical variables, and the intraclass correlation coefficient (ICC) was used to calculate the agreement of quantitative variables.
Statistical analyses were performed using SPSS Statistics version 20.0 (IBM, Armonk, NY).
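The sensitivity, specificity, and predictive values described above follow from a 2 × 2 table of observer calls ("invasive carcinoma" or not) against final pathology. The Python sketch below illustrates that arithmetic only; the function name and the counts in the usage lines are hypothetical and are not taken from the study data, and the sketch assumes that no row or column total is zero.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Return sensitivity, specificity, PPV, and NPV from 2x2 counts.

    tp: carcinoma on pathology, called carcinoma by the observer
    fp: not carcinoma on pathology, called carcinoma
    fn: carcinoma on pathology, not called carcinoma
    tn: not carcinoma on pathology, not called carcinoma
    """
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


# Hypothetical counts for illustration only (not study data).
metrics = diagnostic_metrics(tp=60, fp=40, fn=40, tn=60)
for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

Pooling all observer readings into one table, as assumed here, treats repeated readings of the same lesion as independent; an analysis per observer would apply the same function to each observer's counts.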
Fig. 1. The following figures consist of representative images from our study. Images from left to right in each panel: 1) the white light (WL) image of the lesion that was presented to the observers; 2) the narrow band imaging (NBI) image of the same lesion that was presented to the observers; 3) a drawing of the lesion (red line) by one of the observers on a schematic vocal fold (VF) image as seen with WL; and 4) the drawing of the lesion (red line) as seen with NBI, by the same observer. Each panel presents drawings by a different observer. (A) Invasive squamous cell carcinoma. WL laryngoscopy revealed an exophytic lesion on the right VF. NBI enhanced irregular intraepithelial papillary capillary loops (IPCLs) and blue to brown dots. The observer drew the lesion’s perimeters as more extended by NBI relative to WL. (B) Microinvasive squamous cell carcinoma. WL revealed a lesion encompassing the left VF along with granulation tissue and fibrin. NBI enhanced an irregular pattern of IPCLs with blue to brown dots. The observer drew the lesion as a continuous lesion with extended perimeters by NBI compared to the drawing by WL. (C) Bilateral mild dysplasia. WL revealed bilateral superficial spread of keratin. NBI enhanced the keratin in contrast to the irregular mucosal vascularization. The observer drew the right VF lesion’s perimeters as more extended by NBI.
RESULTS Estimation of Lesion Size and Location
When the observers were asked to draw the lesions’ margins on a VF schematic image, the lesion area with NBI was larger than that of WL by an average of 2.4 mm² (95% confidence interval [CI] = 0.8–4.1). This accounts for a 9% increase in the estimated area. The average area of the lesions was larger using NBI compared with WL: 29.9 ± 18.9 mm² versus 28.1 ± 19.3 mm², respectively (P = .04). Analyzing the results for each observer separately, the differences in the estimated area demonstrated an increase using NBI in five of the six observers (Table I). A substantial difference between the observers’ area estimations (e.g., 38.7 mm² and 18.8 mm² in WL lesion area of observers 1 and 6, respectively) was demonstrated as a result of subjective interpretations and the tendencies of the different clinicians. ICC was used to validate the agreement for this tendency among the observers. The ratio between the lesion area using NBI and WL images was calculated for each observer, and the agreement between the observers using ICC was 0.832 (95% CI = 0.7–0.918).

In the drawings, the lesion margins tended to be closer to the vocal process and the anterior commissure when drawn by NBI. The drawn lesions were significantly closer to the vocal process using NBI compared to WL by an average of 0.6 mm (P = .002; Table II).

The observers were asked to estimate the lesion size by assigning the lesion to one of three size categories. Using WL, 27.8% were estimated to be less than one-third
TABLE I. Lesion Area by NBI and WL Drawn on a Schematic Vocal Fold Image.

Observer        Average Lesion Area     Average Lesion Area     Difference (95% CI)       P
                by WL ± SD, mm²         by NBI ± SD, mm²
1               38.7 ± 25.5             36.9 ± 21.2             −0.23 (+4.82, −5.29)      .927
2               41.1 ± 21.5             44.5 ± 21.3             +4.52 (+8.96, +0.86)      .046
3               19.4 ± 11.6             21.4 ± 12.5             +1.97 (+5.51, −1.55)      .266
4               22.6 ± 15.6             25.8 ± 18.3             +3.14 (+6.72, −0.44)      .84
5               28.0 ± 13.1             31.1 ± 15.5             +2.86 (+8.54, −2.82)      .312
6               18.8 ± 10.0             20.8 ± 9.8              +2.16 (+5.17, −0.83)      .153
Total average   28.1 ± 19.3             29.9 ± 18.9             +2.41 (+4.05, +0.76)      .04

Difference = (area by NBI) − (area by WL). Probability value was calculated by paired t test. CI = confidence interval; NBI = narrow band imaging; SD = standard deviation; WL = white light.
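The areas in Table I come from the grid measurement described in the Methods, where the schematic VF is taken to be 18 mm long and each grid square represents 0.46 × 0.46 mm². The Python sketch below shows one way such a conversion could be carried out. The function names and the idea of counting whole grid squares are assumptions made for illustration; the study describes the measurements only as manual.

```python
SQUARE_SIDE_MM = 0.46  # side of one grid square, as stated in the Methods


def squares_to_area_mm2(n_squares):
    """Area in mm^2 covered by a number of grid squares (hypothetical helper)."""
    return n_squares * SQUARE_SIDE_MM ** 2


def squares_to_length_mm(n_squares):
    """Length in mm spanned by a run of grid squares (hypothetical helper)."""
    return n_squares * SQUARE_SIDE_MM


# Illustrative values only: a drawing covering 100 squares whose margin
# lies 5 squares from the vocal process.
area = squares_to_area_mm2(100)
distance = squares_to_length_mm(5)
print(f"area = {area:.2f} mm^2, distance = {distance:.2f} mm")
```

At this scale one square covers about 0.21 mm², so the reported average difference of 2.4 mm² between NBI and WL corresponds to roughly 11 grid squares.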
of the VF size, 45.1% were estimated between one-third and two-thirds, and 27.1% were estimated to be more than two-thirds. With NBI, the estimations were 21.1%, 45.1%, and 33.8%, respectively. The kappa value for the agreement between the WL and NBI size estimations was 0.49. Most of the lesions (66.9%) were estimated by the observers to be in the same size category whether presented by NBI or by WL; 22.6% of the lesions were estimated to be larger by NBI, compared to 10.5% that were estimated as larger by WL. In cases of disagreement, the McNemar test showed a significant trend toward estimating the lesions as larger by NBI (P = .007).

Estimation of Lesions’ Pathology
When asked to estimate the pathological diagnosis by a multiple choice question, the observers tended to assess more lesions as “invasive carcinoma” when using NBI relative to WL: 44.7% versus 33.8%, respectively (P = .001). Analyzing the results for each observer separately, this tendency was also demonstrated in five of the six observers. In most of the cases (64.6%), there was agreement regarding lesion malignancy whether NBI or WL was presented. When there was a discrepancy, the estimated pathology was “invasive carcinoma” in 24.3% for NBI, compared with 11.1% for WL (P = .034).

The 21 lesions with final pathology reports of “invasive carcinoma” were presented to the six observers (a total of 126 lesion presentations). NBI indicated “invasive carcinoma” in 53.9% of lesions compared to 45.2% when using WL (P = .166). In the remaining 24 lesions, the final pathology was not “invasive carcinoma” (altogether 144 lesion presentations). In 37.5% of cases, the lesions were mistakenly estimated to be invasive carcinoma using NBI, compared with 22.9% by WL (P = .007). Regarding recognition of carcinoma, NBI was more sensitive compared to WL: 58.6% and 48.7%, respectively. However, NBI was less specific: 61.2% compared with 76.1%, respectively. The performance measurements for identification of carcinoma by NBI and WL are summarized in Table III.

DISCUSSION
Under NBI conditions, premalignant or malignant glottic lesions appeared larger than under WL. This study is the first to highlight this phenomenon, documented by five of six independent blinded otolaryngologists. Moreover, NBI raised more suspicion of malignancy and had a higher sensitivity than WL.

There are two possible explanations for lesions to be perceived as larger in NBI. First, the lesions are truly larger. NBI enhances vascular abnormalities and keratin. When these changes are vague, as in the tumor perimeter, the human eye may not detect them under WL imaging but can clearly see them by NBI. Therefore, it is possible that NBI enables us to see the true dimensions and boundaries of a lesion, which is larger than previously thought. Second, there might be some size overestimation related to optical illusions. Because the human brain is not used to this new modality and its different colors, it perceives and estimates the lesion to be larger than it really is. As the authors of this study, we believe that the major contribution to the increase in the estimated size using NBI is related to an improved detection of subtle changes at the periphery of the lesions. Further studies are needed to resolve this subject.
TABLE II. Average Distance From Anatomic Locations by NBI and WL Lesion Drawings on a Schematic Vocal Fold Image.

Anatomic Location     Lesion Distance       Lesion Distance       Difference (95% CI)     P
                      by WL ± SD, mm        by NBI ± SD, mm
Vocal process         2.9 ± 3.1             2.4 ± 3.3             0.6 (0.2, 1.0)          0.002
Anterior commissure   1.5 ± 2.8             1.3 ± 2.7             0.2 (−0.05, 0.5)        0.104

Difference = (distance by WL) − (distance by NBI). Probability value was calculated by paired t test. CI = confidence interval; NBI = narrow band imaging; SD = standard deviation; WL = white light.
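The probability values in Tables I and II were calculated by paired t test on matched WL and NBI measurements. As a minimal sketch of that comparison, the Python function below computes the mean paired difference, the t statistic, and the degrees of freedom using only the standard library. The function name and the four toy distances are hypothetical and are not study data; the P value would then be read from a t distribution with the returned degrees of freedom, as a statistics package such as the SPSS software used in the study does internally.

```python
import math
import statistics


def paired_t(wl_values, nbi_values):
    """Return (mean difference, t statistic, degrees of freedom).

    Differences are taken as WL minus NBI, matching the definition
    used in the footnote of Table II.
    """
    if len(wl_values) != len(nbi_values) or len(wl_values) < 2:
        raise ValueError("need two equally long samples with at least 2 pairs")
    diffs = [w - n for w, n in zip(wl_values, nbi_values)]
    mean_diff = statistics.mean(diffs)
    # Standard error of the mean difference (sample SD uses n - 1).
    std_err = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return mean_diff, mean_diff / std_err, len(diffs) - 1


# Hypothetical distances in mm from the vocal process (not study data).
wl = [3.0, 2.5, 4.0, 1.0]
nbi = [2.5, 2.0, 3.0, 1.0]
mean_diff, t_stat, df = paired_t(wl, nbi)
print(f"mean difference = {mean_diff:.2f} mm, t = {t_stat:.3f}, df = {df}")
```

A positive mean difference here means the margin was drawn closer to the landmark under NBI than under WL. The sketch assumes the paired differences are not all identical, since a zero standard deviation would make the t statistic undefined.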