AUCTORES
Case Report | DOI: https://doi.org/10.31579/2692-9562/085
1 Department of Speech Therapy, School of Paramedical Sciences, Mashhad University of Medical Sciences, Mashhad, Iran.
2 Sinuses and Surgical Endoscopic Research Center, Mashhad University of Medical Sciences, Mashhad, Iran.
3 Musculoskeletal Rehabilitation Research Center, Ahvaz Jundishapur University of Medical Sciences, Ahvaz, Iran.
*Corresponding Author: Zahra Ghayoumi-Anaraki, Department of Speech Therapy, School of Paramedical Sciences, Mashhad University of Medical Sciences, Mashhad, Iran.
Citation: Zahra Ghayoumi-Anaraki, Azadeh Abedinzadeh, Ehsan Khadivi, Negin Moradi (2023), Vocal Rehabilitation after Provox Voice Prosthesis: A Longitudinal Case Report, Journal of Clinical Otorhinolaryngology, 5(4); DOI:10.31579/2692-9562/085
Copyright: © 2023, Zahra Ghayoumi-Anaraki. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited
Received: 02 February 2023 | Accepted: 20 March 2023 | Published: 30 March 2023
Keywords: laryngectomy; tracheoesophageal fistula; voice disorders; speech therapy
One of the most obvious consequences of total laryngectomy is the loss of voice. The Provox voice prosthesis is now the most commonly used prosthesis for voice rehabilitation. Nevertheless, few studies have examined the outcomes of voice therapy in patients who use Provox.
In this prospective case study, multiple measures (acoustic, aerodynamic, and prosodic) were used to assess changes in vocal function following a voice therapy program. The subject is a seventy-year-old man who was referred to an ENT specialist because of five months of hoarseness. Ten days after the operation, he was still unable to phonate, so voice therapy was recommended. Acoustic, aerodynamic, and prosodic parameters and voice-related quality of life were analyzed in the first, eighth, and follow-up voice therapy sessions.
Improvement was seen in all parameters except jitter. Our results indicate that structured voice rehabilitation following placement of a Provox prosthesis helps laryngectomees improve their voice.
The prevalence of laryngeal cancer is estimated at 14.33 cases/year per 100,000 people [1, 2]. Tobacco use, predominantly cigarette smoking, has been classified as the major risk factor for cancer of the glottic region [3]. Men are affected approximately seven times more often than women, and most cases arise in people aged between 50 and 60 years [4]. Despite the introduction of organ preservation protocols, total laryngectomy (TL) remains one of the most common operations for advanced laryngeal cancer [5]. One of the most obvious consequences of total laryngectomy is the loss of the natural voice [6]. Language conveyed through speech is a fundamental characteristic of human communication [7]. An altered vocal quality affects not only the audible vocal sound associated with a person’s individuality but also the functional and psychological facets of vocal communication, which for many patients leads to an unsettled social life [8].
Voice restoration after laryngectomy is a vital and challenging goal for head and neck surgeons and speech pathologists [1]. Three principal options are currently available for voice restoration after total laryngectomy: esophageal speech (ES), electrolarynx speech, and tracheoesophageal voice prosthesis (TEP) [9]. The use of a voice prosthesis following a tracheoesophageal puncture procedure is the gold standard in voice rehabilitation after laryngectomy [10, 11]. The technique involves creating a puncture connecting the trachea and esophagus, called a tracheoesophageal puncture [12]. The voice prosthesis, a one-way valve, is inserted into this puncture. The prosthesis lets air pushed up from the lungs pass from the trachea into the esophagus, which causes the esophageal walls (the neoglottis) to vibrate and produce a new voice. In tracheoesophageal speech, as in normal speech, pulmonary air is used for voice production [13].
The Provox voice prosthesis is now the most commonly used prosthesis for voice rehabilitation [14, 15]. Nevertheless, in a system in which an artificial device (the prosthesis) substitutes for the sphincteric action of the larynx, effective rehabilitation depends on adaptive modifications of the vocal tract and respiratory function, which ought to be assessed and addressed primarily during speech therapy [16]. Tracheoesophageal (TE) speech is considered more adequate and closer to normal than esophageal speech and electrolarynx speech. However, it often shows low audibility and intelligibility, which makes communication a challenge for patients [17]. Training is therefore needed to bring voice quality closer to normal.
Nevertheless, few studies have examined the outcomes of voice therapy in tracheoesophageal (TE) speech with a voice prosthesis. Terada et al., in a retrospective study, analyzed the effectiveness of the Provox2 voice prosthesis for voice rehabilitation following total laryngectomy. They found that 29 of 32 subjects regained their voices [18]. Kazi et al. compared perceptual assessment of voice, acoustic parameters, and quality of life in 10 female and 10 male total laryngectomy patients with those of 10 normal female speakers. They concluded that all the acoustic parameters and GRBAS ratings of the female laryngectomy patients were significantly worse than those of the normal subjects [13].
In another study, Deore et al. compared the acoustic features of 30 post-laryngectomy patients who used tracheoesophageal (TE) prosthetic valves. They reported poorer values and larger variability in all voice parameters for total laryngectomy patients using TE voice compared with normal subjects, which underscores the need for speech therapy programs [10]. In another study, 80% of patients reported good speech after using the prosthesis [19]. In a prospective nonrandomized cross-sectional study, researchers used GRBAS perceptual assessment to evaluate the outcome of voice rehabilitation in people who used a Provox prosthesis. Twenty-one of 30 subjects had a good voice on the GRBAS scale after one month of speech therapy sessions [1]. Although some studies have used structured voice rehabilitation after radiotherapy for participants with laryngeal cancer, to the best of our knowledge none of the studies mentioned above reported a structured voice therapy program.
In summary, investigations into the effectiveness of structured voice therapy in patients after laryngectomy and tracheoesophageal puncture are rare. In this prospective case study, therefore, multiple measures (acoustic, aerodynamic, and prosodic) were used to assess changes in vocal function following a structured voice therapy program.
Participant
The subject was a seventy-year-old man who was referred to an ENT specialist because of five months of hoarseness. He reported having smoked one pack (20 cigarettes) daily for the last fifty years. Following stroboscopic assessment of the vocal folds and biopsy, the patient was diagnosed with laryngeal cancer. As the cancer was at stage T4, the otolaryngologist recommended total laryngectomy. The laryngectomy was performed in February 2017 and the tracheoesophageal puncture in July (secondary TE puncture). Traditionally, a secondary procedure is selected for patients at higher risk of complications, as it allows more time for sufficient healing of the laryngostoma before the tracheoesophageal puncture is created [20]. A Provox prosthesis was implanted at the time of the puncture.
Ten days after the tracheoesophageal puncture and implantation of the prosthesis, he was still unable to phonate, so voice therapy was recommended. The patient had good cognitive abilities and was able to complete the questionnaires (personal information and voice-related quality of life).
Voice therapy program
A structured protocol, shown in Table 1, was used to conduct the voice rehabilitation at the Imam Reza hospital, Mashhad, Iran. Voice rehabilitation was carried out by a trained speech-language pathologist (SLP) in the research group. It comprised 8 specified voice rehabilitation sessions of 30 minutes each and consisted of relaxation, respiration, posture, and phonation training.
Treatment plan | Session number |
Basic exercises: relaxation, posture, and breathing. Emphasis on finding abdominal activity in breathing. Explanation of voice physiology. Beginning phonation with syllables starting with an unvoiced fricative (such as ha). | 1 |
Repetition of the first session; phonation is expanded to voiced sounds and syllables. | 2 |
Repetition of basic exercises; phonation exercises continue with repeated syllables (e.g., baba) and short words. Beginning production of short phrases. | 3 |
Repetition and expansion of the exercises from session 3. Intonation exercises and production of stressed syllables begin. | 4 |
Phonation accompanied by physical movements (such as hand movement or walking). Longer phrases. | 5 |
Repetition of previous tasks. Attention to words and phrases of different lengths, with emphasis on appropriate resonance. Articulation exercises to produce relaxed speech. | 6 |
Use of learned techniques in reading and conversation. Focus on suitable pausing and eye contact with the listener (pragmatic skills). | 7 |
Focus on loudness as well as intelligible speech. | 8 |
Table 1. Descriptions of the Voice Rehabilitation Sessions [8, 21].
Data collection and measurements
All assessments and analyses were performed in the first and eighth sessions of therapy and 2 weeks after discharge (follow-up session). The exception was the acoustic analysis, which was performed in the third session instead of the first, at the point when the patient was becoming able to sustain the vowel /a/ for at least 3 seconds.
Acoustic analysis
Recording and acoustic voice analysis were carried out using the Praat program [22] at a sampling frequency of 44.1 kHz, using 3-second samples from the middle of the vowel /a/. Recordings were made in a soundproof room with a microphone (Sony ECM-MS907) placed 10 cm from the mouth.
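As an illustration of the sampling step described above, the following minimal Python sketch locates a 3-second window centred in a sustained vowel recorded at 44.1 kHz. The function name `middle_window` and the 5-second example recording are hypothetical; the study itself used Praat, and its exact selection procedure is not described.

```python
def middle_window(n_samples, sample_rate=44_100, window_s=3.0):
    """Return (start, end) sample indices of a window centred in the recording.

    n_samples   -- total number of samples in the sustained vowel
    sample_rate -- sampling frequency in Hz (44.1 kHz in the text)
    window_s    -- window length in seconds (3 s in the text)
    """
    window = int(round(window_s * sample_rate))
    if n_samples < window:
        raise ValueError("recording is shorter than the requested window")
    start = (n_samples - window) // 2
    return start, start + window


# Hypothetical example: a 5-second sustained /a/ at 44.1 kHz.
start, end = middle_window(5 * 44_100)
print(start, end)
```

Taking the window from the middle of the vowel avoids the onset and offset of phonation, where the signal tends to be least stable.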
From this recording, the following acoustic measures were obtained: mean fundamental frequency (F0), shimmer, jitter, and harmonics-to-noise ratio (HNR).
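The text does not state which variants of these measures were computed. As a rough guide, the sketch below uses commonly cited definitions: local jitter in percent, local shimmer in dB, and HNR in dB derived from the normalised autocorrelation peak. The function names and input values are illustrative assumptions, not the authors' analysis settings.

```python
import math


def jitter_local_percent(periods):
    """Mean absolute difference of consecutive periods / mean period, in %."""
    diffs = [abs(b - a) for a, b in zip(periods, periods[1:])]
    mean_diff = sum(diffs) / len(diffs)
    mean_period = sum(periods) / len(periods)
    return 100.0 * mean_diff / mean_period


def shimmer_db(amplitudes):
    """Mean absolute 20*log10 ratio of consecutive peak amplitudes, in dB."""
    steps = [abs(20.0 * math.log10(b / a))
             for a, b in zip(amplitudes, amplitudes[1:])]
    return sum(steps) / len(steps)


def hnr_db(r):
    """HNR in dB from r, the normalised autocorrelation at the pitch period
    (0 < r < 1), read as the fraction of signal energy that is periodic."""
    return 10.0 * math.log10(r / (1.0 - r))


# Illustrative (made-up) glottal-cycle data, not the patient's recording.
periods = [0.010, 0.011, 0.010, 0.011]   # seconds per cycle
amplitudes = [0.80, 0.70, 0.85, 0.75]    # peak amplitude per cycle
print(jitter_local_percent(periods), shimmer_db(amplitudes), hnr_db(0.99))
```

Under these definitions, larger cycle-to-cycle variation in period raises jitter, larger variation in amplitude raises shimmer, and a noisier signal lowers HNR.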
Aerodynamic analysis
Maximum phonation time (MPT) is defined as the longest time a person can sustain a vowel on one exhalation; the most commonly used vowel is /a/ [26].
Prosody of speech
The PEPS-C (Profiling Elements of Prosody in Speech-Communication) test is a prosody assessment procedure [27]. The Persian version was provided by Ghorbani, Khoddami et al. [28]. In this test, both receptive and expressive abilities are examined in analogous tasks; however, only the expressive component was used in the current study (Table 2).
PEPS-C Subtest | Prosodic Skill Assessed | Example |
Expressive Affect | Production of intonation indicating like or dislike at word level (1–2 syllables) | Provided with a visual cue (e.g., picture of apple and a happy face). Response: Pronounce ‘apple’ with intonation indicating pleasure |
Expressive Contrastive Stress | Production of intonation indicating emphatic stress at phrase level (6–7 syllables) | Auditory cue (e.g., ‘The BLUE sheep has the ball’) presented with a visual cue (e.g., a picture of a red sheep). Response: Participant is required to say ‘No, the RED sheep has the ball’ |
Expressive Chunking | Production of prosodic phrasing at phrase level (6–7 syllables) | Presented with a visual cue (a picture of 3 nouns such as ‘fruit, salad, and cream’). Response: Pronounce the items from the picture in a phrase |
Expressive Turn-End | Production of intonation indicating a question or statement at word level (1–2 syllables) | Provided with a visual cue (e.g., picture of apple and a question mark). Response: Pronounce ‘apple’ with rising intonation |
Table 2. Expressive Prosodic Skills [29]
Voice related quality of life
The Voice-Related Quality of Life (V-RQOL) measure is a 10-item questionnaire completed by the patient. It quantifies the magnitude of voice-related problems experienced by patients [30]. The questionnaire was created by Hogikyan and Sethuraman [31] and has been adapted into Persian [32].
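The text does not say how the V-RQOL was scored. The sketch below assumes the commonly described scheme, in which each of the 10 items is rated from 1 to 5 and the raw total (10 to 50) can be converted to a 0 to 100 standard score on which higher values indicate better quality of life. Both function names are hypothetical, and the conversion is an assumption rather than the authors' stated procedure.

```python
def vrqol_raw_total(responses):
    """Sum ten item ratings; each rating is assumed to be an integer 1..5."""
    if len(responses) != 10 or any(r not in (1, 2, 3, 4, 5) for r in responses):
        raise ValueError("expected ten ratings between 1 and 5")
    return sum(responses)


def vrqol_standard_score(raw_total, n_items=10, max_rating=5):
    """Rescale a raw total to 0..100 (assumed conversion; higher = better)."""
    highest_raw = n_items * max_rating
    return 100.0 - (raw_total - n_items) / (highest_raw - n_items) * 100.0


# Made-up ratings for illustration only, not the patient's responses.
raw = vrqol_raw_total([3, 2, 4, 3, 2, 3, 4, 2, 3, 4])
print(raw, vrqol_standard_score(raw))
```

On the raw scale a lower total means fewer reported voice problems, so a falling raw score and a rising standard score describe the same improvement.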
The participant's scores on each of the evaluated parameters are presented in Figures 1-4. For the acoustic parameters (Figure 1), point 1 is the third session, point 2 is the eighth session, and point 3 is two weeks after discharge (follow-up session). In the other figures, point 1 is the first session, point 2 is the eighth session, and point 3 is two weeks after discharge (follow-up session).
Figure 1: Analysis of acoustic features (fundamental frequency (Hz), jitter (%), shimmer (dB), and harmonics-to-noise ratio (%)).
Figure 2: Analysis of expressive prosodic features.
Figure 3: Analysis of maximum phonation time.
Figure 4: Analysis of voice related quality of life (V-RQOL).
In this study, we examined one patient with a tracheoesophageal voice prosthesis. The patient’s voice during phonation of the vowel /a/ was recorded in the third and eighth sessions of voice therapy and two weeks after discharge. F0, jitter, shimmer, and HNR were then extracted, and the mean values were compared. MPT and V-RQOL scores were recorded before voice therapy, immediately after its completion, and again two weeks later.
Acoustic parameters
In the first part, the mean F0 value was compared before and after treatment. F0 is generally related to the amount of elasticity and stiffness of vocal folds [33].
Arias et al. reported that the fundamental frequency of patients who use a voice prosthesis was lower than that of people with normal voice (97.59 vs. 120.30 Hz) [17]. The F0 of our subject (270 Hz) was higher than the mean F0 of people of the same age and sex (127 Hz) [34, 35]. After the voice therapy sessions and in the follow-up session, F0 decreased, moving closer to the mean fundamental frequency of people with normal voice.
The second part of this study focused on perturbation of frequency (jitter) and of amplitude (shimmer). Shimmer reflects cycle-to-cycle variation in the amplitude of the sound wave and appears to be a fundamental acoustic factor that is readily perceived by listeners [36]. It increases with poor and inconsistent contact between the vocal fold edges [37]. Acoustic analysis of tracheoesophageal voice has found higher shimmer values [38], and in the study by Arias (2000) the difference in shimmer between the normal-voice group and the phonatory prosthesis group was notable [17]. These speakers use the esophagus for vocalization; because the esophagus does not have the same elasticity as the vocal folds, it does not make good contact, and shimmer is therefore higher than in people who use the vocal folds.
In our subject, shimmer was high (24 dB) in the third session of therapy and decreased to 21.55 dB after the eighth session. Although this suggests that voice therapy benefits the tracheoesophageal prosthetic voice, the value remains well above the normal range, reported as 0.31-0.47 dB [8]. In the follow-up session shimmer increased again, indicating the need for more voice therapy sessions. Our findings are in line with other research reporting a significant difference in shimmer between normal voices and the voices of people who use a prosthesis [10, 13]. For instance, Deore et al. reported shimmer of 6.77% in TEP patients compared with 0.95% in people with normal voice [10].
Jitter is associated with roughness [39]. Jitter in our participant was 4.9% in the third session of voice therapy. After the eighth session and in the follow-up session, it was 6.21% and 6.47%, respectively. By comparison, Arias (2000) reported a jitter value of 3.96% in total laryngectomy patients who had a phonatory fistuloplasty with a Herrmann voice prosthesis [17]. Kazi et al. and Deore et al. reported values of 5.9% and 2.18%, respectively. All of these jitter values are higher than the normal values reported by Tuomi et al. [8].
Jitter is affected primarily by a lack of control over vocal fold vibration [23]. Phonating with esophageal tissue instead of the vocal folds may explain the higher jitter in people who use a prosthesis. In addition, given the association between jitter and roughness, which may result from the effort the subject puts into phonation, continued voice therapy sessions appear necessary to reduce or eliminate roughness and to gain more control over phonation.
HNR is used to assess the ratio between the periodic and non-periodic components of a segment of voiced speech [25]. In our study, the patient's mean HNR was 1.4, 1.55, and 1.28 in the third, eighth, and follow-up sessions, respectively. In a similar study, in which the average age of participants was 61.3 years, the average HNR of tracheoesophageal speech was 4.28±3.83 [16]. Tracheoesophageal voice contains a large proportion of noise relative to harmonic components. Because these patients do not phonate with a larynx, HNR appears to be the main challenge for them.
Aerodynamic parameter
MPT has been used to quantify the severity of dysphonia and to document the outcomes of voice therapy [40]. Laryngeal speakers have a longer MPT than patients with a Provox voice prosthesis [15]. Our participant sustained the vowel /a/ for just 1 second in the first voice therapy session, which increased to 6 and 10 seconds in the eighth and follow-up sessions, respectively. Siric et al. reported a mean MPT of 6.92±5.44 s for TE speakers [14]. MPT has also been reported as higher than 7 s [1], 6.86 s [10], and 8-28 s [18] in TE speakers, which is in line with the results of the current study. The shorter MPT of TE speakers can be explained by reduced breath support due to varying amounts of air leakage at the stoma occlusion [10].
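The session values quoted in this discussion for jitter, HNR, and MPT can be compared with a short calculation. The sketch below is illustrative only: the dictionary keys and the helper `change_from_baseline` are hypothetical names, and the numbers are those stated in the text for the initial, eighth-session, and follow-up measurements.

```python
# Values as reported in the text: (initial, eighth session, follow-up).
reported = {
    "jitter_percent": (4.9, 6.21, 6.47),
    "hnr": (1.4, 1.55, 1.28),
    "mpt_seconds": (1.0, 6.0, 10.0),
}


def change_from_baseline(values):
    """Difference of each later measurement from the first one."""
    baseline = values[0]
    return [v - baseline for v in values[1:]]


for name, values in reported.items():
    deltas = change_from_baseline(values)
    print(name, [round(d, 2) for d in deltas])
```

A positive change is an improvement for MPT and HNR but a deterioration for jitter, so the sign of each difference has to be read per parameter.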
Prosodic features
As reported in the literature, the quality of alaryngeal speech is low, with a loss of prosodic features [38, 41]. Haderlein et al. also reported that the prosodic features of tracheoesophageal speakers and laryngeal speakers differed significantly [42]. We examined Expressive Turn-End, Expressive Affect, Expressive Chunking, and Expressive Contrastive Stress as components of expressive prosody using the PEPS-C software. These items improved after treatment, and the patient maintained the improvement in the follow-up session.
Voice related quality of life
The V-RQOL is a widely used self-report questionnaire that measures the effects of voice disorders on patients’ quality of life [30].
The decrease in V-RQOL scores from the first session to the eighth and follow-up sessions indicated that voice therapy was effective in improving quality of life for a patient with a Provox prosthesis. In line with the present study, Kazi et al. reported a significant to severe voice handicap in 40% of female laryngectomy patients compared with only 20% of male patients [13]. Van Gogh also reported an improvement on the Voice Handicap Index (VHI) in patients receiving voice therapy [40].
Conclusion
Our results indicate that structured voice rehabilitation following placement of a Provox prosthesis helps patients improve their communication abilities, especially in prosodic features, MPT, and quality of life. Structured voice therapy can therefore form part of the voice rehabilitation program for people who have undergone laryngectomy and use a voice prosthesis, as it does for some other voice disorders in which such programs are followed [8].
Further investigation into structured voice rehabilitation for individuals who have undergone laryngectomy is warranted, as is active participation and diligent effort on the part of the surgeon, speech pathologist, patients, and their families.
Acknowledgment
This work was supported by Mashhad University of Medical Sciences (MUMS) [grant no. 951193]. MUMS had no further role in study design; in the collection, analysis, and interpretation of data; in the writing of the report; or in the decision to submit the paper for publication.