How can vocal folds oscillate with a limited mucosal wave?

Self-sustained vocal fold vibration is possible with either or both of two mechanisms: (1) a mucosal wave propagating along the medial surface of the vocal folds and (2) a vocal tract that offers inertive reactance. A quantitative comparison shows the mucosal wave mechanism has a lower threshold pressure and a higher glottal efficiency, but the supraglottal inertance mechanism can assist in the oscillation and is effective in optimizing the two mechanisms. It is concluded that optimal parameters are a mucosal wave velocity on the order of 1 m/s and a diameter of the larynx canal (epilarynx tube) on the order of 0.8 cm.


Introduction
This express letter contains no new theories of vocal fold oscillation. Rather, it is intended for recognition and conceptualization of two independent mechanisms of self-sustained oscillation that are not often compared side-by-side. The simplest possible computational model is chosen for the phenomena of interest, one in which all the mathematical detail is provided with introductory-level aerodynamics and acoustics. All equations are ordinary differential equations. The model does not cover a broad range of frequency, intensity, or voice quality, nor is it intended for application to specific voice disorders. The primary purpose is to raise awareness about source-airway interaction. The paper should be useful to newcomers to the field, and it might be useful to clinicians who contemplate reconstructive surgery or behavioral intervention.
Two fundamentally different mechanisms of self-sustained oscillation of the vocal folds have been described mathematically for self-oscillating valves in airways (Titze, 1988; Fletcher, 1993). One is independent of the airway, while the other is critically dependent on the airway. Here, we will refer to them as the mucosal wave (MW) mechanism and the supraglottal inertance (SI) mechanism. They usually co-exist, but one or the other may be dominant.
The MW mechanism requires an upward-moving MW on the medial surface of the vocal folds (Titze, 1988; Lucero et al., 2011). The surface wave must propagate slowly enough so that inferior and superior portions of the vocal fold move out of phase (a half-wavelength mode pattern). This out-of-phase tissue movement produces alternating convergent-divergent glottal shapes, for which the glottal pressures offer an asymmetric force that is favorable for net energy transfer from the airstream to the vocal folds (Thomson et al., 2005; Li et al., 2006; Motie-Shirazi et al., 2021). If this energy transfer is sufficient to overcome the viscous losses in vocal fold movement and deformation (including collision), self-sustained oscillation is achievable. However, if the mucosal tissue is not sufficiently pliable to support a slow-moving surface wave, the energy transfer diminishes, and self-sustained oscillation is more difficult to achieve. Most clinical approaches are focused on preserving or restoring the MW mechanism (e.g., Khosla et al., 2008).
The SI mechanism, which can exist on its own or co-exist with the MW mechanism, involves the acoustic interaction of the airway with the vocal folds. It was first quantified with a computational model by Flanagan and Landgraf (1968). Synchronization is required between acoustic pressures in the vocal tract and surface pressures on the vocal folds. Much less is known about this mechanism in basic voice physiology than the MW mechanism, but it is well described in theories of oscillating valves and wind-instrument acoustics (Fletcher, 1993). It requires the airway of the vocal tract above the glottis to be acoustically inertive (air movement lagging in response to applied supraglottal pressure). In both mechanisms, a "strong push-weak push" alternates on the medial surface of the vocal folds to transfer energy from the airstream to the tissue.
For phonosurgery and voice therapy, a relevant question becomes whether or not airway modifications can be used to augment, or be a substitute for, un-achievable vocal fold tissue repair. The label "supraglottal hyperfunction" has been attached to compensatory behaviors that patients often exhibit when vocal fold tissues do not vibrate normally with the MW mechanism. Increasing vocal tract inertance requires narrowing a portion of the vocal tract, which may be interpreted as being similar to hyper-adduction of the vocal folds. Galindo et al. (2017) addressed the possibility of phonotrauma with source-filter interaction that included the larynx canal. Zhang (2021) showed that vocal fold contact pressures are affected by the epilaryngeal configuration. In a beneficial direction, Yanagisawa et al. (1989) showed that an aryepiglottic constriction contributed to a desirable ringing voice quality in operatic singing. Döllinger et al. (2006) and Döllinger et al. (2012) showed that vocal fold dynamics are influenced by the epilaryngeal (larynx canal) area. Kniesburges et al. (2017) have shown that the ventricular folds affect oscillation conditions. It will be shown here that optimizing the larynx canal diameter can facilitate self-sustained oscillation.
The limitations of this brief analysis are significant. A low-frequency analysis is chosen that contains no wave propagation and, therefore, no vowel effects. Concatenation of resistances and inertances of various airway sections is only an approximation. There is also no broad exploration with tissue morphology and lung pressure. These choices are deliberate, however, to focus on the nature of the self-oscillation phenomena with as few variables as possible.

Theoretical background
For vocal fold vibration, a MW has been described with a medial surface displacement $\xi$ that varies in the vertical ($z$) direction as a traveling wave (Titze, 1988),
$$\xi(z, t) = \xi(t - z/v_m), \qquad (1)$$
where $v_m$ is the MW velocity. It was then shown that by conducting a Taylor series expansion around a center point of the medial surface, the lower and upper margin displacements can be written as
$$\xi_1 = \xi + \tau v_g, \qquad (2)$$
$$\xi_2 = \xi - \tau v_g, \qquad (3)$$
where $\xi$ is the center vocal fold tissue displacement, and $v_g$ is the center vocal fold tissue velocity $d\xi/dt$. The variable $\tau$ is a delay time, defined in terms of the vocal fold thickness $T$ and the MW velocity $v_m$ as
$$\tau = \frac{T}{2 v_m}. \qquad (4)$$
Two velocities, a lateral tissue velocity $v_g$ and a vertical wave velocity $v_m$, define the upper and lower surface movement of the vocal folds. Hirano et al. (1981) have measured $v_m$ to be on the order of 1 m/s. It is largely determined by the elastic shear properties of the vocal fold mucosa. Without the vertical time delay $\tau$, there can be no alternating convergent and divergent glottal shapes. The time delay produces strong driving pressures for glottal opening and weak driving pressures for glottal closing (Li et al., 2006). Hence, $\tau$ is the key variable for the MW mechanism of self-sustained oscillation. No net energy can be imparted to the vocal folds by the glottal airflow with $\tau = 0$ because the driving pressures in the opening and closing phases cancel each other.
The SI mechanism of self-sustained oscillation is based on the discovery that an inertive air column above the vocal folds produces a positive pressure if glottal flow is increasing and a negative pressure if glottal flow is decreasing, which is expressed mathematically as
$$P = I \frac{dU}{dt}, \qquad (5)$$
where $P$ is the supraglottal pressure, $I$ is a quantity known as inertance, and $U$ is airflow into the vocal tract. Note that the rate of change of flow $dU/dt$ is positive if glottal flow is increasing (glottal opening) and negative if glottal flow is decreasing (glottal closing). This supraglottal pressure can transfer into the glottis and produce a "strong push-weak push" on the vocal folds for self-sustained oscillation.
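The sign behavior of the inertive pressure can be illustrated with a minimal Python sketch. It assumes a purely sinusoidal glottal flow, and the inertance, mean flow, and flow amplitude are hypothetical illustration values, not parameters from the paper.

```python
import math

# Hypothetical illustration values; none are taken from the paper's Table 1.
INERTANCE = 0.005   # g/cm^4, assumed supraglottal inertance
F0 = 159.0          # Hz, oscillation frequency
FLOW_MEAN = 150.0   # cm^3/s, assumed mean glottal flow
FLOW_AMP = 100.0    # cm^3/s, assumed oscillating flow amplitude


def flow_derivative(t):
    """dU/dt for the assumed flow U(t) = FLOW_MEAN + FLOW_AMP * sin(2*pi*F0*t)."""
    omega = 2.0 * math.pi * F0
    return FLOW_AMP * omega * math.cos(omega * t)


def inertive_pressure(t, inertance=INERTANCE):
    """Pressure from an inertive air column, P = I * dU/dt, in dyn/cm^2."""
    return inertance * flow_derivative(t)


period = 1.0 / F0
p_rising = inertive_pressure(0.0)            # flow increasing (glottal opening)
p_falling = inertive_pressure(period / 2.0)  # flow decreasing (glottal closing)
print(f"flow rising:  P = {p_rising:+.1f} dyn/cm^2")
print(f"flow falling: P = {p_falling:+.1f} dyn/cm^2")
```

The pressure is positive while flow is rising and negative while it is falling, which is the asymmetry the SI mechanism relies on.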
ARTICLE asa.scitation.org/journal/jel
Methods
Figure 1 shows an airflow and pressure diagram of the airway system for low frequencies, for which air compressibility is negligible. Wave propagation and acoustic vocal tract resonances are therefore neglected for frequencies on the order of 100-200 Hz. Incompressible airflow with acceleration and deceleration is included, however, because it has a major effect on vocal fold oscillation. Consider $L$ to be the length of the vocal folds, $T$ the thickness, $M$ the vocal fold mass, $K$ the stiffness, $B$ the damping coefficient (with a damping ratio 0.1), and $P_g$ the mean surface driving pressure. Pressures, resistances, inertances, and flows are as indicated in Fig. 1. The ordinary differential equations for the circuit are given in Eqs. (6)-(8), where $P_L$ is the lung pressure, $P_{tg}$ is the transglottal pressure, and $P_o$ is the oral pressure. The remaining quantities are resistances and inertances to be described below. Equation (7) is a second-order differential equation, which can be broken up into two first-order equations. Then the four first-order equations are solved with a fourth-order Runge-Kutta method. The independent variable is time, and the four dependent variables are vocal fold center displacement $\xi$, vocal fold center velocity $v_g = d\xi/dt$, glottal entry airflow $U_1$, and radiation airflow $U_r$. A continuity equation for glottal exit flows is given in Eq. (9).
The lumped-element resistances for low frequency can all be computed with a formula developed by Smith and Titze (2016) for tubes of varying lengths and diameters, Eq. (10), where $L_t$ is the tube length in m, $D$ is the tube diameter in m, and $U$ is the airflow in liters/s. The first term is the kinetic component, and the second is the viscous component. The density and viscosity of air are numerically included in the coefficients. The resistance $R$ is expressed in Pa per liters/s, with a mean accuracy of ±6% according to the authors.
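The step of breaking a second-order equation into two first-order equations and integrating with a fourth-order Runge-Kutta method can be sketched as follows. This is a minimal illustration for an unforced mass-spring-damper only, not the full four-variable airway circuit. The mass and stiffness are hypothetical values chosen so that $(K/M)^{1/2}/2\pi$ is about 159 Hz, the damping ratio of 0.1 is taken from the text, and the function names are assumptions.

```python
import math

# Hypothetical tissue parameters in cgs units (only the damping ratio is from the text).
M = 0.1                             # g, assumed vocal fold mass
K = 1.0e5                           # dyn/cm, assumed stiffness; sqrt(K/M)/(2*pi) ~ 159 Hz
ZETA = 0.1                          # damping ratio stated in the text
B = 2.0 * ZETA * math.sqrt(M * K)   # damping coefficient, g/s


def oscillator(t, y, force=0.0):
    """M*xi'' + B*xi' + K*xi = force, written as two first-order equations."""
    xi, v = y
    return [v, (force - B * v - K * xi) / M]


def rk4_step(f, t, y, h):
    """Advance the state y by one classical fourth-order Runge-Kutta step of size h."""
    k1 = f(t, y)
    k2 = f(t + h / 2, [yi + h / 2 * ki for yi, ki in zip(y, k1)])
    k3 = f(t + h / 2, [yi + h / 2 * ki for yi, ki in zip(y, k2)])
    k4 = f(t + h, [yi + h * ki for yi, ki in zip(y, k3)])
    return [yi + h / 6 * (a + 2 * b + 2 * c + d)
            for yi, a, b, c, d in zip(y, k1, k2, k3, k4)]


def simulate(xi0, v0, h, n_steps):
    """Integrate the free response from the initial state (xi0, v0)."""
    t, y = 0.0, [xi0, v0]
    for _ in range(n_steps):
        y = rk4_step(oscillator, t, y, h)
        t += h
    return t, y


t_end, (xi_end, v_end) = simulate(xi0=0.01, v0=0.0, h=1.0e-5, n_steps=1000)
print(f"t = {t_end * 1e3:.1f} ms, xi = {xi_end:.6f} cm, v = {v_end:.3f} cm/s")
```

In the model described above, the right-hand side would also include the driving pressure term and the two airflow equations; here the force is set to zero so that the result can be compared with the known decaying-sinusoid solution.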
The lumped-element inertances for tubes have a simpler relation,
$$I = \frac{\rho L}{A}, \qquad (11)$$
where $\rho$ is the density of air, $L$ is the length, and $A$ is the cross-sectional area of an equivalent circular airway section (Titze and Palaparthi, 2016). With Eq. (10), the tracheal resistance $R_t$, the larynx canal resistance $R_e$ (the subscript denoting epi-larynx), the supraglottal tract resistance $R_s$, and the lip resistance $R_L$ can be computed. With Eq. (11), the corresponding circuit inertances $I_t$, $I_e$, $I_s$, and $I_L$ can be computed. The dimensions are provided in Table 1. The radiation resistance and inertance are given in Eqs. (12) and (13), respectively, where $c$ is the speed of sound, $D_L$ is the lip diameter, and $A_L$ is the lip area.
As shown earlier in Eqs. (2)-(4), the mucosal wave on the medial surface of the vocal folds produces a time advancement $\tau$ at the lower margin of the vocal folds and an equivalent time delay $\tau$ at the upper margin. The entry, exit, and center glottal areas are given in Eqs. (14)-(16), where $x_{01}$ and $x_{02}$ are the lower (entry) and upper (exit) pre-oscillation positions. The transglottal pressure is taken from an average glottal resistance of 4 kPa per liter/s measured on human subjects across gender, loudness, and adduction (Konnai et al., 2017). Expressed in cgs units, which are used for convenience in the calculations, this resistance becomes 40 dyn/cm$^2$ per cm$^3$/s, which yields the transglottal pressure in Eq. (17). Subglottal and supraglottal pressures are given in Eqs. (18) and (19).
Finally, the vocal fold driving pressure is computed for five different glottal conditions. Beginning with the Bernoulli energy equation for pressures upstream of a flow detachment point, the pressure is written in terms of $P_{kd}$, the kinetic pressure ($\tfrac{1}{2}\rho v^2$) at the detachment area $A_d$ (assumed to be at the center of the glottis for a divergent glottis and at glottal exit for a convergent glottis). $P_{kd}$ can further be expressed in terms of the transglottal pressure and a pressure recovery coefficient $k_e$ from detachment to glottal exit. In these expressions, $A_{min}$ is the minimum allowed glottal area, $A_e$ is the larynx canal (epilarynx) area, $A_g$ is the area at the center of the glottis, and $P_g$ is the vocal fold driving pressure for Eq. (2). A separate expression for $P_g$ applies to each of the five conditions:
$A_{g1} > A_{g2}$ and $A_{g2} \geq A_{min}$ (glottis open and convergent);
$A_{g1} \leq A_{g2}$ and $A_{g1} \geq A_{min}$ (glottis open and divergent);
$A_{g1} > A_{min}$ and $A_{g2} < A_{min}$ (glottis closed and convergent);
$A_{g1} < A_{min}$ and $A_{g2} > A_{min}$ (glottis closed and divergent);
$A_{g1} < A_{min}$ and $A_{g2} < A_{min}$ (glottis closed top and bottom).
Table 1 shows the nominal values of the parameters in the equations above. The bolded values are the critical ones that were varied over a range. The last row in Table 1 shows the inertances in g/cm$^4$ calculated with Eq. (11).
For the nominal 0.8 cm diameter and 2.5 cm length, the larynx canal (epilarynx tube) has the highest inertance. The trachea and the supraglottal tract have a slightly lower inertance due to their larger diameters, but greater lengths nearly equalize the inertances. The radiation inertance is an order of magnitude lower. The resistances are not tabulated because they are all airflow-dependent.
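As a rough numerical check, the larynx canal inertance can be sketched in Python, assuming the standard lumped-element form in which inertance equals air density times tube length divided by cross-sectional area. The air density of 0.00114 g/cm$^3$ is an assumed value for warm, humid air and is not stated in the text. Only the larynx canal dimensions quoted above are used, because the other tube dimensions are in Table 1, which is not reproduced here.

```python
import math

AIR_DENSITY = 0.00114  # g/cm^3, assumed value (not given in the text)


def tube_inertance(length_cm, diameter_cm, density=AIR_DENSITY):
    """Lumped inertance of a circular tube, density * length / area, in g/cm^4."""
    area = math.pi * (diameter_cm / 2.0) ** 2
    return density * length_cm / area


# Nominal larynx canal (epilarynx tube) from the text: 0.8 cm diameter, 2.5 cm length.
epilarynx_inertance = tube_inertance(length_cm=2.5, diameter_cm=0.8)
print(f"larynx canal inertance ~ {epilarynx_inertance:.5f} g/cm^4")
```

Because area scales with the square of the diameter, halving the diameter at fixed length quadruples the inertance, which is why a narrow larynx canal can dominate the longer but wider trachea and supraglottal tract.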

Results
Two parameters were varied over a wide range in the model, the larynx canal diameter $D_e$ and the MW velocity $v_m$. These two distinguish the SI mechanism from the MW mechanism. The adduction parameters $x_{01}$ and $x_{02}$ were chosen to be 0.04 cm to approximate typical airflow rates for the glottal resistance and the larynx canal resistance (Titze, 2021; Zhang, 2021). In the current analytical model, the dynamically varying glottal resistance is also determined by glottal airflow [Eq. (17)].

SI mechanism
To test the SI mechanism, 25 variations of larynx canal diameter $D_e$ were produced while the MW velocity was held at a high value of 12 m/s. This wave velocity produced minimal phase delay between upper and lower edge movement of the vocal folds, as seen in the upper left graph of the left panels in Fig. 2. The difference between the lower and upper glottal areas, $A_{g1} - A_{g2}$, is plotted. This difference was never greater than 0.05 cm$^2$ in either part of the glottal cycle (positive or negative). With a 0.6 cm vocal fold thickness, the convergence-divergence angle was less than 4.8°. According to Eq. (4), the time delay was 0.25 ms. With a fundamental frequency of vibration $(K/M)^{1/2}/2\pi = 159$ Hz, the fundamental period was 6.29 ms, resulting in a phase delay of 0.25/6.29 of a period, or 14°. This is small compared to the 180° top-to-bottom (or 90° middle-to-top) phase delay often observed in vocal fold vibration.
Note that the oscillation was slow to reach a steady state with vocal fold collision. It took about 150 ms. Voice onset time is a measure of "ease of phonation." If it takes more than a few cycles, the lung pressure is only slightly above threshold. Pressures greater than 1.0 kPa did reduce the onset time, but they are not plotted here due to space limitations. In the remaining three graphs of the left four panels, it is seen that airflow peaked at about 0.3 liters/s, vocal tract input pressure oscillated between about $-0.1$ and $+0.1$ kPa, and oral pressure was about 20 times lower than vocal tract input pressure. This result is due to losses along the airway. The signals are all nearly sinusoidal because there is no wave propagation (and hence no vocal tract resonance) for this low-frequency analysis with incompressible airflow.
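The delay and phase figures quoted for the 12 m/s case can be reproduced with a short Python sketch. It assumes that the delay time is the vocal fold thickness divided by twice the MW velocity, the form that yields the quoted 0.25 ms. The stiffness and mass are hypothetical values chosen only so that their ratio gives 159 Hz.

```python
import math


def mucosal_wave_delay(thickness_m, wave_velocity_m_s):
    """Delay time in seconds, assumed to be T / (2 * v_m)."""
    return thickness_m / (2.0 * wave_velocity_m_s)


def natural_frequency_hz(stiffness, mass):
    """Fundamental frequency sqrt(K/M) / (2*pi) in Hz."""
    return math.sqrt(stiffness / mass) / (2.0 * math.pi)


def phase_delay_degrees(delay_s, frequency_hz):
    """Delay expressed as a fraction of one period, in degrees."""
    return 360.0 * delay_s * frequency_hz


thickness = 0.006  # m (0.6 cm, from the text)
delay = mucosal_wave_delay(thickness, wave_velocity_m_s=12.0)
f0 = natural_frequency_hz(stiffness=1.0e5, mass=0.1)  # hypothetical K and M
phase = phase_delay_degrees(delay, f0)
print(f"delay = {delay * 1e3:.2f} ms, f0 = {f0:.0f} Hz, phase = {phase:.0f} deg")
```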
The right four panels of Fig. 2 show post hoc calculations for a group of 25 simulations in which the larynx canal diameter $D_e$ was varied from 0.4 to 1.2 cm, a range that bracketed cases where glottal closure occurred ($D_e$ = 0.5-0.8 cm). The MW velocity was kept constant at the high value of 12 m/s. Very small larynx canal diameters ($D_e \leq 0.3$ cm) did not produce self-sustained oscillation with the 1.0 kPa lung pressure. On the other extreme, diameters larger than 0.8 cm also required larger than 1.0 kPa lung pressure. Data circles on the graphs indicate where the waveforms on the left panels were selected. As seen in the figure, mean glottal airflow was lowest in the region where collision occurred, while oscillating glottal airflow, oscillating oral pressure, and glottal efficiency all increased slightly with $D_e$. Glottal efficiency reached a peak at 0.8 cm and then declined with declining oscillating pressures. This efficiency was calculated as the ratio of RMS output power $P_o^2/R_r$ to the aerodynamic power $P_L U_1$ available from the lungs. An important result is that glottal efficiency can be optimized with a mid-range larynx canal (epilarynx) diameter. For smaller diameters, the airway resistances (and corresponding energy losses) are too high, while for large diameters, the mean airflow is too high and the inertances are too low.
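The efficiency ratio described above can be written as a small Python function. This is a sketch under stated assumptions: all numerical inputs are hypothetical cgs values rather than simulation outputs, and the mean glottal flow is used for the flow term in the aerodynamic power.

```python
import math


def rms(samples):
    """Root-mean-square value of a sequence of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def glottal_efficiency(oral_pressure_rms, radiation_resistance,
                       lung_pressure, glottal_flow):
    """Ratio of radiated power (P_o^2 / R_r) to aerodynamic power (P_L * U_1)."""
    output_power = oral_pressure_rms ** 2 / radiation_resistance
    aerodynamic_power = lung_pressure * glottal_flow
    return output_power / aerodynamic_power


# Hypothetical values: one period of a 50 dyn/cm^2 sinusoidal oral pressure,
# a radiation resistance of 2 dyn s/cm^5, 1 kPa (1e4 dyn/cm^2) lung pressure,
# and 200 cm^3/s mean glottal flow.
oral_pressure = [50.0 * math.sin(2.0 * math.pi * k / 100.0) for k in range(100)]
efficiency = glottal_efficiency(rms(oral_pressure), radiation_resistance=2.0,
                                lung_pressure=1.0e4, glottal_flow=200.0)
print(f"glottal efficiency = {efficiency:.2e}")
```

Because output power grows with the square of the oral pressure while aerodynamic power grows linearly with mean flow, a configuration that raises oscillating pressure without raising mean flow improves efficiency, which is consistent with the mid-range optimum described above.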

MW mechanism
The second experiment involved the MW velocity $v_m$ as the parameter. This parameter was more effective in producing self-sustained oscillation because the phonation threshold pressure was lower. As Fig. 3 (left four panels) shows, onset time to vocal fold collision was only about 20 ms, or about 5 cycles. The larynx canal diameter was held steady at a large value of 2.0 cm, while the MW velocity was a mid-range value of 2.0 m/s. The scale of the waveforms was kept the same as in Fig. 2. The most striking difference is the ($A_{g1} - A_{g2}$) waveform in the top left graph of the left panels. This indicates that alternating convergent-divergent shapes are present in equal amounts. With a 2 m/s MW velocity, the time delay between top and bottom was 0.75 ms, the phase delay was 42°, and the convergence-divergence angle was 14.4°. By reducing the wave velocity to 0.5 m/s, the phase delay becomes 168°, the convergence-divergence angle becomes 57.6°, and the ($A_{g1} - A_{g2}$) waveform amplitude grows by a factor of 4, off the scale in Fig. 3.
The right four panels of Fig. 3 show post hoc calculations on 25 simulations with the MW velocity $v_m$ as the variable. Data circles show where the waveforms on the left panels were selected. Note the general decline of glottal efficiency with increasing MW velocity. The decline is steepest in the 0.5-2.0 m/s range. However, a broad range of MW velocities can produce self-sustained oscillation. No oscillation was achieved with values greater than 8 m/s, but different parameter sets (increased or decreased adduction, mass, stiffness, lung pressure) will likely produce different oscillation regions. In contrast to the SI mechanism, there is no mid-range optimum wave velocity. According to Eq. (4), the minimum $v_m$ for a half-period (180°) bottom-top delay can be related to the vocal fold thickness $T$ and the fundamental frequency $f_o$ as $v_m = f_o T$.

Discussion and conclusions
Both the MW mechanism and the SI mechanism allow an alternating "strong push-weak push" on the vocal fold surfaces for self-sustained oscillation. The two usually co-exist in normal voice production, as was shown here, but the MW mechanism is dominant for normal tissue conditions. Nature has provided a soft, gel-like mucosa for surface wave motion that allows alternating convergent-divergent glottal shapes for low phonation pressure and "ease of phonation." A MW velocity in the range of 0.5-4 m/s provides appropriate convergence-divergence angles, with the value $f_o T$ (fundamental frequency in Hz times vocal fold thickness in m) being the ideal wave velocity. Higher wave velocities limit the effectiveness of the MW mechanism.
Nature has also provided a narrow larynx canal for favorable supraglottal source-airway interaction. A larynx canal diameter around the value 0.8 cm appears optimal. The equivalent cross-sectional circular area is 0.5 cm$^2$, a typical value reported by Story et al. (1996) on normal subjects. It is not entirely clear to what degree the laryngo-pharyngeal musculature can actively alter the diameter without compromising breathing and swallowing, but the epiglottis can move in the anterior-posterior direction, and its geometry can perhaps be rounded into an omega-shape. The false folds can also be adducted, but that may produce undesirable false-fold oscillation. Much more work is needed to optimize the overall airway configuration. The current simplified model may pave the way for more sophisticated finite-element modeling with large sets of parameter variations.
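The two optimal values named in this discussion can be checked with simple arithmetic, sketched here in Python. The 159 Hz frequency and 0.6 cm thickness are the values used in the Results, and the function names are illustrative only.

```python
import math


def ideal_wave_velocity(f0_hz, thickness_m):
    """Ideal MW velocity in m/s, taken as fundamental frequency times thickness."""
    return f0_hz * thickness_m


def circular_area(diameter_cm):
    """Cross-sectional area in cm^2 of a circular tube of the given diameter."""
    return math.pi * (diameter_cm / 2.0) ** 2


v_ideal = ideal_wave_velocity(f0_hz=159.0, thickness_m=0.006)
canal_area = circular_area(diameter_cm=0.8)
print(f"ideal MW velocity ~ {v_ideal:.2f} m/s")
print(f"larynx canal area ~ {canal_area:.2f} cm^2")
```

Both results agree with the orders of magnitude stated in the text: a wave velocity near 1 m/s and a larynx canal area near 0.5 cm$^2$.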