Cubic and quadratic distortion products in vibrations of the mouse cochlear apex

When the ear is stimulated by two tones presented at frequencies f1 and f2, nonlinearity in the cochlea's vibratory response leads to the generation of distortion products (DPs), with the cubic 2f1–f2 DP commonly viewed as the most prominent. While the quadratic f2–f1 DP is also evident in numerous physiological and perceptual studies, its presence in the cochlea's mechanical response has been less well documented. Here, examination of vibratory DPs within the mouse cochlea confirmed that f2–f1 was a significant and sometimes dominant component, whether DPs were measured near their generation site, or after having propagated from more basal locations.


Introduction
In the mammalian cochlea, sound-evoked waves traveling along the basilar membrane (BM) are actively amplified by the outer hair cells (OHCs) within the organ of Corti [Fig. 1(a)]. The amplification process is highly nonlinear, resulting in phenomena like compression, suppression, and distortion in the cochlea's mechanics (Robles and Ruggero, 2001). For instance, in response to two stimulus tones at frequencies f1 and f2 (f2 > f1), the cochlea generates significant intermodulation distortion products (DPs) at frequencies such as the "cubic" difference tone 2f1–f2 and the "quadratic" difference tone f2–f1. These distortions shape the input to the inner hair cells, and thus the responses of the afferent auditory nerve, and are ultimately perceived (Goldstein, 1967; Humes, 1980; Plomp, 1965). However, the precise nature of the underlying nonlinearity and how it influences the cochlea's output remain incompletely understood.
Nonlinearity in cochlear mechanics is largely thought to be due to the saturating, sigmoidal relationship between deflection of the OHC's stereociliary bundle and the currents that flow through mechanically gated channels located near the stereocilia's tips (Avan et al., 2013) [Fig. 1(b)]. Transduction currents produce the variations in membrane potential (i.e., the receptor potential) that drive electromotile force generation by the OHCs (Brownell et al., 1985; Santos-Sacchi and Dilger, 1988). The resting position of the bundle, or its operating point (OP), is of primary interest as it determines the amplificatory gain provided by the OHC, as well as the relative magnitudes of any even- and odd-order DPs (e.g., f2–f1 and 2f1–f2, respectively) in its motile response. If the OP is near the center of the function, where the gain is highest, two-tone stimulation elicits symmetric currents that primarily contain odd-order DPs [Fig. 1(c)]. Any bias away from the center results in asymmetric currents and the presence of even-order components.
Though 2f1–f2 is the most readily perceived DP (Goldstein, 1967) and is generally the largest DP emitted to the ear canal, sizable responses at f2–f1 have been observed in auditory nerve fiber recordings (Kim et al., 1980) and intracochlear or intracellular potentials (Cheatham and Dallos, 1997; Gibian and Kim, 1982; Nuttall and Dolan, 1993). However, the presence and relative magnitude of f2–f1 in the cochlea's mechanics have been less well characterized. While f2–f1 is small or absent in BM vibrations measured from the cochlear base in guinea pig and chinchilla (Nuttall and Dolan, 1993; Rhode, 2007; Robles et al., 1997), it has been observed in vibrations of the tectorial membrane (TM) in apical, low-frequency regions (Cooper and Rhode, 1997). Recent measurements from the gerbil base using low-coherence heterodyne interferometry (Ren and He, 2020) and optical coherence tomography (OCT; Vavakou et al., 2019) have also found f2–f1 in vibrations of the OHC region (the presumed source of the DPs), although these responses appear to be smaller than 2f1–f2 (Burwood et al., 2022).
Here, OCT was used to compare f2–f1 and 2f1–f2 DPs in vibrations from the 9 kHz location in the mouse cochlear apex. DPs were characterized both locally near where they are generated, including within the OHC region, and after having propagated from more basal generation sites.

Methods
Measurements were obtained from 11 adult (4-7 week-old) CBA/CaJ mice (five female) using a custom-built, swept-source OCT system and methods largely described in Dewey et al. (2021). All procedures were approved by the University of Southern California's Institutional Animal Care and Use Committee.
Mice were anesthetized (80-100 mg/kg ketamine; 5-10 mg/kg xylazine), placed on a heating pad (38 °C), and fixed to a head-holder. An otoacoustic emission probe (ER-10X; Etymotic Research, Elk Grove, IL) was sealed over the resected ear canal to present acoustic stimuli. Stimulus levels were calibrated using the pressure measured by the probe, which was corrected for the probe's frequency-dependent sensitivity.
After surgically accessing the left middle ear space, the OCT light source was scanned across the cochlea to obtain two-dimensional cross-sectional images of the apical turn. Vibratory responses to single- and two-tone stimuli were then obtained from the OHC region (close to the Deiters' cells, DCs), BM, and/or TM, with responses sampled at 100 kHz. Stimuli were 102 ms tones (with 1 ms ramps) presented 8-32 times with a ~7 ms interstimulus interval. Single-tone responses were used to determine the measurement site's characteristic frequency (CF), which was defined as the frequency eliciting the largest BM displacement for tones presented at 30 dB sound pressure level (SPL). Measurements were only obtained from locations with a CF of 9 kHz. Various two-tone paradigms for measuring DPs are described in Sec. 3.
After the desired measurements were performed, mice were euthanized by anesthetic overdose. Certain measurements were repeated postmortem to verify the physiological origin of the DPs, which were greatly reduced or absent after death. Acoustic distortion was sometimes still detected in the ear canal at high stimulus levels, though it was at least 60-70 dB lower than the stimulus levels. Any acoustic distortion capable of eliciting a displacement as large as that observed in the in vivo measurements was considered problematic, and data collected using such stimulus conditions were not included in any analyses or plots.
Magnitudes and phases of responses at f1, f2, f2–f1, and 2f1–f2 were obtained by applying a fast Fourier transform to the steady-state portion of the average displacement waveform. Reported displacement magnitudes are root mean square values, and phases of the acoustic stimuli have been subtracted from the displacement phases. For f2–f1 and 2f1–f2 DPs, this involved subtracting φ2ec - φ1ec and 2φ1ec - φ2ec, respectively, where φ1ec and φ2ec were the phases at f1 and f2 in the ear canal. Noise floors for each response component were calculated as the mean + 3 standard deviations of the displacement magnitudes within 220-320 Hz (for f1 and f2) or 20-120 Hz (for the DPs) of the response frequency. Unless noted otherwise, only data with magnitudes exceeding the noise floor are plotted, and averages are only shown when such data were available from at least three mice. All individual data are accessible in an online repository.¹
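The spectral analysis described above can be illustrated with a minimal Python sketch. It is not the authors' code: it evaluates a single-bin discrete Fourier transform (equivalent to reading one FFT bin when the frequency falls on a bin), and it assumes that the noise band is taken on both sides of the response frequency and that phases are referenced to a cosine. The function names, the synthetic waveform, and the noise level are all hypothetical.

```python
import cmath
import math
import random

FS = 100_000.0  # sampling rate in Hz, as stated in the Methods

def dft_bin(x, freq_hz, fs=FS):
    """Complex amplitude at freq_hz, scaled so that abs() is the peak amplitude."""
    step = -2j * math.pi * freq_hz / fs
    return 2.0 * sum(v * cmath.exp(step * k) for k, v in enumerate(x)) / len(x)

def rms_and_phase(x, freq_hz, fs=FS):
    """RMS magnitude and cosine-referenced phase (in cycles) at freq_hz."""
    c = dft_bin(x, freq_hz, fs)
    return abs(c) / math.sqrt(2.0), cmath.phase(c) / (2.0 * math.pi)

def dp_reference_phases(phi1_ec, phi2_ec):
    """Ear-canal phase terms subtracted from the f2-f1 and 2f1-f2 response phases."""
    return {"f2-f1": phi2_ec - phi1_ec, "2f1-f2": 2.0 * phi1_ec - phi2_ec}

def noise_floor(x, freq_hz, lo_hz, hi_hz, fs=FS):
    """Mean + 3 SD of RMS magnitudes in bins lo_hz..hi_hz away from freq_hz.

    Assumption: bins on both sides of the response frequency are used.
    """
    df = fs / len(x)  # bin spacing in Hz
    offsets = [i * df for i in range(1, int(hi_hz / df) + 1) if i * df >= lo_hz]
    mags = [rms_and_phase(x, freq_hz + s * d, fs)[0] for d in offsets for s in (-1, 1)]
    mean = sum(mags) / len(mags)
    sd = math.sqrt(sum((m - mean) ** 2 for m in mags) / len(mags))
    return mean + 3.0 * sd

# Synthetic check (hypothetical values): a 9 kHz response plus a small 3.27 kHz
# component with a phase of 0.125 cycles, buried in low-level Gaussian noise.
random.seed(0)
n = 10_000  # 100 ms of samples, giving 10 Hz bin spacing
wave = [math.cos(2 * math.pi * 9000 * k / FS)
        + 0.1 * math.cos(2 * math.pi * 3270 * k / FS + math.pi / 4)
        + random.gauss(0.0, 0.001)
        for k in range(n)]
rms, phase = rms_and_phase(wave, 3270.0)
floor = noise_floor(wave, 3270.0, 20.0, 120.0)
print(f"rms = {rms:.4f}, phase = {phase:.3f} cycles, noise floor = {floor:.6f}")
```

In this sketch a component would be retained only if its RMS magnitude exceeds the returned noise floor, mirroring the inclusion criterion stated in the Methods.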

Results
OCT was used to image the mouse cochlear apex [Fig. 2(a)] and measure vibratory responses to single- and two-tone stimuli from the OHC region, BM, and TM. Responses to single tones were tuned to a CF of 9 kHz and exhibited increasing phase lags with frequency that are indicative of traveling wave propagation [Fig. 2(b) and 2(c)]. As shown previously (Dewey et al., 2021), the OHC region was more responsive to low frequencies compared to the BM and TM, which were more sharply tuned. The low-pass nature of the OHC region motion likely reflects the more direct influence of electromotility, which is thought to inherit a low-pass characteristic due to filtering of the receptor potential by the OHC's electrical properties (Santos-Sacchi, 1989; Vavakou et al., 2019).
After determining the site's CF, OHC region responses to two-tone stimuli were obtained with f2 fixed at the CF and f1 varied to achieve f2/f1 ratios of ~1.07-1.67 in 0.1 steps. As shown in Fig. 2(d), OHC region displacement spectra revealed numerous DPs, the most prominent typically being f2–f1, followed by 2f1–f2. The presence and relative magnitude of f2–f1 were therefore consistent with the output of a Boltzmann function with an OP positioned away from the function's center [e.g., Fig. 1(c)].
Fig. 1. (a) [...] (IHC, inner hair cell; RM, Reissner's membrane). (b) First-order Boltzmann function used to approximate the nonlinear relationship between OHC stereociliary bundle displacement and transduction current. For a given displacement x, the output current I (normalized to its maximum) is given by I = 1/{1 + exp[a1(x1 - x)]}, where a1 determines the slope (here, a1 = 0.28 nm⁻¹) and x1 sets the OP. Waveforms for a two-tone input and the output when the OP is at the center of the function (x1 = 0) are shown. (c) Spectra of the function's two-tone input and its output when the OP is centered (i), resulting in a symmetric output and only odd-order DPs (e.g., 2f1–f2 and 2f2–f1), or uncentered (ii; x1 = 2.6 nm), resulting in an asymmetric output and additional even-order DPs (e.g., f2–f1).
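The link between the OP and even-order distortion can be checked numerically. The sketch below assumes a normalized first-order Boltzmann function of the form I = 1/{1 + exp[a1(x1 - x)]}, which is consistent with the stated slope, the OP values, and the ~33% resting activation mentioned in the Discussion, but whose exact form is an assumption here. The 2 nm input amplitude and the 5.73 kHz f1 (chosen to fall on a 10 Hz analysis bin at f2/f1 ≈ 1.57) are hypothetical values for illustration only.

```python
import cmath
import math

A1 = 0.28        # slope parameter in 1/nm (from the Fig. 1 caption)
FS = 100_000.0   # sampling rate in Hz
N = 10_000       # 100 ms of samples, giving 10 Hz frequency resolution

def boltzmann(x_nm, x1_nm):
    """Assumed first-order Boltzmann: fraction of maximum current at displacement x."""
    return 1.0 / (1.0 + math.exp(A1 * (x1_nm - x_nm)))

def amplitude(y, freq_hz):
    """Peak amplitude of the component of y at freq_hz (single-bin DFT)."""
    step = -2j * math.pi * freq_hz / FS
    return abs(2.0 * sum(v * cmath.exp(step * k) for k, v in enumerate(y)) / N)

def two_tone_dps(x1_nm, f1=5730.0, f2=9000.0, amp_nm=2.0):
    """DP amplitudes in the Boltzmann output for an equal-amplitude two-tone input."""
    y = [boltzmann(amp_nm * (math.cos(2 * math.pi * f1 * k / FS)
                             + math.cos(2 * math.pi * f2 * k / FS)), x1_nm)
         for k in range(N)]
    return {"f2-f1": amplitude(y, f2 - f1), "2f1-f2": amplitude(y, 2 * f1 - f2)}

centered = two_tone_dps(x1_nm=0.0)    # OP at the center of the function
uncentered = two_tone_dps(x1_nm=2.6)  # OP biased away from the center

print("resting current fraction for x1 = 2.6 nm:", round(boltzmann(0.0, 2.6), 3))
print("centered OP:  ", centered)
print("uncentered OP:", uncentered)
```

With a centered OP the function minus its midpoint is odd-symmetric, so the f2–f1 component vanishes to within numerical precision while 2f1–f2 remains; shifting the OP restores f2–f1.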
To better characterize the underlying nonlinearity, measurements were made with L2 fixed at 60 dB SPL and L1 varied from 20 to 85 dB SPL [Fig. 2(e) and 2(f)]. For both small and large f2/f1 ratios [Figs. 2(e) and 2(f), respectively], OHC region displacements at f2–f1 and 2f1–f2 exhibited nonmonotonic growth patterns that were tied to the magnitudes of the responses at f1 and f2. For L1 values where the response at f1 remained smaller than the response to f2, the f2–f1 DP was larger than 2f1–f2 but grew less steeply with L1 (at a rate of ~1 dB/dB, compared to ~2 dB/dB for 2f1–f2). The different growth rates were consistent with the output of a power-law nonlinearity (e.g., Humes, 1980). For small ratios, the f2–f1 DP could even be as large as the f1 response, despite f2–f1 being far below the CF (e.g., f2–f1 = 0.59 kHz when f2/f1 = 1.07). As L1 was increased so that the response at f1 approached and then exceeded that at f2 (which became suppressed by the f1 response), both f2–f1 and 2f1–f2 DPs peaked and then rapidly declined. Because the 2f1–f2 DP started to decline at slightly higher L1 values, it typically became larger than the response at f2–f1 as L1 was increased further. DP phases were relatively constant for L1 < 60 dB SPL but could shift by up to 0.25 cycles at higher levels. The magnitude and direction of these shifts were predictable from changes in the phases of the f1 and f2 responses. Specifically, they were consistent with the phases of f2–f1 and 2f1–f2 being φ2 - φ1 and 2φ1 - φ2 (plus some constant), where φ1 and φ2 are the f1 and f2 response phases.
Fig. 2. [...], with responses at f1, f2, f2–f1, and 2f1–f2 indicated. (e) and (f) Magnitudes and phases of representative OHC region displacements as a function of L1 (with L2 = 60 dB SPL) for two f2/f1 ratios and f2 = 9 kHz. Phases were referenced to the median phase of the f2 response for L1 < 40 dB SPL. Due to the higher measurement noise at low frequencies, lower-frequency DPs only became detectable when they were large (e.g., for f2/f1 = 1.07, f2–f1 = 0.59 kHz while 2f1–f2 = 7.82 kHz; in contrast, for f2/f1 = 1.57, f2–f1 = 3.27 kHz and 2f1–f2 = 2.46 kHz). (g) Modeled responses for f2/f1 = 1.57 using the Boltzmann function shown in Fig. 1(b) with uncentered OP (see main text).
ARTICLE asa.scitation.org/journal/jel
While these magnitude and phase patterns may appear complex, they were replicated by the output of the Boltzmann function shown in Fig. 1(b) when the OP was uncentered (x1 = 2.6 nm). Figure 2(g) shows the Boltzmann's output for f2/f1 = 1.57 when using BM displacements at f1 and f2 as the function's inputs (measured using the same stimulus paradigm and averaged from five mice). The Boltzmann's output was low-pass filtered (first-order, corner frequency = 1.75 kHz) in order to approximate filtering of the OHC receptor potential. Such filtering was previously found necessary to account for the relative magnitudes of harmonic and tonic distortions in single-tone responses (Dewey et al., 2021), and can explain the large f2–f1 DP magnitude at small f2/f1 ratios, where f2–f1 falls below the corner frequency. While DPs in the Boltzmann's output also exhibited level-dependent phase shifts, the absolute phases of the modeled and measured DPs differed somewhat. This is not surprising, as vibratory phases change rapidly within the OHC region (Dewey et al., 2021) and are undoubtedly influenced by mechanical properties not included in the Boltzmann model.
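To make the effect of the low-pass filter concrete, the following sketch computes the two DP frequencies for f2 = 9 kHz and evaluates the attenuation that a first-order low-pass filter with a 1.75 kHz corner would apply at each. The standard first-order magnitude response 1/sqrt(1 + (f/fc)²) is assumed; the function names are hypothetical and the sketch does not reproduce the authors' model.

```python
import math

CORNER_KHZ = 1.75  # first-order low-pass corner frequency given in the text

def dp_frequencies(f2_khz, ratio):
    """Return (f2 - f1, 2*f1 - f2) in kHz for a given f2 and f2/f1 ratio."""
    f1_khz = f2_khz / ratio
    return f2_khz - f1_khz, 2.0 * f1_khz - f2_khz

def lowpass_gain_db(freq_khz, corner_khz=CORNER_KHZ):
    """Magnitude response of a first-order low-pass filter, in dB."""
    return -10.0 * math.log10(1.0 + (freq_khz / corner_khz) ** 2)

for ratio in (1.07, 1.57):
    quad, cubic = dp_frequencies(9.0, ratio)
    print(f"f2/f1 = {ratio}: f2-f1 = {quad:.2f} kHz ({lowpass_gain_db(quad):.1f} dB), "
          f"2f1-f2 = {cubic:.2f} kHz ({lowpass_gain_db(cubic):.1f} dB)")
```

At f2/f1 = 1.07 the f2–f1 component lies well below the corner frequency and is barely attenuated, whereas 2f1–f2 lies far above it, which is in line with the explanation given for the large f2–f1 magnitude at small ratios.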
The level-dependent growth of OHC region DPs was further explored using equal-level stimuli, as shown for small and large f2/f1 ratios in Fig. 3(a) and 3(b). With L1 = L2, DPs generally grew less steeply with increasing level compared to when L2 was fixed and L1 was varied. When averaged across all f2/f1 ratios and mice, and evaluated for L1 = 40-55 dB SPL, growth rates for f2–f1 were an average (± standard error, SE) of 0.76 ± 0.06 (n = 6) and 0.92 ± 0.02 (n = 7) dB/dB for equal-level and fixed-L2 paradigms, respectively. For 2f1–f2, these rates were 0.96 ± 0.05 and 1.89 ± 0.05 dB/dB. The lower growth rates for the equal-level paradigm can be attributed to the compressive growth of responses at both f1 and f2 when L1 and L2 are covaried. The behavior of the DPs was otherwise similar between paradigms, with the 2f1–f2 DP growing more steeply than f2–f1 and becoming larger only when the f1 response exceeded the f2 response.
Equal-level stimuli yielded measurable DPs over a wide range of stimulus levels and were therefore also used to examine DPs in vibrations of the BM and TM [Figs. 3(c)-3(f)]. DPs were measurable from both structures though were typically much smaller than the OHC region DPs (by ~10-20 dB and ~5-10 dB for the BM and TM, respectively). The relative magnitudes of f2–f1 and 2f1–f2 DPs in BM and TM vibrations also depended strongly on the f2/f1 ratio, with f2–f1 being particularly reduced at small ratios [Figs. 3(c)-3(e)]. At these ratios, f2–f1 becomes very low in frequency while 2f1–f2 approaches the CF. The relative DP magnitudes therefore appear to be shaped by the frequency responses of the BM and TM, which are both sharply tuned to the CF. For f2/f1 ratios > 1.5 [e.g., Figs. 3(d) and 3(f)], f2–f1 is higher in frequency than 2f1–f2 and therefore does not suffer from this relative attenuation, explaining why it remained the dominant DP on the BM and TM.
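The ratio at which the two DP frequencies cross follows directly from their definitions. A short derivation, writing r for f2/f1 (a symbol introduced here only for convenience):

```latex
\begin{aligned}
f_2 - f_1 &> 2f_1 - f_2 \\
\iff\quad 2f_2 &> 3f_1 \\
\iff\quad r = \frac{f_2}{f_1} &> \frac{3}{2}.
\end{aligned}
```

For example, with f2 = 9 kHz and r = 1.57, f2–f1 = 3.27 kHz exceeds 2f1–f2 = 2.46 kHz, whereas with r = 1.07 the order is reversed (0.59 kHz vs 7.82 kHz).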
Both DPs were also measurable at the 9 kHz location after having been generated at more basal sites and then propagated apically. Figure 4(a) shows TM responses obtained with f2 varied from ~2-40 kHz, f2/f1 = 1.57, and L1 = L2 = 70 dB SPL, plotted vs the f2 frequency. DP magnitudes peaked when f2 was near the CF, where there was maximal interaction between the responses at f1 and f2, as well as when the DP frequency fell near the CF (see arrows). The latter occurred when there was little local interaction between the responses at f1 and f2, which peaked at more basal sites. The measured DPs therefore presumably originated at these sites and propagated to the 9 kHz location. When plotted vs their own frequency, DP magnitudes and phases for frequencies > 6 kHz resembled those of responses to single tones presented at 20 dB SPL [Figs. 4(b) and 4(c)]. Phases of lower-frequency DPs were more complex, possibly indicating the presence of both locally generated and apical- or basal-propagating components (Dong and Olson, 2008).
For large f2/f1 ratios, propagated f2–f1 DPs were often greater in magnitude than 2f1–f2 DPs, particularly for lower-level stimuli. With f2/f1 = 1.57 and stimuli presented at 60 dB SPL, propagated f2–f1 and 2f1–f2 DP magnitudes on the TM were on average (± SE) 0.36 ± 0.06 nm and 0.18 ± 0.05 nm, respectively (n = 5), equivalent to displacements elicited by a 9 kHz tone at ~11 and 6 dB SPL. However, these comparisons are complicated by the fact that higher f2 frequencies were required to generate 2f1–f2 at 9 kHz. Cochlear sensitivity, nonlinearity, and responsiveness to force generation at 9 kHz may all vary with location, potentially contributing to the different DP magnitudes.
Fig. 3. (a-f) Average (n = 6) magnitudes of OHC region (a, b), BM (c, d), and TM (e, f) displacements at f1, f2, f2–f1, and 2f1–f2 as a function of stimulus level, with f2 = 9 kHz and f2/f1 = 1.07 (a, c, e) or 1.57 (b, d, f). Error bars indicate 1 SE and are often smaller than the symbols.
In an alternative paradigm, f2 was varied from ~10-32 kHz and f1 set so that either f2–f1 or 2f1–f2 was always equal to 9 kHz. Propagated f2–f1 and 2f1–f2 DPs were therefore presumed to originate from a similar generation site as f2 was varied. The two DPs were characterized in separate measurements, as they required different f1 values. Figures 4(d)-4(e) show propagated DP magnitudes measured on the BM for two f2 frequencies, with L2 = 60 dB SPL and L1 varied. Propagated DPs exhibited characteristics observed in the locally generated DPs, with f2–f1 being detectable at lower stimulus levels and growing less steeply compared to 2f1–f2, which became dominant at high L1 values. However, the growth of the propagated DPs tended to be less steep than that observed for locally generated DPs. For L1 = 40-55 dB SPL, average (± SE) growth rates for f2–f1 and 2f1–f2 were 0.59 ± 0.04 and 1.40 ± 0.02 dB/dB across all frequencies and mice (n = 5). Response phases were usually stable for L1 < 60 dB SPL, with phase shifts occurring at higher levels, sometimes accompanied by amplitude notches [Fig. 4(e)]. This behavior could be a feature of the nonlinearity at the more basal sites (Lukashkin et al., 2002), or else could arise from interference between DPs originating from different locations. Such interference may also explain the shallower growth rates of the propagated DPs.
Figure 4(f) shows the average maximum propagated DP amplitudes observed on the BM as a function of f2, highlighting that, while f2–f1 emerged at lower stimulus levels, 2f1–f2 always became dominant at higher levels for this measurement paradigm. Maximum f2–f1 and 2f1–f2 DP magnitudes were equivalent to responses elicited by 9 kHz tones presented at ~24 and 43 dB SPL, respectively. Though the dominance of the 2f1–f2 DP at high stimulus levels could be partly due to its relative growth pattern at the generation site (e.g., Fig. 2), lower f2–f1 magnitudes may also be attributed to the much wider f2/f1 ratios required to elicit this DP (ranging from ~10 to 1.4 with increasing f2, compared to a range of 1.07 to 1.6 for 2f1–f2), and greater suppression by the f1 tone, which more strongly stimulated the measurement site during recordings of f2–f1. The relative levels of the propagated DPs therefore strongly depend on the stimulus paradigm and are likely influenced by factors other than the nonlinearity at the generation site.
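The f1 values and f2/f1 ratios implied by this paradigm follow from simple arithmetic, sketched below. The function name and the two example f2 values (10 and 32 kHz, taken as the approximate ends of the stated f2 range) are illustrative assumptions; the actual f2 values used in the measurements are not given in the text.

```python
def f1_for_target_dp(f2_khz, dp_type, target_khz=9.0):
    """Return the f1 (kHz) that places the chosen DP at target_khz."""
    if dp_type == "f2-f1":
        return f2_khz - target_khz           # f2 - f1 = target
    if dp_type == "2f1-f2":
        return (f2_khz + target_khz) / 2.0   # 2*f1 - f2 = target
    raise ValueError(f"unknown DP type: {dp_type}")

for f2 in (10.0, 32.0):
    for dp_type in ("f2-f1", "2f1-f2"):
        f1 = f1_for_target_dp(f2, dp_type)
        print(f"f2 = {f2:g} kHz, {dp_type} at 9 kHz: "
              f"f1 = {f1:g} kHz, f2/f1 = {f2 / f1:.2f}")
```

At these two endpoints the f2/f1 ratio for f2–f1 runs from 10 down to about 1.39, while for 2f1–f2 it runs from about 1.05 up to about 1.56, in rough agreement with the ranges quoted above given that the f2 limits are only approximate.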
Fig. 4. [...] (d, e) BM response magnitudes and phases at f2–f1 and 2f1–f2 with f2 and f1 set so that the DP frequencies were equal to 9 kHz, with L2 = 60 dB SPL and L1 varied. Data from one mouse are shown for two f2 frequencies. (f) Average (n = 5) maximum DP magnitudes obtained with the paradigm used in (d, e) as a function of f2.
To assess the possible perceptual relevance of the propagated DPs, one must compare their magnitudes to the displacements elicited at the threshold of hearing. Behavioral hearing thresholds near 9 kHz in CBA/CaJ mice are ~10-16 dB SPL (May et al., 2006; Radziwon et al., 2009), which would correspond to BM displacements of ~0.14-0.3 nm (−17 to −10 dB re 1 nm) and TM displacements of ~0.3-0.7 nm (−10 to −3 dB re 1 nm) at threshold. Thus, while differences in the presentation and calibration of acoustic stimuli in behavioral studies may complicate such comparisons, both f2–f1 and 2f1–f2 DPs appear large enough to at least be detectable, if not perceptually salient, over a range of stimulus levels.
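The decibel values quoted for the threshold displacements can be reproduced with the usual amplitude conversion, 20·log10(d / 1 nm). The helper name below is hypothetical; the displacement values are those given in the text.

```python
import math

def db_re_1nm(displacement_nm):
    """Convert a displacement in nm to dB re 1 nm (amplitude quantity)."""
    return 20.0 * math.log10(displacement_nm)

for label, lo_nm, hi_nm in (("BM", 0.14, 0.3), ("TM", 0.3, 0.7)):
    print(f"{label}: {lo_nm} to {hi_nm} nm = "
          f"{db_re_1nm(lo_nm):.1f} to {db_re_1nm(hi_nm):.1f} dB re 1 nm")
```

The same conversion can be applied to any of the reported DP displacements when comparing them with these threshold ranges.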

Discussion
The present work demonstrates that f2–f1 is a significant DP in vibratory responses of the mouse cochlea. The data confirm recent measurements of f2–f1 DPs in motions of the OHC region in the gerbil base (Ren and He, 2020; Vavakou et al., 2019), but further show that these DPs are locally transmitted to both the BM and TM, and are measurable on these structures as they propagate apically. Characteristics of local and propagated DPs were highly similar to those observed in electrical recordings from the cochlear fluids, inner hair cells, and auditory nerve in other species (Cheatham and Dallos, 1997; Gibian and Kim, 1982; Kim et al., 1980). Previous findings of the f2–f1 DP being small or absent in BM vibrations in some of these species could be due to the choice of stimulus parameters, measurement sensitivity, or the relative lability of f2–f1 (Cooper and Rhode, 1997).
The presence of even-order DPs indicates that an asymmetric output is produced by the underlying nonlinearity, which, for OHCs, is commonly attributed to the mechanotransducer function. This suggests that the stereociliary bundle's OP is not at the function's center, where the transducer gain is maximal, as is often claimed (Jeng et al., 2021; Russell et al., 1986). However, the bias need not be extreme, as OHC region DPs could be approximated using a Boltzmann function with OP set so that ~33% of the maximum current is activated at rest. Of course, it is possible that the OHCs' mechanical nonlinearity has sources other than mechanotransduction (Santos-Sacchi, 1989). Anesthesia may also affect OHC function such that the relative levels of even- and odd-order DPs are not the same as in the awake state (Schlenther et al., 2014). Nevertheless, the fact that a simple Boltzmann model reproduced behaviors of both low-frequency intermodulation DPs and high-frequency harmonics (Dewey et al., 2021) suggests that it is a reasonable starting point for understanding OHC nonlinearity in mice.
Though the 2f1–f2 DP became quite large at high stimulus levels, the dominance of f2–f1 at low stimulus levels indicates that it may also impact perception. Indeed, its presence has been suggested to facilitate envelope encoding (Nuttall et al., 2018) and detection of high-frequency vocalizations in mice (Portfors et al., 2009). Interestingly, however, human psychophysical studies have typically found f2–f1 to become audible only at high stimulus levels (Goldstein, 1967; Humes, 1980). Though the underlying mechanical nonlinearity may be species dependent, there could be multiple sources of nonlinearity (e.g., inner hair cell) and other central factors that contribute to differences between mechanical and perceived DPs. The choice of stimulus paradigm also dramatically affects DP magnitudes, potentially complicating comparisons between studies. Whether DPs play a significant perceptual role, or if they are simply an inconsequential by-product of cochlear nonlinearity, requires further examination in humans and other species.