Effects of hearing-aid dynamic range compression on spatial perception in a reverberant environment

This study investigated the effects of fast-acting hearing-aid compression on normal-hearing and hearing-impaired listeners' spatial perception in a reverberant environment. Three compression schemes (independent compression at each ear, linked compression between the two ears, and "spatially ideal" compression operating solely on the dry source signal) were considered using virtualized speech and noise bursts. Listeners indicated the location and extent of their perceived sound images on the horizontal plane. Linear processing was considered as the reference condition. The results showed that both independent and linked compression resulted in more diffuse and broader sound images as well as internalization and image splits, whereby more image splits were reported for the noise bursts than for speech. Only the spatially ideal compression provided the listeners with a spatial percept similar to that obtained with linear processing. The same general pattern was observed for both listener groups. An analysis of the interaural coherence and direct-to-reverberant ratio suggested that the spatial distortions associated with independent and linked compression resulted from enhanced reverberant energy. Thus, modifications of the relation between the direct and the reverberant sound should be avoided in amplification strategies that attempt to preserve the natural sound scene while restoring loudness cues.


I. INTRODUCTION
Loudness recruitment is a typical consequence of sensorineural hearing loss (Fowler, 1936; Moore, 2004; Steinberg and Gardner, 1937). To compensate for recruitment and thereby restore the normal dynamic range of audibility, multi-band fast-acting dynamic range compression (DRC) algorithms for hearing aids have been developed (Allen, 1996; Villchur, 1973). DRC algorithms amplify soft sounds and provide progressively less amplification to sounds whose level exceeds a defined compression threshold (CT). In anechoic acoustic conditions, it has been shown that DRC systems that operate independently in the left and the right ear can lead to a distorted spatial perception of sounds, as reflected by an impaired lateralization performance, an increased sensation of diffuseness, as well as the perception of split sound images (Wiggins and Seeber, 2011, 2012). However, other studies conducted in anechoic acoustic conditions found only a minor effect of independent compression on sound localization (Keidser et al., 2006; Musa-Shufani et al., 2006). In the case of independent compression of the two ear signals, less amplification is typically provided to the ear that is closer to the sound source than to the ear that is farther away from the sound source, such that the intrinsic interaural level differences (ILDs) given by the acoustic shadow of the listener's head are reduced. Wiggins and Seeber (2011, 2012) ascribed the detrimental effects of independent compression on spatial perception to the mismatch between the reduced intrinsic ILDs and the unprocessed interaural time differences (ITDs) coming from a given sound source (see also Brown et al., 2016).
With the aim of preserving the naturally occurring ILDs, state-of-the-art bilaterally fitted hearing aids exchange the measured sound intensity information between the two devices via a wireless link. The ear signal with the higher sound intensity in a given acoustic scenario is typically chosen as the one providing the input to the level-dependent gain function in both (left-ear and right-ear) DRC systems (Korhonen et al., 2015). For hearing-impaired listeners with a symmetrical hearing loss, this shared processing, often referred to as "synchronization" or "link," implies that the amplification provided by the two DRC systems is the same, such that the intrinsic ILDs are preserved. For hearing-impaired listeners with an asymmetrical hearing loss with different prescribed DRC gain settings [i.e., gain levels in the linear region, CTs, and compression ratios (CRs)] for the left and right ear, the synchronization of the provided input level to the gain functions does not necessarily lead to a preservation of the intrinsic ILDs.
It has been demonstrated that linked fast-acting DRC systems, as compared to independent DRC systems, can improve speech intelligibility in the presence of a spatially separated stationary noise interferer for normal-hearing listeners in anechoic conditions (Wiggins and Seeber, 2013). In reverberant conditions, linked fast-acting DRC systems have been shown to improve the ability of normal-hearing listeners to attend to a desired target in an auditory scene with spatially separated maskers as compared to independent compression (Schwartz and Shinn-Cunningham, 2013). However, the effects of both independent and linked compression on more fundamental measures of spatial perception (such as distance, localization, and source width) in reverberant conditions have received little attention. In particular, the effects of compression on the direct part of a sound as well as its early reflections and late reverberation in a given environment have not yet been examined. Catic et al. (2013) demonstrated that modifications of the interaural cues provided by the reverberation inside an enclosed space degrade the listeners' ability to perceive natural sounds as "externalized," i.e., as compact and properly localized both in direction and distance (Hartmann and Wittenberg, 1996). In a given reverberant environment, correct localization of an acoustic source is, among other factors, based on the interaural coherence (IC) between the listeners' ear signals (Catic et al., 2015), which is determined by the interaction between the direct sound and the reverberant part of the sound.
The hypothesis of the present study was that both independent and linked compression schemes affect the interaural cues provided by the reverberation, e.g., the IC, and thus impair the spatial perception of the sound scene in a reverberant environment. In contrast, a compression scheme where the DRC operates on the "dry" source before its interaction with the reverberant environment, i.e., a "spatially ideal" DRC, should preserve the relation between the direct sound and the interaural cues provided by the reverberation and thus lead to robust spatial perception. To test this hypothesis, the effects of (fast-acting) independent, linked, and spatially ideal compression schemes on the spatial auditory perception in a reverberant environment were examined in a group of normal-hearing listeners and a group of sensorineural hearing-impaired listeners with a symmetrical hearing loss. Linear processing, i.e., level-independent amplification, was considered as a reference condition. The sounds in the different conditions were virtualized over headphones in a standard listening room using individual binaural room impulse responses (BRIRs). Listeners indicated their spatial perception graphically to capture all relevant spatial attributes with respect to distance, azimuth localization, source width, and the occurrence of split images. The deviations of the listeners' ratings in the different compression conditions from those in the reference condition were considered to reflect the amount of spatial distortion. Transient sounds as well as speech were used as test stimuli to investigate the effects of the compression schemes on both the direct sound and the reverberant part of the sound. To quantify the distortion of the spatial cues in the different conditions, the IC and the direct-to-reverberant energy ratio (DRR) of the ear signals were considered as objective metrics.

A. Listeners
Two groups of listeners participated in the present study. The normal-hearing group consisted of 12 listeners (8 males and 4 females) aged between 25 and 58 yr. All had audiometric pure-tone thresholds below 20 dB hearing level at frequencies between 125 Hz and 8 kHz. The hearing-impaired group consisted of 14 listeners (11 males and 3 females), aged between 62 and 80 yr. All had symmetrical sloping mild-to-moderately-severe high-frequency sensorineural hearing loss, with a maximum difference of 15 dB between their left and right ear. Figure 1 shows the average pure-tone thresholds for the hearing-impaired listeners. Only 3 of the 14 hearing-impaired listeners used hearing aids on a regular, daily basis. Two of the hearing-impaired listeners were excluded from further analysis since they perceived sounds that were presented diotically via headphones to be externalized, i.e., the sound was perceived as originating from outside of the head. Diotic signals are known to be internalized, i.e., perceived to be inside the head, by normal-hearing listeners (e.g., Boyd et al., 2012; Catic et al., 2013). It was considered important in the present study, in terms of the reliability of the spatial perception data, that the recruited listeners could consistently differentiate between internalized and externalized sound images. All listeners signed an informed consent document and were reimbursed for their efforts.

B. Experimental setup and procedure
The experiments took place in a reverberant listening room designed in accordance with the IEC 268-13 (1985) standard. The room had a reverberation time T30 of approximately 500 ms, corresponding to a typical living room environment. Figure 2 shows the top view of the listening room and the experimental setup as placed in the room. The dimensions of the room were 752 cm × 474 cm × 276 cm (L × W × H). Twelve Dynaudio BM6 loudspeakers were placed in a circular arrangement with a radius of 150 cm, distributed with equal spacing of 30 deg on the circle. A chair with a headrest and a Dell s2240t touch screen (Round Rock, TX) in front of it were placed in the center of the loudspeaker ring. The listeners were seated on the chair facing the loudspeaker placed at the azimuth angle of 0 deg. The chair was positioned at a distance of 400 cm from the wall on the left and 230 cm from the wall behind.
The graphical representation of the room and setup as illustrated in Fig. 2 was also shown on the touch screen, without the information regarding the room dimensions. Besides the loudspeakers, a Fireface UCX (RME Audio, Haimhausen, Germany) soundcard operating at 48 000 Hz, two DPA (Lillerød, Denmark) high sensitivity microphones, and a pair of HD800 Sennheiser (Wedemark, Germany) headphones were used to record the individual BRIRs for the listeners (see Sec. II C). The BRIRs were measured from the loudspeakers placed at the azimuth angles of 0, 30, 150, 180, 240, and 300 deg. The listeners were instructed to support the back of their head on the headrest while remaining still and to fixate on a marking located straight ahead (0°) both during the BRIR measurements and during the sound presentations. On the touch screen, the listeners were asked to place circles on the graphical representation as an indication of the perceived position and width of the sound image in the horizontal plane. By placing a finger on the touch screen, a small circle appeared on the screen with its center at the position of the finger. When moving the finger while still touching the screen, the circumference of the circle would follow the finger. When the desired size of the circle was reached, the finger was released from the screen. By touching the center of the circle and moving the finger while touching the screen, the position of the circle would follow along. By touching the circumference of the circle and moving the finger closer to or farther away from the center of the circle while touching the screen, the circle would decrease or increase in size, respectively. A double tap on the center of the circle would delete the circle. If the listeners perceived a split of any parts of the sound image, they were asked to place multiple circles reflecting the positions and widths of the split images. The listeners were instructed to ignore other perceptual attributes, such as sound coloration and loudness.
Each stimulus was presented three times from each of the six loudspeaker positions. This was done for each of the test conditions: linear processing, independent compression, linked compression, and spatially ideal compression. No response feedback was provided to the listeners. The test conditions, stimuli, and loudspeaker positions were presented in random order within each run.

C. Spatialization
Individual BRIRs were measured to simulate the different conditions virtually over headphones. Individual BRIRs were used since it has been shown that the use of individual head-related transfer functions (HRTFs), the Fourier transformed head-related impulse responses, improves sound localization performance compared to non-individual HRTFs (e.g., Majdak et al., 2014), as a result of substantial cross-frequency differences between the individual listeners' HRTFs (Middlebrooks, 1999). Individual BRIRs were measured from the loudspeakers placed at the azimuth angles of 0, 30, 150, 180, 240, and 300 deg. The BRIR measurements were performed as described in Hassager et al. (2016). The microphones were placed at the ear-canal entrances and were securely attached with strips of medical tape. A maximum-length sequence (MLS) of order 13, with 32 repetitions played individually from each of the loudspeakers, was used to obtain the impulse response, h_brir, representing the BRIR for the given loudspeaker. The headphones were placed on the listeners and corresponding headphone impulse responses, h_hpir, were obtained by playing the same MLS from the headphones. To compensate for the headphone coloration, the inverse impulse response, h_hpir^inv, was calculated in the time domain using the Moore-Penrose pseudoinverse. By convolving the room impulse responses, h_brir, with the inverse headphone impulse responses, h_hpir^inv, virtualization filters with the impulse responses, h_virt, were created. Stimuli convolved with h_virt and presented over the headphones produced the same auditory sensation in the ear-canal entrance as the stimuli presented by the loudspeaker from which the filter, h_brir, had been recorded. Hence, a compressor operating on an acoustic signal convolved with h_brir behaves as if it were implemented in a completely-in-canal hearing aid.
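The headphone equalization step can be illustrated with a minimal sketch. It assumes a least-squares inverse obtained from the pseudoinverse of a convolution matrix, with a modeling delay; the filter length, the delay, and the toy impulse responses are hypothetical and are not taken from the study, and the actual procedure may have differed in its details.

```python
import numpy as np

def convolution_matrix(h, n_cols):
    """Matrix C with C @ g == np.convolve(h, g) for any g of length n_cols."""
    h = np.asarray(h, dtype=float)
    C = np.zeros((len(h) + n_cols - 1, n_cols))
    for j in range(n_cols):
        C[j:j + len(h), j] = h
    return C

def inverse_filter(h_hpir, n_taps, delay):
    """Least-squares inverse: h_hpir convolved with the result approximates
    a unit impulse at sample index `delay` (assumed design, see lead-in)."""
    C = convolution_matrix(h_hpir, n_taps)
    target = np.zeros(C.shape[0])
    target[delay] = 1.0
    return np.linalg.pinv(C) @ target

# Toy impulse responses (hypothetical values, not measured data).
h_hpir = np.array([1.0, 0.5, 0.25])           # headphone response
h_brir = np.array([0.0, 1.0, 0.0, 0.3, 0.1])  # room response at one ear

h_inv_hpir = inverse_filter(h_hpir, n_taps=32, delay=4)
h_virt = np.convolve(h_brir, h_inv_hpir)      # virtualization filter

# Playing h_virt through the headphone should reproduce h_brir, delayed by 4 samples.
reproduced = np.convolve(h_virt, h_hpir)
```

In this toy case the headphone response is minimum phase, so a short inverse is nearly exact; measured headphone responses generally require longer filters and a larger delay.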
To validate the BRIRs, the stimuli were played in random order first from the loudspeakers and then via the headphones filtered by the virtualization filters h_virt. In this way, it could be tested whether the same percept was obtained when using loudspeakers or headphones. By visual inspection, the graphical responses obtained with the headphone presentations were compared to the graphical responses obtained with the corresponding loudspeaker presentations. Apart from several front-back confusions (representing cone-of-confusion errors) in some of the listeners in the case of the headphone presentations, the graphical responses confirmed that all listeners had a very similar spatial perception in the two conditions. Generally, the response variability was found to be higher in the validation than in the actual experiment, especially for the elderly hearing-impaired listeners, which most likely was caused by the validation also serving as training in evaluating the auditory perception on the graphical user interface.

D. Experimental conditions
Two types of stimuli were considered to investigate the effect of the different compression schemes on spatial perception: a 1.6-s long clean speech sentence from the Danish hearing in noise test corpus (Danish HINT; Nielsen and Dau, 2011), and a 4-s sequence of ten pairs of noise bursts (transients), whereby each of the transients had a duration of 50 ms. Four conditions were tested: independent compression, linked compression, spatially ideal compression, as well as linear processing, which served as a reference. The technical details of the DRC system are described in Sec. II E. Figure 3 shows the block diagrams of the different conditions illustrating how the DRC systems were combined with the binaural impulse response that is represented by its left part, h_brir,l, and its right part, h_brir,r. In the independent compression scheme (top), the input signal, s_in, was first convolved with h_brir,l and h_brir,r and then passed through two DRC systems operating independently in each ear. In the linked compression scheme (middle), after convolving with h_brir,l and h_brir,r as in the condition with the independent DRC systems, the signals were passed through a synchronized pair of DRC systems that, on a sample-by-sample basis in each of the seven frequency channels (Sec. II E), applied the lowest gain of the two level-dependent gain functions to both ears. In the spatially ideal compression scheme (bottom), the input signal, s_in, was first passed through a single DRC system and the output was then convolved with h_brir,l and h_brir,r. The spatially ideal compression scheme thus consisted of a compression of the dry signal before the interaction with the room (i.e., the convolution with h_brir,l and h_brir,r). In practice, since the dry signal is typically not available, such a system would require a deconvolution of h_brir,l and h_brir,r before compression, followed by a convolution with h_brir,l and h_brir,r to provide the listener with the spatial cues.
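The difference between independent and linked gain computation can be sketched for a single frequency channel and a single time instant. The broken-stick gain rule follows the description in Sec. II E; the ear levels, the compression threshold, and the compression ratio below are hypothetical example values, not the settings used in the study.

```python
def broken_stick_gain_db(level_db, ct_db, cr, linear_gain_db=0.0):
    """Gain in dB: constant below the threshold, and reduced by
    (1 - 1/cr) dB for every dB of input level above the threshold."""
    excess_db = max(level_db - ct_db, 0.0)
    return linear_gain_db - excess_db * (1.0 - 1.0 / cr)

def independent_gains(level_l_db, level_r_db, ct_db, cr):
    """Each ear derives its gain from its own input level."""
    return (broken_stick_gain_db(level_l_db, ct_db, cr),
            broken_stick_gain_db(level_r_db, ct_db, cr))

def linked_gains(level_l_db, level_r_db, ct_db, cr):
    """Both ears receive the lower of the two independently computed gains."""
    gain = min(independent_gains(level_l_db, level_r_db, ct_db, cr))
    return gain, gain

# Hypothetical example: source on the right, 8 dB intrinsic ILD.
level_l, level_r = 62.0, 70.0
ct, cr = 50.0, 2.0

g_l, g_r = independent_gains(level_l, level_r, ct, cr)
ild_independent = (level_r + g_r) - (level_l + g_l)

g_l, g_r = linked_gains(level_l, level_r, ct, cr)
ild_linked = (level_r + g_r) - (level_l + g_l)
```

With these example values the independent scheme halves the ILD (from 8 dB to 4 dB), whereas the linked scheme leaves it at 8 dB, because the same gain is applied to both ears.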
To create the signals for the condition with linear processing, the stimuli were convolved with h_brir,l and h_brir,r. To compensate for the effect of the headphones, the outputs s_out,l and s_out,r in all conditions were convolved with h_hpir,l^inv and h_hpir,r^inv, respectively, i.e., the left and right parts of h_hpir^inv. For the normal-hearing listeners, the sound pressure level (SPL) at the ear closest to the sound source was 65 dB in all conditions. For the hearing-impaired listeners, the headphone outputs were amplified with the NAL-R(P) linear gain prescription (Byrne et al., 1990) according to the listener's individual audiometric pure-tone thresholds to ensure audible high-frequency content.

E. DRC
FIG. 3. Block diagrams of the three compression conditions: independent compression (top), linked compression (middle), and spatially ideal compression (bottom). For the independent and linked compression schemes, the dry signal, s_in, is convolved with the left and right BRIR, h_brir,l and h_brir,r, respectively, and then processed by the DRC system. In the case of linked compression, the arrow between the two DRC systems indicates that the DRC gain is synchronized between the left and the right ear. In the case of spatially ideal compression, the dry signal is processed by DRC and then convolved with the left and right BRIR. The outputs in the left- and right-ear channels in the different schemes are denoted as s_out,l and s_out,r, respectively.
To represent a modern multi-band hearing aid compressor, an octave-spaced seven-band DRC system was implemented. The incoming signal was windowed in time using a 512-sample long Hanning window (corresponding to a 10.7 ms time window at the sampling frequency of 48 000 Hz) with a frame-to-frame step size of 128 samples. Each of the windowed segments was padded with 256 zeros in the beginning and with 256 zeros at the end and transformed to the spectral domain using a 1024-sample fast Fourier transform (FFT). The power values of the resulting frequency bins were combined into seven octave-wide frequency bands with center frequencies ranging from 125 Hz to 8 kHz. The power in each band was smoothed using a peak detector [Eq. (8.1) in Kates, 2008]. The attack and release time constants, measured according to IEC 60118-2 (1983), were 10 ms and 60 ms, respectively. The smoothed envelopes were converted to dB SPL. A broken-stick gain function (with a linear gain below the CT and a constant CR above the threshold) was applied to the processed power envelopes. The resulting band-wise gains were then smoothed in the frequency domain using a piecewise cubic interpolation to avoid aliasing artifacts. The frequency-smoothed gains were applied to the bins of the short-time Fourier transformed input stimulus, and an inverse FFT was applied to produce time segments of the compressed stimuli. These time segments were subsequently windowed with a tapered cosine window to avoid aliasing artifacts, and combined using an overlap-add method to provide the processed temporal waveform. The CTs and CRs were calculated from NAL-NL2 prescription targets (Keidser et al., 2011) for audiometric pure-tone thresholds corresponding to the average audiometric pure-tone thresholds of the hearing-impaired listeners. The CTs and CRs, as derived from the NAL-NL2 prescription, are summarized in Table I for the seven respective frequency bands. The simulated input level to the compressor operating closest to the sound source was 75 dB SPL.
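The analysis stage of the DRC system (framing, zero padding, FFT, and grouping into octave bands) can be sketched as follows. The window length, step size, padding, and FFT size are those stated in the text; the band edges at the center frequency divided and multiplied by the square root of 2 are an assumption, as is the 1 kHz test tone. The peak detector, gain rule, and resynthesis are not included.

```python
import numpy as np

FS = 48_000                                   # sampling frequency in Hz
WIN_LEN, HOP, PAD, N_FFT = 512, 128, 256, 1024
CENTERS_HZ = 125.0 * 2.0 ** np.arange(7)      # 125 Hz ... 8 kHz, octave spaced

def frames(signal):
    """Split the signal into overlapping 512-sample frames (step of 128 samples)."""
    n_frames = 1 + (len(signal) - WIN_LEN) // HOP
    return np.stack([signal[i * HOP:i * HOP + WIN_LEN] for i in range(n_frames)])

def band_powers(frame):
    """Summed bin power in the seven octave bands for one frame."""
    windowed = frame * np.hanning(WIN_LEN)
    padded = np.concatenate([np.zeros(PAD), windowed, np.zeros(PAD)])
    power = np.abs(np.fft.rfft(padded, n=N_FFT)) ** 2
    freqs = np.fft.rfftfreq(N_FFT, d=1.0 / FS)
    lower = CENTERS_HZ / np.sqrt(2.0)         # assumed lower band edges
    upper = CENTERS_HZ * np.sqrt(2.0)         # assumed upper band edges
    return np.array([power[(freqs >= lo) & (freqs < hi)].sum()
                     for lo, hi in zip(lower, upper)])

# Hypothetical test signal: 100 ms of a 1 kHz tone.
t = np.arange(FS // 10) / FS
tone = np.sin(2.0 * np.pi * 1000.0 * t)
powers = np.array([band_powers(f) for f in frames(tone)])
window_ms = 1000.0 * WIN_LEN / FS             # about 10.7 ms
```

For the test tone, the largest band power in every frame falls in the band centered at 1 kHz (index 3), which gives a quick sanity check of the band grouping.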

F. Statistical analysis
The graphical responses provided a representation of the perceived sound image in the different conditions. To quantify deviations in the localization from the loudspeaker position across the different conditions, the root-mean-square (RMS) error of the Euclidean distance from the center of the circles to the loudspeakers was calculated. To reduce the confounding influence of front-back confusions as a result of the virtualization method, the responses placed in the opposite hemisphere (front versus rear) of the virtually playing loudspeaker were reflected across the interaural axis to the mirror-symmetric position.
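The localization error measure can be sketched as follows. The coordinate convention (listener at the origin, 0 deg azimuth along the positive y axis, interaural axis along x) and the example responses are assumptions made for illustration; only the 150 cm loudspeaker radius comes from the setup description.

```python
import math

RADIUS_CM = 150.0  # loudspeaker ring radius

def speaker_xy(azimuth_deg):
    """Loudspeaker position; 0 deg is straight ahead (+y), assumed convention."""
    a = math.radians(azimuth_deg)
    return RADIUS_CM * math.sin(a), RADIUS_CM * math.cos(a)

def resolve_front_back(response, speaker):
    """Mirror a response across the interaural (x) axis when it lies in the
    opposite front/rear hemisphere to the loudspeaker."""
    x, y = response
    if y * speaker[1] < 0:
        return x, -y
    return x, y

def rms_error(responses, azimuth_deg):
    """RMS of the Euclidean distances between circle centers and the loudspeaker."""
    speaker = speaker_xy(azimuth_deg)
    squared = []
    for response in responses:
        x, y = resolve_front_back(response, speaker)
        squared.append((x - speaker[0]) ** 2 + (y - speaker[1]) ** 2)
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical circle centers (cm) for a loudspeaker at 0 deg:
# one exact response, one front-back confusion, one response 50 cm off.
responses = [(0.0, 150.0), (0.0, -150.0), (30.0, 110.0)]
error_cm = rms_error(responses, 0.0)
```

In this example the front-back confusion contributes no error after reflection, so the RMS error is determined by the third response alone.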
An analysis of variance (ANOVA) was run on four-factor mixed-effect models to assess the effects of hearing impairment, compression condition, stimulus, and loudspeaker position on both the RMS error and the radius of the placed circles. The hearing status (normal hearing versus impaired hearing) was treated as a between-listener factor, and the compression condition, stimulus type (speech versus transients), and loudspeaker position were treated as within-listener factors. The radius data were square-root transformed to correct for heterogeneity of variance. Tukey's honestly significant difference (HSD) corrected post hoc tests were conducted to test for main effects and interactions. A significance level of 1% was used.

G. Analysis of spatial cues
In order to quantify the effect of the different compression schemes on the spatial cues, ICs and DRRs were calculated. To visualize the effect of compression on the relation between the direct and reverberant energy, "temporal energy patterns" were calculated, i.e., the energy of the processed signal as a function of time.

Interaural cues
The left- and right-ear output signals were filtered with an auditory inspired "peripheral" filterbank consisting of complex fourth-order gammatone filters with equivalent rectangular bandwidth spacing (Glasberg and Moore, 1990). The envelopes were calculated by taking the absolute values of the complex outputs of the different channels. The envelopes were windowed in time using a 20 ms rectangular window and an overlap of 50%. The power of the windowed segments was calculated and converted to dB SPL. The ILD histograms were subsequently computed by subtracting the level for the left ear from the level for the right ear for those time segments where both the left- and right-ear SPLs were above 0 dB SPL. The ILD distributions were estimated by applying a Gaussian kernel-smoothing window with a width of 0.9 dB on the ILD histogram.
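The segment-wise ILD computation can be sketched for a single frequency channel. The sketch assumes that the envelopes are already available in pascals (the gammatone filterbank is omitted) and uses the standard 20 µPa reference for dB SPL; the sampling rate and the constant envelopes in the example are hypothetical. The kernel smoothing of the histogram is not included.

```python
import math

P_REF = 20e-6  # reference pressure for dB SPL in Pa

def segment_levels_db(envelope, fs, win_s=0.020, overlap=0.5):
    """Level in dB SPL of rectangular 20 ms segments with 50% overlap."""
    n = int(round(win_s * fs))
    hop = int(round(n * (1.0 - overlap)))
    levels = []
    for start in range(0, len(envelope) - n + 1, hop):
        segment = envelope[start:start + n]
        power = sum(v * v for v in segment) / n
        levels.append(10.0 * math.log10(power / P_REF ** 2)
                      if power > 0 else float("-inf"))
    return levels

def ild_samples(env_left, env_right, fs, floor_db=0.0):
    """Right-minus-left level differences for segments where both ears exceed the floor."""
    left = segment_levels_db(env_left, fs)
    right = segment_levels_db(env_right, fs)
    return [r - l for l, r in zip(left, right) if l > floor_db and r > floor_db]

# Hypothetical envelopes: the right ear has twice the pressure of the left ear.
fs = 1000
env_left = [0.02] * 100
env_right = [0.04] * 100
ilds = ild_samples(env_left, env_right, fs)
```

A pressure ratio of 2 corresponds to an ILD of about 6 dB in every segment, which is what the example produces.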
The IC can be defined as the absolute maximum value of the normalized cross-correlation between the left and right ear output signals s_out,l and s_out,r occurring over an interval of interaural lags τ within ±1 ms (e.g., Blauert and Lindemann, 1986; Hartmann et al., 2005). For each individual listener, the left- and right-ear output signals were filtered with the auditory inspired "peripheral" filterbank. The ICs were subsequently computed from the filtered output signals. The just-noticeable difference (JND) in IC is about 0.04 for an IC equal to 1 and increases to 0.4 for an IC equal to 0 (Gabriel and Colburn, 1981; Pollack and Trittipoe, 1959). The IC distribution was estimated by applying a Gaussian kernel-smoothing window with a width of 0.02 (half of the smallest JND) on the IC histograms.
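The verbal definition of the IC can be written out as a short sketch: the cross-correlation is normalized by the geometric mean of the two signal energies, and the maximum absolute value is taken over lags within ±1 ms. Including the end points of the lag range, the sampling rate, and the test signals are assumptions made for illustration; the filterbank stage is omitted.

```python
import math

def interaural_coherence(left, right, fs, max_lag_s=1e-3):
    """Maximum absolute normalized cross-correlation over lags within +/- max_lag_s."""
    max_lag = int(round(max_lag_s * fs))
    norm = math.sqrt(sum(v * v for v in left) * sum(v * v for v in right))
    if norm == 0.0:
        return 0.0
    n = len(left)
    best = 0.0
    for lag in range(-max_lag, max_lag + 1):
        acc = 0.0
        for i in range(n):
            j = i + lag
            if 0 <= j < n:
                acc += left[i] * right[j]
        best = max(best, abs(acc) / norm)
    return best

# Hypothetical test signals: a 500 Hz tone, and the same tone delayed by 0.5 ms.
fs = 8000
left = [math.sin(2.0 * math.pi * 500.0 * k / fs) for k in range(400)]
delay = 4  # samples, i.e., 0.5 ms at 8 kHz
right = [0.0] * delay + left[:-delay]

ic_identical = interaural_coherence(left, left, fs)
ic_delayed = interaural_coherence(left, right, fs)
```

A pure interaural delay within the lag range leaves the IC close to 1; it is the addition of decorrelated (e.g., reverberant) energy that lowers it.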

Temporal energy patterns
Temporal energy patterns were obtained from the bandpass filtered output signals. The temporal envelope was calculated by convolving the absolute value of the complex outputs with a 20 ms rectangular window. The power of the windowed segments was calculated for the left- and right-ear segments and converted to dB SPL.

DRR
The direct part of the BRIRs, h_brir,dir, was defined as the first 2.5 ms of the impulse response, and the reverberant part, h_brir,reverb, was defined as the remaining subsequent samples of the BRIRs. The 2.5 ms transition point was chosen since the first reflection occurred immediately after this point in time. The reverberant part contained both the early reflections and the late reverberation. The gain values provided by the DRC systems in the processing of the left- and right-ear stimuli were extracted for each of the compression conditions. The impulse responses h_brir,l and h_brir,r (in Fig. 3) were replaced by their direct parts h_brir,dir,l and h_brir,dir,r and the extracted gain values were applied such that the outputs s_out,dir,l and s_out,dir,r only contained the effect of the compression on the direct part of the signal. Correspondingly, the outputs s_out,reverb,l and s_out,reverb,r, representing the outputs that contained the effect of the compression on the reverberant part of the signal, were obtained by replacing the impulse responses h_brir,l and h_brir,r with their reverberant parts h_brir,reverb,l and h_brir,reverb,r. Besides the effect of the compression on the direct and reverberant part of the signal, the extracted gain values were applied on the time-aligned dry signal such that the outputs s_out,dry,l and s_out,dry,r only contained the effect of the compression on the dry signal.
TABLE I (compression ratios only, for the seven frequency bands): 2.2:1, 2.2:1, 1.8:1, 1.9:1, 2.2:1, 2.9:1, 2.6:1.
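The split of a BRIR into its direct and reverberant parts, and the resulting direct-to-reverberant energy ratio for linear processing, can be sketched as follows. The sketch uses a broadband time-domain energy ratio, which is a simplification of the band-limited frequency-domain calculation used in the study, and it assumes that the 2.5 ms are counted from the first sample of the response. The toy impulse response is hypothetical.

```python
import math

def split_brir(h, fs, direct_s=2.5e-3):
    """Split an impulse response into direct (first 2.5 ms) and reverberant parts."""
    n_direct = int(round(direct_s * fs))
    return h[:n_direct], h[n_direct:]

def drr_db(h, fs, direct_s=2.5e-3):
    """Direct-to-reverberant energy ratio in dB (broadband, time domain)."""
    direct, reverb = split_brir(h, fs, direct_s)
    energy_direct = sum(v * v for v in direct)
    energy_reverb = sum(v * v for v in reverb)
    return 10.0 * math.log10(energy_direct / energy_reverb)

# Toy impulse response (hypothetical): direct sound plus two reflections.
fs = 48_000
h = [0.0] * 600
h[10] = 1.0    # direct sound, inside the first 2.5 ms (120 samples)
h[200] = 0.5   # reflection
h[400] = 0.5   # reflection

drr_linear = drr_db(h, fs)

# Doubling the amplitude of the reverberant part (about +6 dB) lowers the DRR accordingly.
direct, reverb = split_brir(h, fs)
h_boosted = direct + [2.0 * v for v in reverb]
drr_boosted = drr_db(h_boosted, fs)
```

The second part of the example only illustrates the direction of the effect: any processing that amplifies the reverberant part more than the direct part reduces the DRR.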
To estimate the effect of the different compression schemes on the reverberant content of the processed stimuli, the DRR was calculated for the left- and right-ear signals for the four conditions. For the compression conditions, the DRR was calculated in the frequency domain, where S_out,dir,k(f), S_out,reverb,k(f), and S_out,dry,k(f) indicate the frequency-domain versions of the time signals s_out,dir,k, s_out,reverb,k, and s_out,dry,k with respect to frequency f for k ∈ [l, r] (left- and right-ear signal). For the linear processing condition, the DRR was calculated directly from the direct part (h_brir,dir,l and h_brir,dir,r) and the reverberant part (h_brir,reverb,l and h_brir,reverb,r) of the BRIR, respectively. DRRs were calculated for the frequency range from 100 Hz to 10 kHz.
When a listener placed more than one circle on the touch screen, only the circle the listener placed nearest to the loudspeaker (including positions obtained by front-back confusions) was indicated in color, whereas the remaining locations were indicated in gray.

A. Experimental data
In the reference condition (upper left panel in Fig. 4), apart from some front-back confusions (i.e., errors on the cone of confusion), the sound was perceived as coming from the loudspeaker position at 300° azimuth. In contrast, in the independent compression condition (upper right panel), the sound was generally perceived as being wider and, in some cases, as occurring closer to the listener than the loudspeaker or between the loudspeakers at 240° and 300° azimuth. One of the listeners even internalized the speech stimulus. In some of the listeners, the independent compression also led to split images as indicated by the gray circles. In the linked compression condition (lower left panel), the sound images were reported to be scattered around and located between the loudspeakers at 240° and 300° azimuth, similar to the condition with independent compression. Likewise, the sound images were indicated to be of larger width and were commonly perceived to be closer to the listener and not at the position of the loudspeaker. As in the condition with independent compression, the linked compression led to image splits and internalization in some of the listeners. Most of the listeners reported verbally that the sound image was more diffuse in the conditions with independent and linked compression than in the reference condition. Furthermore, in the independent and linked compression conditions, some of the listeners reported that they perceived part of the reverberation as enhanced and being located at a different place than the "main sound," leading to split images. In the spatially ideal compression condition (lower right panel), the listeners perceived the sound image as being compact and located mainly at the loudspeakers at 240° and 300° azimuth. None of the listeners experienced image splits in this condition.
In summary, in the normal-hearing listeners, independent and linked compression provided similar results. In both conditions, the results differed substantially from the results obtained in the condition with linear processing. In contrast, the results in the condition with the spatially ideal compression were similar to those in the condition with linear processing.
Figure 5 shows the corresponding results for the hearing-impaired listeners. The general pattern of results across conditions was similar to that found for the normal-hearing listeners (from Fig. 4). However, the hearing-impaired listeners typically perceived the sound images to be less compact than the normal-hearing listeners and the responses were characterized by a larger variability across listeners. For example, in the reference condition (upper left panel), the hearing-impaired listeners perceived the sound to be positioned at and around the loudspeakers at 240°, 270°, and 300° azimuth. Some of the listeners perceived the sound to occur between themselves and the loudspeakers while other listeners perceived the sound to be coming from beyond the loudspeakers. Both independent and linked compression (upper right and lower left panels of Fig. 5) caused wider and more spatially distributed sound images than in the reference condition whereas, in the case of spatially ideal compression (lower right panel), the sound was perceived to be more compact and similar to the sound presented in the reference condition. As observed for the normal-hearing listeners, some of the hearing-impaired listeners also experienced split images in the independent and linked compression conditions. Thus, overall, the hearing-impaired listeners typically showed a degraded spatial sensation relative to the normal-hearing listeners, i.e., they experienced more diffuse and spatially distributed sound images. However, the hearing-impaired listeners showed similar effects of independent, linked, and spatially ideal compression on spatial perception as the normal-hearing listeners.
The results obtained with the transients are shown in Fig. 6 for the normal-hearing listeners and Fig. 7 for the hearing-impaired listeners. The general pattern of results across conditions was similar to that observed for the speech stimulus, i.e., (i) the listeners' spatial perception was largely affected by both independent and linked compression, whereas spatially ideal compression provided results similar to those in the reference condition, and (ii) the hearing-impaired listeners indicated wider and more spatially distributed sound images than the normal-hearing listeners. However, in both listener groups, the transients were generally perceived as more compact than speech, as indicated by the smaller circles in Figs. 6 and 7 compared to those in Figs. 4 and 5. Furthermore, more image splits were documented for the transients than for speech in the independent and linked compression conditions.
The overall pattern of results obtained in the other five loudspeaker positions (0°, 30°, 150°, 180°, and 240° azimuth) was similar to that observed for the loudspeaker positioned at 300° azimuth (Figs. 4-7).
For the radius of the placed circles, indicating the perceived width of the sound image, the ANOVA revealed an effect of compression condition [F(3, 66) = 61.54, p ≪ 0.001], stimulus [F(1, 22) = 13.48, p = 0.001], and loudspeaker position [F(5, 110) = 3.97, p ≪ 0.001]. Post hoc comparisons confirmed that the listeners reported wider sound images in the independent and the linked compression conditions than in the linear processing and spatially ideal compression conditions [p ≪ 0.001]. No differences were found between the independent and the linked compression conditions [p = 0.88] or between the linear processing and spatially ideal compression conditions [p = 0.11]. Furthermore, post hoc comparisons revealed that the indicated perceived sound width was similar for all combinations of loudspeaker positions, except between the loudspeakers positioned at 180° azimuth and 300° azimuth [p = 0.004]. The post hoc estimated radius was higher for the speech than for the transients.
For the RMS error, the ANOVA showed an effect of hearing status [F(1, 22) = 7.07, p = 0.01], compression condition [F(3, 69) = 7.52, p ≪ 0.001], and loudspeaker position [F(5, 115) = 3.92, p = 0.003]. Post hoc comparisons confirmed that the RMS error was higher in the independent compression and linked compression conditions than in the linear processing and spatially ideal compression conditions [p ≪ 0.001]. No differences were found between the independent and the linked compression conditions [p = 0.86] or between the linear processing and spatially ideal compression conditions [p = 0.99]. The post hoc estimated RMS error was higher for the hearing-impaired listeners than for the normal-hearing listeners. Furthermore, post hoc comparisons revealed that the estimated RMS error was higher for the lateral loudspeaker positions than for the loudspeaker positioned at 0° azimuth.
For the reported image splits, no difference between the independent and the linked compression conditions [p = 0.91] was found in a mixed-effects logistic regression analysis. However, the regression analysis confirmed that there was a higher proportion of reported image splits in the trials with the transients than in the trials with the speech [p = 0.001]. A significantly lower proportion of front-back confusions was obtained in the linear processing and spatially ideal compression conditions than in the independent and linked compression conditions [p < 0.05] according to a mixed-effects logistic regression analysis. The proportion of front-back confusions was 23.6% for linear processing, 23.9% for spatially ideal compression, 30.3% for independent compression, and 28.6% for linked compression.

B. Analysis of spatial cues
Figure 8 shows the ILD distributions for the speech (top panel) and the transients (lower panel) when virtualized from the loudspeaker positioned at 300° azimuth. For simplicity, only the results at the output of the gammatone filter tuned to 2000 Hz are shown, but many other frequency channels show similar characteristics. The red, green, light blue, and dark blue curves represent the ILD distributions for linear processing, independent compression, linked compression, and spatially ideal compression, respectively. For both stimuli, the ILDs are reduced in the independent compression condition (with a maximum at 1.5 dB) relative to the other processing conditions, where the ILD statistics are similar to each other (and centered around 6 dB for the speech stimulus and 3 dB for the transients). The ILDs obtained for the transients are below those obtained for speech since the transients contain fewer time segments that are dominated by the direct sound and more segments dominated by reverberant sound energy compared to the speech stimulus.
Figure 9 shows the IC distributions for linear processing and the three compression conditions for the speech (upper panel) and the transients (lower panel) virtualized from the frontal loudspeaker. Again, for illustration, only the results at the output of the gammatone filter tuned to 2000 Hz are shown, but many other frequency channels show similar characteristics. The red, green, light blue, and dark blue curves represent the IC distributions for linear processing, independent compression, linked compression, and spatially ideal compression, respectively. For both stimuli, the IC distributions for linear processing and spatially ideal compression are similar to each other, and the distributions for independent and linked compression are similar to each other. The distributions obtained with linear processing and spatially ideal compression show their maxima at interaural correlations of about 0.92, both for the speech and the transients. In contrast, the maxima of the distributions for the independent and linked compression conditions are shifted toward lower values of about 0.87 in the case of speech stimulation and between 0.66 and 0.77 for the transients. The computation of the IC based on the temporal envelope instead of the temporal waveform revealed the same pattern of results across the four processing conditions. Thus, in the conditions with independent and linked compression, the interaural correlation of the stimuli was substantially decreased due to the compression-induced changes to the temporal envelope at each ear.
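As an illustration of the kind of metric discussed above, the following minimal sketch computes an interaural coherence value as the maximum absolute normalized cross-correlation of two ear signals within a lag range of ±1 ms. The lag range, the sampling rate, the 500-Hz test tone, and the use of independent noise as a crude stand-in for reverberant energy are all assumptions made for this example; the sketch omits the gammatone filtering and short-time analysis and should not be read as the analysis procedure of the present study.

```python
import math
import random

def interaural_coherence(left, right, fs, max_lag_ms=1.0):
    """Maximum absolute normalized cross-correlation within +/- max_lag_ms."""
    n = min(len(left), len(right))
    max_lag = int(round(max_lag_ms * 1e-3 * fs))
    norm = math.sqrt(sum(x * x for x in left[:n]) * sum(y * y for y in right[:n]))
    if norm == 0.0:
        return 0.0
    best = 0.0
    for lag in range(-max_lag, max_lag + 1):
        # Correlate left[i] with right[i + lag] over the overlapping samples.
        lo, hi = max(0, -lag), min(n, n - lag)
        acc = sum(left[i] * right[i + lag] for i in range(lo, hi))
        best = max(best, abs(acc) / norm)
    return best

fs = 16000   # hypothetical sampling rate in Hz
n = 1600     # 100 ms of signal
delay = 4    # hypothetical interaural delay in samples (0.25 ms)
tone = [math.sin(2 * math.pi * 500 * i / fs) for i in range(n)]
delayed = [0.0] * delay + tone[:n - delay]

rng = random.Random(0)
# Crude stand-in for reverberation: independent noise added at each ear.
left_rev = [x + rng.gauss(0.0, 0.7) for x in tone]
right_rev = [y + rng.gauss(0.0, 0.7) for y in delayed]

ic_direct = interaural_coherence(tone, delayed, fs)
ic_reverb = interaural_coherence(left_rev, right_rev, fs)
print(f"IC direct only: {ic_direct:.2f}, with uncorrelated energy: {ic_reverb:.2f}")
```

In this toy case, a pure interaural delay leaves the coherence close to 1, whereas adding energy that is uncorrelated between the ears lowers it, which is the direction of the shift described for the independent and linked compression conditions.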
Figure 10 shows temporal energy patterns for the linear processing and the three compression conditions for the speech stimulus (upper panel) and the transient stimulus (lower panel) virtualized from the frontal loudspeaker. The energy patterns were computed from the stimulus presented to the right ear of one of the listeners. Again, for illustration, only the output of the gammatone filter tuned to 2000 Hz is shown. The red, green, light blue, and dark blue functions represent the results for linear processing, independent compression, linked compression, and spatially ideal compression, respectively. For dry stimuli, the effect of compression is reflected by the difference between the patterns obtained with spatially ideal compression versus linear processing. For the transient stimulus (bottom panel), the effect of compression is small due to the short duration of the transients relative to the time constants of the DRC system, while for the speech stimulus (upper panel) the effect of compression is more prominent, as revealed by the reduced modulation depth in the temporal pattern. For reverberant stimuli, the effect of compression is reflected by the difference between the patterns obtained with independent and linked compression versus the pattern obtained with linear processing. For the transients (bottom panel), the reverberant decay rate is clearly reduced in the independent and linked compression conditions relative to the linear processing condition. The same can be observed for the speech (upper panel) at time instances where reverberation is dominating, e.g., at 0.38 s, 0.55 s, and 1.7 s. This indicates that these compression schemes increase the amount of reverberant energy relative to the direct sound energy. This is also reflected in the direct-to-reverberant ratios, which amount to 6.1 dB in the case of linear processing as well as spatially ideal compression (for this loudspeaker position). In contrast, the direct-to-reverberant ratio reduces to 4.2 dB for the speech stimulus and 0.2 dB for the transients in both the independent and the linked compression conditions. This behavior is consistent with the different amounts of IC reduction observed in Fig. 9 for the two stimulus types. The reduced decay rate in the case of independent/linked compression is more prominent for the transients than for the speech stimulus since the effect of reverberation is partly "masked" by the ongoing speech stimulus.
Thus, both objective metrics (IC distributions and temporal energy patterns) show similar results for independent and linked compression. Furthermore, both metrics also show similar results for linear processing and spatially ideal compression. These patterns are consistent with the main observations in the behavioral data from Figs. 4-7.
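The link between level-dependent gain, a slower reverberant decay, and a lower direct-to-reverberant ratio can be illustrated with a small numerical sketch. It assumes an idealized compressor that acts instantaneously on per-frame levels (no attack or release time constants), a gain normalized to 0 dB at and below the compression threshold, and hypothetical values for the threshold, the compression ratio, and the frame levels. None of these values are taken from the present study, and the resulting numbers are not comparable to the ratios reported above.

```python
import math

def compressor_gain_db(level_db, ct_db, cr):
    """Static gain in dB: 0 dB at or below the threshold, reduced above it."""
    if level_db <= ct_db:
        return 0.0
    return -(level_db - ct_db) * (1.0 - 1.0 / cr)

def drr_db(direct_db, reverb_db):
    """Direct-to-reverberant energy ratio from per-frame levels in dB."""
    direct = sum(10.0 ** (lvl / 10.0) for lvl in direct_db)
    reverb = sum(10.0 ** (lvl / 10.0) for lvl in reverb_db)
    return 10.0 * math.log10(direct / reverb)

CT_DB, CR = 40.0, 2.0                      # hypothetical threshold and ratio
direct_in = [70.0]                         # one frame of direct sound
reverb_in = [60.0 - k for k in range(40)]  # tail decaying by 1 dB per frame

# Idealized instantaneous compression: each frame gets the gain for its own level.
direct_out = [lvl + compressor_gain_db(lvl, CT_DB, CR) for lvl in direct_in]
reverb_out = [lvl + compressor_gain_db(lvl, CT_DB, CR) for lvl in reverb_in]

drr_in = drr_db(direct_in, reverb_in)
drr_out = drr_db(direct_out, reverb_out)
decay_in = reverb_in[0] - reverb_in[1]     # dB per frame before compression
decay_out = reverb_out[0] - reverb_out[1]  # dB per frame after compression
print(f"DRR: {drr_in:.1f} dB -> {drr_out:.1f} dB")
print(f"decay above threshold: {decay_in:.1f} -> {decay_out:.1f} dB/frame")
```

Because the direct sound has the highest level, it receives the least gain, so the ratio of direct to reverberant energy drops, and the decay rate above the threshold is divided by the compression ratio. A fixed gain offset, as would be present in a hearing aid, would not change either quantity.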

IV. DISCUSSION
The spatial cue analysis showed that both independent and linked compression increased the energy of the reverberant sound relative to the direct sound. The reason for this is that the segments of the stimuli that are dominated by reverberation often exhibit a lower signal level and are therefore amplified more strongly than the stimulus segments that are dominated by the direct sound. Compared to the speech stimulus, the transients contained more segments that were dominated by reverberation. The enhanced reverberant energy was reflected by a similar decrease of the DRR as well as a similar change of the IC statistics for independent and linked compression relative to linear processing, particularly for the transient stimulus. Thus, in the reverberant environment considered in the present study, compression modifies the relation between the direct and reverberant sound energy, which, in turn, affects the IC that underlies spatial perception. The decreased IC of the processed stimuli in the case of independent/linked compression was consistent with the higher proportion of image splits reported for the transients than for the speech stimulus and the perception of broader, more diffuse sound images as compared to linear processing. It has been demonstrated that listeners localize sound sources in reverberant environments by responding to the spatial cues carried by the direct sound and suppressing the spatial cues carried by the early reflections. This perceptual phenomenon has been termed "the precedence effect" (see Brown et al., 2015, for a review). In the present study, the early reflections were most likely not enhanced sufficiently by the independent and linked compression to overcome the precedence effect and thereby affect the listeners' perceived location of the stimuli, i.e., cause the image splits. Instead, the perceived split images might result from the enhancement of the late reverberation carrying spatial cues unrelated to the sound source. Thus, the results suggest that the energy ratio between the direct and the reverberant sound should ideally be preserved to provide the listener with undistorted cues for spatial perception. The reason why the split images were consistently perceived in the hemisphere opposite to the primary sound image in both the linked and independent compression conditions is not clear from the analysis of the interaural cues used for localization.
The results are consistent with Blauert and Lindemann (1986), who demonstrated that a reduction in the IC results in both image splitting as well as a broadening of the sound image for normal-hearing listeners. However, in contrast to the findings of the present study, earlier studies (Whitmer et al., 2012, 2014) found that hearing-impaired listeners were relatively insensitive to changes in IC, as measured by perceived width when using stationary noise stimuli. The different results might have been caused by the differences in the stimuli used in the present study and the ones of Whitmer et al. (2012, 2014). In the present study, the reduction of the IC by compression was caused by changes to the binaural temporal envelope, whereas in Whitmer et al. (2012, 2014) the change in IC was driven by changes in the binaural temporal fine structure, which is also the reason why the reported insensitivity was correlated with the ability to detect interaural phase differences (Whitmer et al., 2014). It has previously been shown that, in contrast to temporal fine structure sensitivity, the sensitivity to temporal envelope cues is similar in hearing-impaired listeners and normal-hearing listeners (e.g., Moore and Glasberg, 2001).
The increased number of front-back confusions in the independent and linked compression conditions suggests that these compression schemes distorted the monaural spectral cues (e.g., Middlebrooks and Green, 1991) that listeners, in combination with head movement cues (Brimijoin et al., 2013), normally use to distinguish frontal from rearward sources. Thus, both independent and linked compression seem to make it more difficult for the listeners to distinguish between frontal and rearward sources.
In contrast to independent compression, linked compression is expected to restore the listener's natural spatial perception in anechoic environments due to the preservation of ILDs (Wiggins and Seeber, 2011, 2012). However, no effect of preserving the intrinsic ILDs by linked compression, as compared to independent compression, was found in the reverberant condition considered in the present study. Thus, the beneficial effect of preserving the ILDs is not apparent in reverberation, which most likely is a result of the dominating effect of fast-acting compression reducing the rate of the reverberant decay and, thereby, reducing the IC. Nonetheless, linked fast-acting compression has, in reverberant conditions, been shown to partly restore the ability to attend to a desired target in an auditory scene with spatially separated maskers, in contrast to independent compression (Schwartz and Shinn-Cunningham, 2013). However, the performance obtained with linked compression did not reach the level obtained with linear processing, potentially as a result of the reduced IC due to this compression scheme. It is possible that, based on the results of the present study, spatially ideal compression would produce results similar to those of linear processing since the spatial cues would be preserved.
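The difference between independent and linked compression with respect to the ILD can be made concrete with a short worked example. The sketch assumes a static compressor characteristic, a hypothetical threshold of 40 dB and compression ratio of 2, hypothetical ear levels of 70 and 60 dB, and one possible linking rule in which both ears share the gain derived from the higher of the two levels. The linking rule is an assumption of this example and is not claimed to be the one used in the present study.

```python
def compressor_gain_db(level_db, ct_db, cr):
    """Static gain in dB: 0 dB at or below the threshold, reduced above it."""
    if level_db <= ct_db:
        return 0.0
    return -(level_db - ct_db) * (1.0 - 1.0 / cr)

def independent_compression(left_db, right_db, ct_db, cr):
    """Each ear is compressed according to its own input level."""
    return (left_db + compressor_gain_db(left_db, ct_db, cr),
            right_db + compressor_gain_db(right_db, ct_db, cr))

def linked_compression(left_db, right_db, ct_db, cr):
    """Both ears share the gain derived from the higher of the two levels."""
    gain = compressor_gain_db(max(left_db, right_db), ct_db, cr)
    return left_db + gain, right_db + gain

CT_DB, CR = 40.0, 2.0            # hypothetical threshold and ratio
left_in, right_in = 70.0, 60.0   # hypothetical levels, source on the left

ind_left, ind_right = independent_compression(left_in, right_in, CT_DB, CR)
lnk_left, lnk_right = linked_compression(left_in, right_in, CT_DB, CR)

ild_in = left_in - right_in
ild_independent = ind_left - ind_right
ild_linked = lnk_left - lnk_right
print(f"ILD in: {ild_in:.1f} dB, independent: {ild_independent:.1f} dB, "
      f"linked: {ild_linked:.1f} dB")
```

With both ear levels above the threshold, independent compression divides the ILD by the compression ratio, whereas a shared gain leaves it unchanged. The example covers only the static ILD and says nothing about the reverberant decay, which is affected in the same way by both schemes.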
It has been demonstrated that listeners can adapt to artificially produced changes of the spatial cues responsible for correct sound source localization (for a review, see Mendonça, 2014). This plasticity in spatial hearing has been demonstrated both in the horizontal and vertical plane for various manipulations of the localization cues. For example, after the direction-dependent spectral shaping of the outer ear has been modified by inserting ear molds in both of the listener's ears (Hofman et al., 1998) or only in one of the ears (Van Wanrooij and Van Opstal, 2005), listeners can reacquire accurate sound localization performance within a few weeks. It might be argued that such "remapping" processes also occur for other modifications of the acoustic cues, such as the ones considered in the present study. However, the signal-driven changes of the binaural cues considered here might be difficult to learn, since they affect the sound location and the sound width and give rise to image splits. Although the performance of sound localization can be reacquired, the increased sound width and image splits originating from the altered reverberation will most likely be difficult to remap as these are signal dependent and dynamic due to the characteristics of the fast-acting compression schemes. Consistent with this reasoning, it has been shown that not all modifications can be remapped. An example of this is ear swapping (Hofman et al., 2002; Young, 1928), where adaptation to switched binaural stimuli was not found for periods as long as 30 weeks.
Only the spatially ideal compression scheme, operating on the dry signal, provided the listeners with a spatial percept similar to that obtained with the linear processing scheme. The processing did not distort the listeners' spatial perception in terms of source localization, at least not in the conditions considered in the present study. However, spatially ideal compression requires a priori knowledge of the BRIRs, which is not a feasible solution in realistic applications where the BRIR is unknown. Instead, a feasible approach could be to estimate the amount of reverberation in the stimulus, e.g., via an estimation of the DRR as a function of time, such that compression is only applied in moments where the DRR is above a certain criterion and otherwise switched off or reduced. Such a system might be particularly useful for hearing-instrument amplification strategies where the goal is to preserve the natural sound scene around the listener while still providing sufficient DRC to restore proper loudness cues.
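The DRR-dependent approach outlined above can be sketched as follows. The sketch assumes that a per-frame DRR estimate is already available (the estimator itself is not addressed), and it interprets "switched off" as holding the most recent gain, so that the processing is locally linear while reverberation dominates. The function name `drr_gated_gains`, the criterion of 0 dB, and all level values are hypothetical and serve only as an illustration.

```python
def compressor_gain_db(level_db, ct_db, cr):
    """Static gain in dB: 0 dB at or below the threshold, reduced above it."""
    if level_db <= ct_db:
        return 0.0
    return -(level_db - ct_db) * (1.0 - 1.0 / cr)

def drr_gated_gains(levels_db, drr_est_db, ct_db, cr, criterion_db,
                    initial_gain_db=0.0):
    """Update the gain only in frames whose estimated DRR meets the criterion.

    In all other frames the most recent gain is held (an assumption of this
    sketch), so that reverberant tails are not amplified more than the
    direct sound that preceded them.
    """
    gains = []
    gain = initial_gain_db
    for level, drr in zip(levels_db, drr_est_db):
        if drr >= criterion_db:
            gain = compressor_gain_db(level, ct_db, cr)
        gains.append(gain)
    return gains

CT_DB, CR, CRITERION_DB = 40.0, 2.0, 0.0   # hypothetical parameters
levels = [70.0, 60.0, 50.0, 45.0]          # direct sound followed by a tail
drr_est = [10.0, -5.0, -5.0, -5.0]         # hypothetical per-frame estimates

gated = drr_gated_gains(levels, drr_est, CT_DB, CR, CRITERION_DB)
ungated = [compressor_gain_db(lvl, CT_DB, CR) for lvl in levels]
out_gated = [lvl + g for lvl, g in zip(levels, gated)]
out_ungated = [lvl + g for lvl, g in zip(levels, ungated)]
print("gated output levels:  ", out_gated)
print("ungated output levels:", out_ungated)
```

In the gated case, the output tail decays at the same rate as the input tail, whereas the ungated compressor raises the gain as the level falls and thereby flattens the decay. How well such a scheme works in practice would depend on the accuracy and latency of the DRR estimate, which this example does not model.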
In the present study, no ambient noise in the listening room was added to the input of any of the processing conditions. Typical everyday environments are likely to include some level of background noise that could influence the results since background noise will fill in the valleys of the temporal envelope of the sound. Thus, in such a condition, less amplification would be provided by the compression in the segments of the stimuli that exhibit a lower signal level than in the corresponding quiet situation, such that the reverberant portions of the stimulus would be enhanced less. Furthermore, the added background noise may perceptually mask some of the reverberation, decreasing the detrimental impact of compression on spatial perception. Hence, in everyday listening environments with ambient noise, the impact of compression on spatial perception might be less prominent than the effects reported in the present study.

V. CONCLUSIONS
This study investigated the effect of DRC in reverberant environments on spatial perception in normal-hearing and hearing-impaired listeners. The following was found: (i) Both independent and linked fast-acting compression resulted in more diffuse and broader sound images, internalization, and image splits relative to linear processing. (ii) No differences in terms of the amount of spatial distortions were observed between the linked and independent compression conditions. (iii) Spatially ideal compression provided the listeners with a spatial percept similar to that obtained with linear processing. (iv) More image splits were reported for the noise bursts than for speech both for independent and linked compression. (v) The spatial resolution of the hearing-impaired listeners was generally lower than that of the normal-hearing listeners. However, the effects of the compression schemes on the listeners' spatial perception were similar for both groups. (vi) The stimulus-dependent distortion due to the linked and independent compression was shown to result from a reduced interaural cross-correlation of the ear signals caused by enhanced reverberant energy.
Overall, the results suggest that preserving the ILDs by linking the left- and right-ear compression is not sufficient to restore the listener's natural spatial perception in reverberant environments relative to linear processing. Since spatial distortions were introduced via an enhancement of reverberant energy, it would be beneficial to develop compressor schemes that minimize the distortion of the energy ratio between the direct and the reverberant sound.

FIG. 1. Audiometric pure-tone threshold averages for the right and left ear of the hearing-impaired listeners. The error bars represent one standard deviation of the thresholds.

FIG. 2. The top view of the experimental setup. The loudspeaker positions are indicated by the black squares. The gray circle in the center indicates the position of the chair where the listener was seated. The listeners faced the loudspeaker placed at 0° azimuth. The graphical representation was also shown on the touch screen, without the room dimensions shown in the figure.

Figure 4 shows a graphical representation of all normal-hearing listeners' responses, including repetitions, obtained for speech virtualized from the loudspeaker positioned at 300° azimuth. The upper left panel represents the responses for the linear processing (the reference condition), whereas the responses obtained with independent compression, linked compression, and spatially ideal compression are shown in the upper right, lower left, and lower right panels, respectively. The responses of each individual listener in a given condition are indicated as transparent filled (colored and gray) circles with a center and size corresponding to the associated perceived sound image in the top-view perspective of the listening room (including the loudspeaker ring and the listening position in the center of the loudspeakers). Overlapping areas of circles obtained from different listeners are reflected by the increased cumulative intensity of the respective color code. To illustrate when a listener experienced a split in the sound image and, therefore, indicated
FIG. 5. (Color online) Same as Fig. 4, but for the hearing-impaired listeners.
FIG. 6. (Color online) Same as Fig. 4, but for the normal-hearing listeners and transients.

FIG. 7. (Color online) Same as Fig. 4, but for the hearing-impaired listeners and transients.
FIG. 9. (Color online) IC distributions of the ear signals, pooled across all listeners, at the output of the gammatone filter tuned to 2000 Hz. Results are shown for the speech (top) and the transients (bottom) virtualized from the frontal loudspeaker position. The red, green, light blue, and dark blue functions represent the IC distributions for linear processing, independent compression, linked compression, and spatially ideal compression, respectively.
FIG. 10. (Color online) Temporal energy patterns of the speech stimulus (top) and the transient stimulus (bottom) virtualized from the frontal loudspeaker position. Only the output of the signals processed by the gammatone filter at 2000 Hz is shown. The different colors represent the different processing conditions (red, linear processing; green, independent compression; light blue, linked compression; dark blue, spatially ideal compression). For better visualization of the trends, the functions have been displaced by 3 dB (spatially ideal compression), 6 dB (independent compression), and 9 dB (linked compression).

TABLE I. The CTs and CRs in the seven octave frequency bands.