Perception of loudness and envelopment for different orchestral dynamics

One of the main factors that makes listening to live concerts emotionally engaging is the dynamic changes in the music. Recent research has shown that concert hall acoustics can affect the perception of orchestral dynamics, but hardly any listening tests have focused on this interplay between the dynamics in music and hall acoustics. In this study, the influence of the orchestral dynamics is assessed with auralizations of musical excerpts of different dynamics (from pianissimo to fortissimo) in two listening positions within four different concert halls. Pairwise comparisons were made in terms of loudness and envelopment which are among the main perceptual factors of concert hall acoustics. The subjective results show that the loudness and envelopment can depend on the musical dynamics. Therefore, in future research on concert hall acoustics, much more attention should be paid to the dynamically varying spectrum of the stimulus signal and the listening level, not only to linear impulse responses used for computing objective parameters. © 2020 Author(s). All article content, except where otherwise noted, is licensed under a Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/). https://doi.org/10.1121/10.0002101 (Received 27 December 2019; revised 9 September 2020; accepted 11 September 2020; published online 16 October 2020) [Editor: Brian F. G. Katz] Pages: 2137–2145


I. INTRODUCTION
One of the main reasons for attending live concerts is the desire to be emotionally moved and affected by the music (Roose, 2008). Strong emotional experiences induced by music are sometimes manifested and perceived as "chills" or "thrills." Among the main sources of such experiences are the dynamic changes in music, such as crescendos from softer to louder passages (Bannister, 2018; Sloboda, 1991). In concert halls, previous results have shown that the perception of dynamic changes is affected by the hall acoustics, especially by the absence or presence of strong lateral reflections (Pätynen et al., 2014).
When an instrument is played with different dynamic levels, the spectrum of the sound changes. For instance, when a brass instrument is blown with force, many more harmonics get excited and they are manifested in much more energy in the upper part of the spectrum compared to softer playing (Meyer, 2009). This effect applies to most instruments in a symphony orchestra, and consequently, not only the sound of individual instruments, but also the sound of the whole orchestra depends on the instantaneous dynamic level of the music. The study by Pätynen et al. (2014) showed that due to the binaural/spatial aspects of human hearing, the acoustic design of a hall may or may not reinforce the dynamic changes in the music. These aforementioned results imply that the perception of room acoustics in different concert halls may depend on the dynamic level of the music being listened to, but this has received little attention and is generally not taken into account in the subjective evaluation of concert halls. Subjective evaluations in the laboratory are often made with a comfortable listening level, although the proper way of conducting tests would require a correct reproduction level.
This article presents an experiment to evaluate how the dynamics of the music being listened to influence the perception of sound in different concert halls, at two listening positions in each hall. Comparisons between four different auralized concert halls are made with musical passages in varying orchestral dynamics (pianissimo, piano, forte, fortissimo), and the samples are listened to at the corresponding sound pressure levels. Orchestral dynamics means here that both the instrumentation, i.e., which instruments are playing, and the playing level, i.e., how the instruments are played, vary.
The most prominent perceptual factors related to listening to music at different levels in different concert halls are the loudness and the width of the source or the envelopment (Pätynen and Lokki, 2016b). When an orchestra plays louder, in some halls the size of the orchestra seems to increase and the hall "wakes up" in fortissimo. Green and Kahle (2019) and Lokki and Pätynen (2019) propose that this effect is due to the perception of the early reflections, which has been shown to depend on their level and direction. Direct comparison of the perceptual differences between halls is hard, and in an earlier study we measured the joint impression, which was called impact (Pätynen and Lokki, 2016a). Here, the aim is to study the level-dependent loudness and envelopment differences by replicating the ordering of halls at different orchestral dynamics. The choice of loudness is also supported by the evidence that loudness (and timbre) is associated with the perception of the dynamics of music performance (Fabiani and Friberg, 2011).
There are several earlier studies on the effect of the listening level related to concert hall acoustics. Keet (1968) combined the effect of the lateral sound and the sound level in his listening tests and found that the higher the listening level, the wider the sound of the source is perceived. Moreover, Kuhl (1978) made experiments by playing a recording of orchestral music at different levels in six concert halls and showed that the louder the music, the greater the spatial impression in the halls that provide distinct early lateral reflections, but not so much in other halls. However, both of these studies used the same source signal at different levels, thus they ignore the varying spectrum that depends on the instrumentation and on the played dynamics (Meyer, 2009).
In this article, we report the results of a listening test in which subjects compared auralized concert halls at different listening levels and ordered the halls according to the perceived loudness and envelopment. The purpose of the study was to investigate the effect of the level dependent variables in the source signals and listening levels while the impulse responses of the halls remain the same. If the order of the halls changes then we could assume that halls are perceived differently in varying orchestral dynamics.

II. METHODS AND MATERIALS
The experiment was performed in the laboratory with high quality auralizations of concert halls. The concert halls were measured with the loudspeaker orchestra and auralized with the 3D loudspeaker setup of 45 loudspeakers. Two listening positions within four halls were studied and the stimuli were the same for all of the halls. The following describes the concert halls, listening positions, music excerpts, auralization method, and the implementation of the listening test in detail.

A. Concert halls and receiver positions
The concert halls used in the experiment were Berlin Konzerthaus (BK), Berlin Philharmonie (BP), Helsinki Musiikkitalo (HM), and Munich Herkulessaal (MH). Figure 1 shows their plans, sections, volumes, and seat capacities. In all halls, two equidistant listening positions were used. These positions were "FRONT" at 11 m from the loudspeaker orchestra and "BACK" at 19 m from the loudspeaker orchestra. The objective parameters shown in Fig. 2 might differ from other measurements, as our sources were not omnidirectional as defined in the ISO 3382-1:2009 standard. All in all, the differences between the halls are of interest here, and it should be emphasised that these objective room acoustical parameters are level independent, as they are based on measured impulse responses.

B. Music excerpts
When musicians play at different dynamics there are measurable changes in timbre, in particular in the spectral skewness (Weinzierl et al., 2018), because the individual orchestral instruments have their own level-dependent spectra (Luce, 1975). Therefore, the anechoic music for auralization should be recorded in different dynamics. From anechoic orchestral recordings (Pätynen et al., 2008), we selected passages in which the overall playing level ranges from pianissimo to fortissimo and in which the level does not vary too much within a passage. As a result, we ended up selecting musical excerpts from different composers. To find suitable excerpts for the different dynamics, the entire piece was divided into one-second frames with 50% overlap. The L_Aeq value of each frame was computed, and when the value remained almost stable for over 6 s, the segment was examined more carefully. In this way, four excerpts for different musical dynamics were found, and they were as follows:
• Pianissimo (pp) L. van

The instrumentation is different in each excerpt, which is typical for different orchestral dynamics. Therefore, the chosen excerpts are considered to represent well the natural conditions in concert halls, and the stimuli have a typical average spectrum for each dynamic. To illustrate the spectral content of the excerpts, the sum of all of the instrument tracks was computed, and the resulting spectra of the excerpts are depicted in Fig. 3. It can be seen that the spectral differences are prominent, in particular at low frequencies below 200 Hz and at high frequencies from 2 to 20 kHz.
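The frame-based search for stable passages described above can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the input is assumed to be already A-weighted (no weighting filter is applied here), levels are relative to full scale rather than calibrated sound pressure, and the tolerance `tol_db` that stands in for "almost stable" is a hypothetical parameter that the text does not specify.

```python
import math

def frame_levels_db(samples, fs, frame_s=1.0, overlap=0.5):
    """Equivalent level (dB re full scale) of each frame.

    Assumption: `samples` is already A-weighted; no filter is applied here.
    """
    n = int(frame_s * fs)             # samples per frame
    hop = int(n * (1.0 - overlap))    # 50% overlap -> hop of half a frame
    levels = []
    for start in range(0, len(samples) - n + 1, hop):
        frame = samples[start:start + n]
        mean_sq = sum(x * x for x in frame) / n
        levels.append(10.0 * math.log10(max(mean_sq, 1e-12)))
    return levels

def stable_segments(levels, hop_s=0.5, frame_s=1.0, tol_db=2.0, min_s=6.0):
    """Return (start_s, end_s) of runs whose frame levels stay within tol_db
    of each other and that last longer than min_s seconds.

    tol_db is a hypothetical threshold chosen for illustration.
    """
    segments = []
    i = 0
    while i < len(levels):
        lo = hi = levels[i]
        j = i
        # Extend the run while the level range stays within the tolerance.
        while j + 1 < len(levels):
            new_lo = min(lo, levels[j + 1])
            new_hi = max(hi, levels[j + 1])
            if new_hi - new_lo > tol_db:
                break
            lo, hi = new_lo, new_hi
            j += 1
        start_s = i * hop_s
        end_s = j * hop_s + frame_s
        if end_s - start_s > min_s:
            segments.append((start_s, end_s))
        i = j + 1
    return segments
```

Candidate segments returned by `stable_segments` would still need to be auditioned and checked by hand, as the text notes that stable segments were "looked at more carefully" before selection.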

C. Auralization technique and the anechoic listening room
The applied auralization technique was the same as in our previous work (Lokki et al., 2016; Pätynen and Lokki, 2016b). The halls were measured with the loudspeaker orchestra (Pätynen, 2011), which consists of 33 loudspeakers connected through 24 channels on the stage of a concert hall (see Fig. 1). The spatial impulse responses from each of the source channels to each receiver position were measured with an open six-microphone array. All six impulse responses are used for spatial analysis with the spatial decomposition method (SDM) (Tervo, 2018; Tervo et al., 2013), which exploits the time-difference-of-arrival between all microphone pairs to estimate the direction of incidence for each sample in an impulse response. In auralization, based on the spatial metadata, the impulse response captured with one omnidirectional microphone is distributed to the reproduction loudspeakers around the listener, and the excitation signal is convolved with the resulting impulse responses of all of the reproduction loudspeakers. In this study the listening setup consisted of 45 loudspeakers (Ones 8331A, Genelec, Finland) in a 3D setup, which means that one impulse response from one source channel is divided into 45 reproduction channels and then all of these are convolved with the corresponding anechoic instrument group. The whole process is repeated for all 24 source channels. The final result of the entire process is a 45-channel wav file (48 kHz sampling rate, 24 bits) for each receiver position in each hall.
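The distribution step can be illustrated with a small sketch. It assumes that a direction estimate (azimuth, elevation) is already available for every sample of the omnidirectional impulse response, and that each sample is simply assigned to the nearest reproduction loudspeaker; the nearest-loudspeaker rule, the function names, and the direct-form convolution are choices made here for illustration and are not confirmed by the text as the authors' implementation.

```python
import math

def unit_vector(azimuth_deg, elevation_deg):
    """Cartesian unit vector for a direction given in degrees."""
    az, el = math.radians(azimuth_deg), math.radians(elevation_deg)
    return (math.cos(el) * math.cos(az),
            math.cos(el) * math.sin(az),
            math.sin(el))

def distribute_impulse_response(pressure_ir, directions, loudspeakers):
    """Split one omnidirectional IR into one IR per loudspeaker.

    pressure_ir : list of samples from the omnidirectional microphone
    directions  : list of (azimuth, elevation) estimates, one per sample
    loudspeakers: list of (azimuth, elevation) loudspeaker directions
    Assumption: each sample goes entirely to the nearest loudspeaker.
    """
    speaker_vecs = [unit_vector(az, el) for az, el in loudspeakers]
    channels = [[0.0] * len(pressure_ir) for _ in loudspeakers]
    for n, (value, (az, el)) in enumerate(zip(pressure_ir, directions)):
        d = unit_vector(az, el)
        # The largest dot product corresponds to the smallest angle.
        best = max(range(len(speaker_vecs)),
                   key=lambda k: sum(a * b for a, b in zip(d, speaker_vecs[k])))
        channels[best][n] = value
    return channels

def convolve(signal, ir):
    """Direct-form convolution of an anechoic signal with one channel IR."""
    out = [0.0] * (len(signal) + len(ir) - 1)
    for i, s in enumerate(signal):
        if s == 0.0:
            continue
        for j, h in enumerate(ir):
            out[i + j] += s * h
    return out
```

With this rule the per-loudspeaker responses sum back to the original omnidirectional response, so the total pressure at the listening position is preserved while the energy is spread over the reproduction directions. For full-length responses an FFT-based convolution would be the practical choice.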
The listening room with the 45-channel setup is anechoic down to 50 Hz. The loudspeakers are positioned around the listener at a distance of 2 m. The numbers of loudspeakers at different elevations are 4 at ±60° elevation, 8 at ±30° elevation, 18 at ear level (more dense in frontal directions), 2 at ±15° elevation in front, and one directly above the listener. The loudspeakers are calibrated according to the manufacturer's recommendation using their proprietary software (GLM 3, Genelec, Finland). The whole calibration process includes compensation for small differences in the physical distances between the loudspeakers and the listening position as well as automatic adjustments of the levels and the frequency responses of the loudspeakers. In the anechoic listening room the measured background noise was L_Aeq = −2.1 dB, but when the loudspeakers were turned on the background noise level was L_Aeq = 11.6 dB.

Listening levels
As described above, the listening room was an anechoic room with an extremely low background noise level (L_Aeq = 11.6 dB). The main point of these experiments was to compare concert halls at different listening levels, and therefore the playback levels were double-checked with a B&K 2250 sound level meter (class I) at the position of a listener's head. The measured A-weighted sound pressure levels L_Aeq are listed in Table I for all music excerpts. The natural level differences between the halls were kept untouched, and therefore small variations between the halls exist. Table I also lists loudness in sones according to the binaural auditory model defined in ISO standard 532-2 (Moore and Glasberg's model). The model, implemented in Matlab 2020a, was developed originally for stationary signals. The applied signals are time varying, but the model still gives reasonable estimates in sones for comparison. It should be noted that the sone values are not necessarily correct in absolute terms, as we did not manage to calibrate the signals exactly for loudness computation, but the relative values between the halls at both seats are comparable.

D. Participants and listening test design
The participants of the listening tests were recruited from the personnel of the Acoustics Lab at Aalto University. None of them reported any hearing problems, and they could be considered experienced listeners because they had all participated in several listening tests before. In total, 20 listeners (3 females) completed the listening tests.
The listening tests were carried out using a two-alternative forced choice (2AFC) paradigm to answer two questions: "Which one is louder?" and "Which one is more enveloping?" Each sample pair included two halls in one listening position and one orchestral dynamic, i.e., no comparisons were made across different positions or dynamics. This resulted in 48 pairs per participant, as four halls form six pairs and they were all compared at four dynamic levels in two listening positions (6 × 4 × 2 = 48). The 48 pairs were evaluated in a fully randomised order; half of the participants evaluated loudness first and the other half evaluated envelopment first. The participants were free to switch quickly between the two music samples, with the playback position in the sample continuing without interruption. Otherwise, they could not modify the playback by any other means (e.g., looping or listening level). The participants entered their choices via a small hand-held tablet, and they were instructed to look forward (head rotations were allowed) when listening to the samples. Before the actual test, the participants practised answering and familiarized themselves with the samples with a few sample pairs.
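The pair count and the randomised presentation order can be reproduced with a short sketch. The hall, dynamics, and position labels follow the text; the function name `build_trials` and the use of a seeded generator are assumptions made for illustration.

```python
import itertools
import random

HALLS = ["BK", "BP", "HM", "MH"]
DYNAMICS = ["pp", "p", "f", "ff"]
POSITIONS = ["FRONT", "BACK"]

def build_trials(seed=None):
    """All hall pairs within each position and dynamic, in random order."""
    rng = random.Random(seed)
    trials = [
        (pos, dyn, hall_a, hall_b)
        for pos in POSITIONS
        for dyn in DYNAMICS
        # Four halls form six unordered pairs.
        for hall_a, hall_b in itertools.combinations(HALLS, 2)
    ]
    rng.shuffle(trials)
    return trials
```

Calling `build_trials()` once per attribute (loudness, envelopment) gives each participant 48 trials per question, with no pair crossing positions or dynamics.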

E. Data analysis methods
The first step in the data analysis was to compute the total number of times each hall had been chosen over the other halls. Then, the data was subjected to a binomial logistic regression analysis via a generalized linear model (GLM) using a binomial "logit" link-function. The analysis was done in R with the package "lme4" (Bates et al., 2015). The dependent (i.e., response) variable was the compound of the number of times a particular hall was chosen and not chosen (i.e., "success" and "no success") in the pairwise comparisons with the other halls. The independent fixed effect variables were Halls (HALL) and Dynamics (DYN) both with four factor levels and Seat (POS) with two factor levels.
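The first step, tallying how often each hall was chosen and not chosen, might look like the following sketch. The response format `(hall_a, hall_b, chosen)` and the function name are hypothetical; the resulting pairs of counts correspond to the "success" and "no success" compound that a binomial model takes as its response.

```python
from collections import Counter

def tally_choices(responses):
    """Count (chosen, not_chosen) per hall from pairwise responses.

    responses: iterable of (hall_a, hall_b, chosen) tuples, where
    `chosen` is whichever of the two halls the listener selected.
    """
    wins = Counter()
    appearances = Counter()
    for hall_a, hall_b, chosen in responses:
        if chosen not in (hall_a, hall_b):
            raise ValueError("chosen hall must be one of the pair")
        appearances[hall_a] += 1
        appearances[hall_b] += 1
        wins[chosen] += 1
    return {hall: (wins[hall], appearances[hall] - wins[hall])
            for hall in sorted(appearances)}
```

In practice the tally would be made separately for each listener, dynamic, and position so that those factors remain available to the regression model.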
Given that comparisons between the halls were performed in different conditions (DYN-POS) by the same subjects, the subjects were included in the regression model as a random effect. The analysis of deviance was used to investigate the statistical significance of the independent variables. This analysis is similar to the traditional analysis of variance, and the results indicate whether or not including a variable in the model yields a significant reduction in the deviance of the residuals when compared to a null model. "Type III" test statistics are reported, because they also take into account the influence of the other variables in the calculations.
In order to conduct a post hoc analysis of the pairwise differences between the halls, each of the separate DYN-POS datasets was fitted with the GLM, with subjects included as a random effect. The confidence intervals, illustrated in Fig. 4, were derived from the model estimates by using the "Wald" method. The presented confidence intervals are in line with the post hoc comparison between the halls using the least squares means with Bonferroni-adjusted p-values at a 95% confidence level. Note that all values and confidence intervals in the figures are converted from log-odds to probabilities.
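The conversion from a log-odds estimate with a Wald interval to the probability scale can be sketched as below. The interval is formed on the logit scale as estimate ± z × standard error and then mapped through the inverse logit; the function names and the default 95% normal quantile are assumptions for illustration, and the inputs would come from the fitted model.

```python
import math

def inv_logit(x):
    """Map log-odds to a probability."""
    return 1.0 / (1.0 + math.exp(-x))

def wald_interval_probability(estimate, std_error, z=1.959963984540054):
    """Wald interval computed on the log-odds scale, returned as probabilities.

    Returns (lower, point, upper). The default z is the two-sided 95%
    quantile of the standard normal distribution.
    """
    lower = estimate - z * std_error
    upper = estimate + z * std_error
    return inv_logit(lower), inv_logit(estimate), inv_logit(upper)
```

Because the inverse logit is nonlinear, the resulting interval is generally asymmetric around the point estimate on the probability scale, except when the estimate is exactly zero log-odds (probability 0.5).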

III. RESULTS

A. Loudness
The results of the listening tests (20 listeners) for loudness are plotted in Fig. 4(A). The analysis of deviance results for loudness are tabulated in Table II. The results show that the main effect of HALL is significant in all cases, as expected, but both the DYN and POS main effects are not significant (DYN p = 0.65, POS p = 0.47). This result was expected considering that the comparisons between halls were made for each dynamic and position. All the interactions (HALL:DYN p < 0.001; HALL:POS p < 0.001; and HALL:DYN:POS p = 0.008) are also significant, indicating that the number of times a particular hall was chosen over the other ones was influenced by the dynamics of the music as well as the position in the halls.
In order to analyse the effect of dynamics in more detail, and because the comparisons were not made between positions, the data for FRONT and BACK positions are analysed separately. These results are also included in Table II and naturally the interaction between dynamics and position is omitted because it is not sensible due to the design of the experiment. Analysing the FRONT and BACK positions separately shows that the interaction between hall and dynamics (HALL:DYN) is significant in both positions, but somewhat stronger in the FRONT than in the BACK, as can be seen in Fig. 4(A).
The results show that the order of the halls hardly changes for different dynamics. However, at the FRONT position in pianissimo hall MH is significantly the loudest while in other dynamics BK seems to be louder, although the difference with MH is barely significant only in fortissimo. The distance between halls is somewhat different for different dynamics and at the FRONT position, HM is not significantly quieter than BK in pp and MH in f. At the BACK position, MH and BK do not differ from each other, but the difference between HM and BP is significant, except in p.

B. Envelopment
The envelopment results are plotted in Fig. 4(B). The analysis of deviance results for envelopment are tabulated in Table III, and they follow very much the same lines as the results for loudness discussed above. The main effects of dynamics and position are not significant, as expected. The interaction between hall and dynamics is significant in the FRONT but not in the BACK position, indicating that dynamics had a stronger influence on the results in the FRONT than in the BACK. Figure 4(B) reveals, interestingly, that hall BK is significantly more enveloping than the other halls in fortissimo in the FRONT. With other dynamics, BK does not differ significantly from MH. Hall HM also behaves interestingly compared to BP. At the FRONT position it is significantly more enveloping in pp and f, but in ff the order of the halls is switched. At the BACK position BP is significantly more enveloping than HM, but not when the orchestra is playing in piano.

C. Verbal feedback from subjects
The subjects said that they were confident in their evaluation of both loudness and envelopment. Furthermore, they indicated that the differences in envelopment were easier to compare. Many subjects commented that some loudness judgements were challenging, as some samples had more frontal sound than surround sound, and judging the loudness between such cases was not trivial. Moreover, some subjects commented that in the loud samples they concentrated more on the brass instruments, whereas others said that they tried to listen to the entire orchestra.

IV. DISCUSSION
The overall result shows that the smaller rectangular concert halls (BK and MH, see Fig. 1) are perceived to be much louder and more enveloping than the larger halls with raked floors (BP and HM) at both positions. This result is unsurprising and is already predicted by the objective parameters (Fig. 2). However, interesting interactions within hall types were found, and therefore the further discussion concentrates mainly on the interaction of halls and dynamics in these pairs of halls. Different orchestral dynamics segregate the halls differently. The pianissimo signal contains only viola and woodwinds, and it seems to render relatively large differences within hall pairs. In contrast, the piano excerpt, which has only strings, does not make any difference within hall pairs at either position for either loudness or envelopment. The largest differences within hall pairs are obtained when the full orchestra plays fortissimo, except for the loudness at the FRONT position, where much larger differences occur in forte.

A. Loudness
The objective Strength (G) values (Fig. 2) suggest that at low and high frequencies, BK and MH would be much louder than the other halls. However, the FRONT of hall HM has practically the same G at mid frequencies as BK and MH.
The perceptual results showed that hall HM is not as loud as BK and MH, but in pp and f the difference is not so large. It is also interesting to compare the perceptual loudness results with the measured L_Aeq levels in the listening room and the computational loudness values (Table I). The L_Aeq values do not agree with the perceptual results at the FRONT position in all dynamics, but they predict the order of the halls quite well at the BACK position. The binaural loudness values according to ISO 532-2 predict the listening test results better, although the order of the halls is not correct in all dynamics.
When halls are looked at in pairs (BK-MH and BP-HM) some dynamic dependent results can be seen. The measured G values (Fig. 2) for BK and MH are almost identical at all octave bands. Only at the BACK position does BK have slightly more energy for low and high frequencies. However, the loudness results [Fig. 4(A)] reveal that MH is significantly louder in the FRONT for pp when only the woodwinds and the viola are playing. At the BACK position, the situation is more equal, although for f, MH is perceived as louder (with a significant difference) than BK while the G values are lower at all octave bands.
Between halls BP and HM the listening test results for loudness are clear. At the FRONT position, HM is significantly louder than BP for pp and f. At the BACK position, BP is perceived as significantly louder for all dynamics except p. This is contrary to the measured G values (Fig. 2). One possible explanation is that BP has weaker early sound but a stronger lateral sound field after 80 ms (indicated by the lower G but higher L_J), and therefore hall BP is perceived to render louder sound.

B. Envelopment
For envelopment, the objective L_J values suggest that smaller rectangular halls always have better envelopment than larger, more steeply raked halls, which is a well-known fact (Long, 2009). Therefore, rectangular room shapes have more envelopment, which is clearly seen also in the achieved results. The objective L_J values (Fig. 2) are practically the same between halls BK and MH at all octave bands. However, hall BK has significantly more envelopment for ff in the FRONT position. One possible explanation for these results is that hall BK has a larger volume and a greater ceiling height, resulting in a bigger time gap between the arrival of the early lateral reflections and the first wave fronts from the ceiling, as shown in Fig. 5 with spatiotemporal visualizations of cumulative energy (Pätynen et al., 2013). This is the most pronounced difference between the halls and may affect the sense of envelopment when the orchestra is playing loudly. Interestingly, the obtained result is

Perhaps the most intriguing envelopment result of this study is between BP and HM at the FRONT position. The L_J values suggest that HM should have greater envelopment, which is the case for pp and for f. But, for ff, hall BP is found to be significantly more enveloping than HM. The objective G values show that the high frequencies are strongly attenuated in HM, and this might have affected the sensation of envelopment with music that has a stronger high-frequency component (as in fortissimo playing). Another possible explanation for the subjective results can be seen in Fig. 6. First, in BP, strong early reflections behind the seat occur, and they might be audible only for fortissimo. Second, even though HM has more enveloping reverberation (rounder cumulative energy, i.e., a larger L_J), BP has a few distinct lateral reflections from small wall elements between "terrasses." Green and Kahle (2019) proposed that such reflections could become audible at higher listening levels, and as hall HM does not have any such distinct reflections, this could be one reason why the envelopment is lost (compared to BP) for ff. The envelopment result is also in line with our earlier study on hall HM (Pätynen and Lokki, 2015).

V. CONCLUSIONS
Concert hall acoustics was studied using listening tests that compared four concert halls at two listening positions according to the perceived loudness and envelopment. The pairwise comparison of the halls was performed in four different orchestral dynamics so that both the music and the listening level corresponded to the natural orchestral dynamics. In other words, the pianissimo passage of the music was listened to at less than L_Aeq = 60 dB, while the full orchestra fortissimo samples reached L_Aeq = 82 dB at the listening positions. The results show that the smaller rectangular halls render music both louder and more enveloping in all dynamics, but close to the orchestra, the difference to the other hall types is not so large. Interestingly, in a few cases, the change in the listening level changed the order of the halls, both for loudness and envelopment. Such an interaction cannot be predicted from objective room acoustical parameters that are computed from measured impulse responses. Overall, the results of this study suggest that perceptual evaluations of concert hall acoustics require stimuli in different orchestral dynamics to gather a complete picture of the differences between halls.