Adjoint-based optimization of sound reinforcement including non-uniform flow.

The determination of optimal geometric arrangements and electronic drives of loudspeaker arrays in sound reinforcement applications is an ill-posed inverse problem. This paper introduces an innovative method to determine complex driving functions, also considering complex environmental conditions. As an alternative to common frequency domain methods, the authors present an adjoint-based approach in the time domain: Acoustic sources are optimized in order to generate a given target sound field. Instead of the Helmholtz equation, the full non-linear Euler equations are considered. This enables an easier treatment of non-uniform flow and boundary conditions. As proof of concept, a circular and a linear monopole array are examined. For the latter, the environmental conditions include wind and thermal stratification. For all examples, the method is able to provide appropriate driving functions.


I. INTRODUCTION
Sound reinforcement aims at synthesizing specified sound fields for the entire audio bandwidth.The generation of the target sound fields can be controlled by the placement and the electronic drive of different loudspeakers.Gain and delay are the two parameters-sometimes called excitation coefficients or feeding coefficients-that have to be ascertained for electronic optimization of the loudspeaker array radiation.Both parameters are related to the amplitude and phase of a complex driving function.They are typically computed separately for each frequency (Betlehem and Withers, 2012;Coleman et al., 2014;Feistel et al., 2013;Straube et al., 2018;Terrell and Sandler, 2010;Thompson, 2009;van Beuningen and Start, 2000) using frequency domain methods.
This article presents a new method to find loudspeaker positions and their driving functions for application in sound reinforcement.The focus of this work is the determination of optimized driving functions-also in case of non-uniform flow.
For this purpose, the authors present an adjoint-based method in the time domain.It is based on the full non-linear Euler equations and the corresponding adjoint, which are solved by means of computational aeroacoustic (CAA) techniques.Since the Euler equations allow us to consider wind and thermal stratification, applications of the current work may include stadiums or open-air events (Fig. 1).Here, they are considered in a model form.
In fluid mechanics, the use of the adjoint has proven to be an effective approach for determining various model parameters (Giles and Pierce, 2000;Jameson, 1995).Utilizing corresponding methods for acoustic problems is convenient as the adjoint compressible Euler and Navier-Stokes equations (Lemke et al., 2014;Lemke and Sesterhenn, 2016) also capture acoustics, both for a stationary environment and complex base flow.
The paper is organized as follows: In Sec.II the adjointbased method is presented and tailored for the application in sound reinforcement.Section III introduces the concept of validation and methods.Section IV presents exemplary results to verify the applicability of the adjoint-based method for circular and linear source arrays-for the latter with a velocity and a temperature profile.Section V gives a summary.

II. ADJOINT APPROACH
In contrast to frequency domain approaches, which are common in sound field generation as well as sound reinforcement and which are based on an integral representation of the homogeneous wave equation for discrete source distributions (Feistel et al., 2009;Meyer, 1984;Meyer and Schwenke, 2003;van Beuningen and Start, 2000), the adjoint-based method makes use of a more general representation of the wave propagation in the time domain.

A. Adjoint equations
The adjoint equations can be defined in a continuous or discrete manner.For the sake of simplicity, they are introduced in discrete version (Giles and Pierce, 2000).A matrixvector notation is used.The vector space is the full solution in space and time.This section is based on Lemke (2015).
Adjoint equations arise by a so-called objective function J, which is defined by the product between a geometric weight g and the system state q: J ¼ g T q; g; q 2 R n : (1) a) Electronic mail: lewin.stein@tu-berlin.de The system state q corresponds to the solution of the governing system with A as governing operator and s as source terms on the right-hand side.In terms of optimization of J by means of s, the equation has to be solved for every different s.In order to reduce the computational effort, the adjoint equation can be used, with the adjoint variable q*.By an expression is found, which enables the computation of J without the need to solve the system for every discrete s.
After solving the adjoint equation, the objective can be determined by a simple and computationally cheap scalar product.Thus, the adjoint approach enables the efficient computation of gradients for J with respect to s.

B. Objective and adjoint-based gradient
According to the intended application-the sound field generation in the time domain-the objective function J is defined in space and time with dX ¼ dx i dt: The variable q ¼ ½.; u j ; p comprises all quantities which are necessary for a state description of the governing system, defined by the Euler equations.The variable q target denotes the desired system state, .the density, u j the velocity in the direction x j and p the pressure, which is used solely for evaluation of the objective The additional weight rðx i ; tÞ defines where and when the objective is evaluated.In practice, the objective function is supplemented by additional constraints, e.g., power consumption, and a regularization term (Bewley, 2001), which are omitted here for the sake of clarity.Optimal sound reinforcement is realized if J reaches a minimum.
The minimum is to be achieved under the constraint that the Euler equations E are satisfied.Similar to Eq. ( 2), the following system is introduced: abbreviating the Euler equations with c as heat capacity ratio.The summation convention applies.See Lemke et al. (2014) for details on the formulation.
The terms s ¼ ½s .; s .uj ; s p on the right side of the equations characterize sources for mass, momentum, and energy.They enable control of the system state, respectively, the solution of the equations.The general goal is to obtain a solution of the Euler equations, which matches q target , respectively, p target in an optimal sense, by adapting s.The sources s can be interpreted as sound sources or loudspeakers.Monopole sources can be described solely by sources s p in the pressure equation.Thus, an optimization of s corresponds to an optimization of the loudspeakers' output signals.
In order to use the adjoint approach for optimizing s, it is necessary to linearize the objective function Eq. ( 6) and the governing system Eq.( 8).This results in and with the now defined weight g ¼ ðp À p target Þr.Combining the linearized system and the objective in a Lagrangian manner by introducing the (adjoint) multiplier q*, leads to For the sake of simplicity, the integrals are not shown.The adjoint equations result in with q Ã ¼ ½.Ã ; u Ã j ; p Ã , similar to Eq. (3).See Lemke (2015, p. 19) for a detailed derivation of the adjoint Euler equations, which are given by with Ã ¼ ðA T Þ À1 and Ci as resorting abbreviated as dq j Ci jb @ x i c b .The matrices A, B i , and C i result from the linearization of the Euler Eq. ( 8) and are given in the Appendix.
The change of the objective function reads dJ ¼ q ÃT ds; (15) similar to Eq. (4).Thus, the adjoint solution can be interpreted as a gradient of J with respect to the source terms s, Initial and boundary conditions of the adjoint Euler equations as well as the derivation of the adjoint compressible Navier-Stokes equations are discussed in Lemke (2015).

C. Iterative determination of the driving function
The adjoint-based gradient is used in an iterative manner.First, the Euler equations are solved forward in time with s 0 ¼ 0. Subsequently, the adjoint equations are calculated backward in time deploying the direct solution and the weight g.Based on the adjoint solution, the gradient r s J is determined and used to update the source distribution s n : with a denoting an appropriate step size and n the iteration number.The gradient is calculated for the whole computing region and the entire simulation time, but only evaluated at desired loudspeaker positions.While convergence is not accomplished in the objective function, the process is repeated with the current s n .Please note that only s p is optimized in this work.Thus, only monopole sources are considered.The resulting optimal s is deconvolved with the considered primitive input signal and the loudspeaker properties resulting in an optimal driving for every considered loudspeaker.See Fig. 3 for an overview of the process.
The proposed technique optimizes towards local extrema.Detecting global optima is not ensured.The computational costs of the adjoint-based approach are independent of the number of loudspeakers and their arrangement.They only depend on the size of the computational domain, which includes the entire space from the loudspeakers to the most distant listening position, and the considered frequency range, as higher frequencies demand a higher spatial and temporal resolution.Using high-accuracy discretization schemes, e.g., compact schemes, a resolution of four to 12 points per wavelength will usually be sufficient, depending on the required accuracy.
Figure 2 shows the convergence behavior of the objective functions of all optimization runs conducted in the following.Throughout this work, all CAA solutions are the result of 15 iteration steps.Depending on the line array setup and the type of background flow the objective is reduced between one and two orders of magnitude.
Utilizing a line search strategy with three forward and one adjoint backward calculation per iteration step (quadratic interpolation), the total CPU time after 15 iterations adds up to roughly 200 CPU-hours (details of the employed numerical resolution follow in Sec.III B).On a local machine with 16 cores (AMD EPYC 7351, 32 GB RAM each), these hours correspond to a total runtime of approximately half a day.

III. VALIDATION SETUP A. Introduction of the validation concept
As validation, the adjoint-based method (Sec.II) is applied to different test cases (Secs.IV A, IV B, IV C).To obtain a reference solution for all test cases, i.e., a sound field generated by prescribed speakers with fixed positions for given input signals, we use a complex directivity point source (CDPS) model.In the frequency domain, the CDPS model (Feistel et al., 2009;Meyer, 1984;Meyer and Schwenke, 2003;van Beuningen and Start, 2000) calculates the analytic sound field in free space without any background flow, by superimposing the impact of all speakers (for its setup, see Sec.III C).
The basic validation procedure is as follows: First, we specify a test case consisting of circularly or linearly aligned speakers and an input signal.Throughout this paper, all speakers are assumed as monopoles.For the exact speaker locations please refer to Figs. 6 or 12.As favorable input signal-to analyze an entire frequency range-we select a sine sweep with an exponential frequency increase over time (Fig. 4) in all cases.Below 1.3 kHz and above 2.7 kHz the sweep has a fade in and fade out period.At the end of the sweep there is 1 ms silence.The exponential sine sweep is chosen as a popular signal for loudspeaker measurements (Mueller and Massarani, 2001), though other input signals could be also selected.Here, always the same sweep signal is prescribed for all speakers.However, to impose a random variation between the speakers, the amplitudes and the phases among all utilized speakers are varied.
Second, we determine the entire four-dimensional (three spatial and one temporal dimension) reference sound field p 0 ref for the specified speakers and input signals with the CDPS model (Sec.III C).To convert the sound fields from frequency to temporal domain, we perform a discrete Fourier transform.Snapshots of the reference fields are visualized by Figs. 7 and 12. Third, we utilize the adjoint CAA solver (for its numerical setup see Sec.III B) to independently recalculate the input signals for each speaker.As input for the adjoint CAA solver the speaker locations and the four-dimensional reference field p 0 ref , which is used as the optimization target (p target ¼ p 0 ref þ p 1 ) within the spatial mask r [Eq.( 6)], are provided.The ambient pressure is denoted by p 1 .
Fourth, the quality of the numerically identified speaker signals is evaluated by contrasting them against the original input signals of the CDPS model (reference).As further validation, the numerically reproduced (optimized) sound field p 0 opt ¼ p opt À p 1 is compared with the reference p 0 ref at representative microphone locations.
Taken together, we check to what extent the adjoint approach is able to identify the optimal excitation signals for a given sound field with a given speaker alignment.
To keep the discussion clear of special features (e.g., initial transient) of adjoint-based optimization, the startup and the final decay process are excluded from the following discussion.We thus restrict the validation to a frequency interval of the sweep between 1.3 and 2.7 kHz (see vertical black lines of Figs. 5 or 8).If needed, one can modify the FIG. 3. Iterative process for determining optimal driving functions: Starting with an initial guess for the loudspeakers s p ¼ 0, the governing equations are solved forward in time.The result is compared to the desired reference state, while the actual difference drives the adjoint equations, which are solved backward in time.Based on the adjoint solution, the gradient for r sp J is used to update the actual forcing.Once convergence is reached, deconvolution of optimal s with the predefined input signal and the speaker characteristics results in optimal drives.Computationally intensive steps including CAA methods are marked in gray.

B. Setup of the adjoint-based CAA solver in time domain
The governing equations Eq. ( 8) are discretized by finite differences.A sixth-order accurate compact symmetric derivative stencil is used (Lele, 1992).The sound reinforcement area under consideration is 0 < x i < 1:6 m for all directions and is discretized by means of a uniform grid resolved by 197 Â 197 Â 99 grid positions.This relatively small spatial domain does not reflect a real outdoor sound reinforcement situation.However, it enables us to complete an optimization run with 15 iteration steps within the above mentioned computational time.At the same time all physical relevant phenomena such as an inhomogeneous sound pressure level (SPL) distribution, a wide frequency spectrum, and a stratified background flow can be fully studied.
Large scale computations can be realized using a parallel cluster as the problem is fully parallelizable.The problem scales with the number of discretization positions used to resolve space and time.The computational complexity remains the same.For larger setups stability problems are not expected.Thus, for a large-scale venue slice with 70 Â 15 m 2 , the required amount of CPUh will increase to about 45 000 CPUh per loop, which can be handled on a parallel cluster in reasonable time and with reasonable costs.
The physical time span t 0 ¼ 0 ms to t end ¼ 26 ms is separated into 1250 time steps, corresponding to a sampling rate of 48 kHz and a maximal Courant-Friedrichs-Lewy condition (Wendt, 2009) equal to 0.94.An explicit fourth-order Runge-Kutta scheme (Wendt, 2009) is employed for timewise integration.All boundaries are treated as non-reflective using so-called "characteristic boundary conditions" (Poinsot and Lele, 1992;Stein, 2019).In order to suppress reflections, a quadratic sponge layer is added to all boundaries, acting on a side margin with a width of 0.3 m (Mani, 2012).To ensure stability, an implicit filter of sixth order is used at each time step (Gaitonde and Visbal, 2000).The adjoint equations are solved using the same discretization.The corresponding boundary conditions are discussed in Lemke (2015).
To realize an ambient speed of sound c 1 ¼ 343 m/s of an ideal gas with the adiabatic index c ¼ 1.4, the ambient values for the density and the pressure are set to . 1 ¼ 1.21 kg/m 3 and p 1 ¼ 101 325 Pa, respectively, with R s ¼ 287 the specific gas constant of air.Please note that for the case with nonhomogeneous flow field, the speed of sound is not constant, see Sec.IV C.

C. Setup of the CDPS model in frequency domain
To determine the reference sound field, the CDPS model is taken to calculate the transfer functions from every source to each considered grid position of the sound reinforcement area in the frequency domain.These transfer functions result from the superposition of the impact of each source, e.g., monopole sources as in this article, or sources with modeled as well as measured loudspeaker directivity data.The transfer functions comprise the reference driving functions-the electronic filters which individually control the different inputs for each source-and the acoustic transfer functions-the spatial source-receiver sound wave propagation.Subsequently, the transfer functions are transformed to the time domain using the discrete inverse Fourier transform.The sound field is calculated by convolving the input time signal and the transfer functions to obtain the sound pressure for each considered grid position and for each considered time step.
Air is assumed to be homogeneous and dissipation-less with a constant speed of sound c 1 ¼ 343 m/s.The spatial locations, provided by the finite differences nodes of the CAA solver, for which the sound field is to be calculated, have to be located in the acoustical far-field of every single source.

D. Inhomogeneous velocity and temperature profile
We specify the wind and temperature profiles in this section, which are utilized in Sec.IV C as test case for nonuniform background flow.
An atmospheric boundary layer profile with constant shear stress can be approximated by the log-law (Richards and Hoxey, 1993): with u þ the friction velocity, y r a roughness parameter, and j ¼ 0.4 the Von Karman constant.The roughness parameter in case of a landscape with bushes is roughly y r ¼ 0.1 (Leclerc and Foken, 2014).A maximum wind speed of uðy max Þ ¼ À15 m/s at the edge y max ¼ 1.6 m of the computational domain is selected, which also defines the friction velocity u þ using Eq. ( 18).In case of a compressible turbulent boundary layer, the temperature profile and the density profile is related to the mean velocity profile u(y) [Eq.( 18)] by the Crocco-Busemann relation (White, 2006) FIG. 5.The instantaneous root mean square error [RMSE,Eq. (20)] between the pressure of the CAA solver and the CDPS model over the frequency of the exponential sine sweep (see case in Sec.IV A).The two vertical black lines mark the frequency range under consideration of the sweep between 1.3 and 2.7 kHz, excluding the switch on and off procedure of the adjointbased solver.
being T g ¼ T 1 % 292:80 K the ground temperature.The theoretically corresponding adiabatic ground temperature T ag is given by the assumption of linear heat flux at the ground, i.e., the condition @ y Tj y¼0 ¼ ðT max À T g Þ=y max .The selected reference temperature T max ¼ Tðy max Þ ¼ T g À 20 K at the top edge defines T 0 .Evaluating Eqs. ( 18) and ( 19) provides the velocity and temperature profiles depicted in Fig. 6.These are utilized as background mean profiles in Sec.IV C.

IV. VALIDATION EXAMPLES
Sections IV A and IV B present a circular and a linear array in the case of no flow.As a subsequent step, the linear array case is perturbed by both a representative inhomogeneous velocity and temperature background field (Sec.IV C).

A. Circular monopole array
As a first validation example, we present the sound field of a circular array, consisting of six speakers, orientated in the x 1 x 2 -plane around the center (0.8, 0.8, 0.8) m of the domain, and with a diameter of 0.8 m.A pressure snapshot of the CAA time stepper is shown in Fig. 7.The highest SPL occurs at three o'clock for the first speaker, decreasing counter-clockwise from the second to the sixth speaker (an arbitrary choice done here).
After 15 iterations of the adjoint-based optimization, the objective function is decreased by about two orders of magnitude (Fig. 2).The normalized instantaneous root mean square error (RMSE) between the CAA solution p 0 opt and the CDPS reference p 0 ref of Fig. 5, falls below 7% within the examined frequency interval.Therein, P is the sum over all spatial locations and r is the weight function [Eq.( 6)], which excludes in particular the sponge (boundary) region, starting at a distance of 0.3 m to each edge (Fig. 7).There, the CAA solution is not valid (Fig. 7).However, the small RMSE of Fig. 5 implicitly validates the quality of non-reflective boundary conditions: any spurious reflections would cross the r-area, thus, increasing the RMSE.
Beside the comparison in the time domain, the amplitudes and phases are contrasted in the frequency domain below by analyzing the driving signals of the field sources of the CDPS model and the CAA solver.
Naturally, the amplitude of the pressure source s p of the right-hand-side of the CAA equation [Eq.( 8)] is not directly comparable to the resulting pressure field.However, both are physically related by a phase shift of p/2 and an amplitude conversion factor, which is discussed in the Appendix.In contrast, the driving signal of the CDPS model is directly proportional (phase and amplitude) to the acoustic pressure field.In order to compare the CDPS and the CAA driving signals in a physically equivalent manner, we will always apply the phase and amplitude conversion just stated throughout the paper.
Figures 8 and 9 depict the input gains and the phases provided by the CAA solver.Due to the used exponential sweep (cf.Fig. 5), higher frequencies have smaller time shares of the overall signal reflected by a decaying gain with increasing frequency in Fig. 8.The phases (Fig. 9) feature periodic 2p shifts as expected.
Figures 10 and 11 show the gain and the phase differences between the CAA solver and the CDPS model within the frequency interval from 1.3 to 2.7 kHz.The maximum gain deviation error at the interval edges is 60.2 dB and less   6)] at values of 0.01, 0.5, and 0.99 are denoted by white, dashed circles (thin lines).The sponge region is indicated by a white, dashed box (thick line) at a distance of 0.3 m to the edge of the computational domain.
than 60.1 dB between 1.6 and 2.7 kHz for all six speakers (Fig. 10).Hence, also the relative pressure levels between all six speakers are correctly captured.The phase deviation between the CAA and CDPS solution corresponds to a phase shift of less than half a percent (phase shift in rad/2p) within the whole frequency interval (Fig. 11).In both figures the adjoint-based CAA solver achieves best optimization results for the loudest speaker, while the largest deviations are apparent for the quietest speaker, indicating varying sensitivities on the objective.
Thus, we can summarize that the adjoint-based CAA solver is able to provide driving functions for the circular monopole array, which are in magnitude and phase almost identical to the reference.

B. Linear monopole array
While horizontally arranged loudspeaker arrays are typically used for sound reproduction, vertical arrays are mostly used for sound reinforcement.Thus, the second validation example is a linear array with five vertically aligned monopole speakers.Again the same exponential sweep test signal is used.The equidistant speaker alignment is located on the left side (all located at x 1 ¼ 0.4 m), see Fig. 12.The loudest speaker is located at the bottom (x 2 ¼ 0.4 m), while the quietest speaker is located at the top position (x 2 ¼ 1.2 m).
For the linear array, we compare not only the source signals of the CAA solution and the reference but also the generated sound pressure at three selected positions in the target area ("mic 1"-"mic 3").This extra comparison has two reasons: First, microphone signals are another measure of the quality of the obtained sound field.Second, the CDPS model cannot deal with non-uniform background flow, as present in Sec.IV C. Consequently, there are no reference signals available to compare with.Thus, after the following discussion of the gains and the phases of the source signals, the amplitudes and phases recorded by the microphones are discussed, too.
Qualitatively, both the source gain (Fig. 13) and the source phase (Fig. 14) provided by the CAA solver for the linear array are similar to those of the circular array (cf.Figs. 8 and 9).This is expected since the same exponential sweep is reused (Fig. 4).The main difference between the linear and the circular array (Sec.IV A) is the gain offset between the speakers.The phase differs outside the investigated frequency interval only [switch on process, discrete Fourier transform (DFT) windowing].
The deviation between the CAA and the CDPS solution are comparable to those in Sec.IV A. The maximal gain deviation in Fig. 15 is below 0.1 dB.The maximal phase deviation in Fig. 16 is less than half a percent (phase difference in rad/ 2p).Again the loudest speakers are the most accurate.
Figures 17 and 18 display the SPL and the phase differences, respectively, at three representative microphone locations marked by the magenta crosses in Fig. 12, located    [1.1, 1.1, 0.8] m (mic 1), [0.8, 0.8, 0.8] m (mic 2) and [0.73, 0.63, 0.8] m (mic 3).In contrast to the speaker signals, the deviation of the microphone SPLs is increased by a factor of 10 up to 1 dB at the frequency interval edges and the deviation of the microphone phase is increased by a factor of 8 up to 4%.These deviations are also representative of other microphone locations.An explanation of the more accurate speaker signal in comparison to the microphone signal is as follows: The adjoint-based speaker signal results from an comparatively more robust integral measure J [Eq. ( 6)] of all spatial locations of r.In contrast, the pressure signal at a microphone represents a single spatial point only.The discrete Fourier transform windowing artifacts due to the short available signal length of 26 ms are another explanation, which especially affects the beginning and end of the sweep, i.e., very low and very high frequencies.
Having confirmed that also a linear array setup can be optimized, the geometrical flexibility of the adjoint approach is demonstrated.Beside the signal of the speakers itself also the resulting pressure fields at different microphone locations match the CDPS reference.

C. Linear monopole array with velocity and temperature profile
In the following the linear loudspeaker setup of Sec.IV B is extended by an inhomogeneous velocity and temperature profile, cf.Sec.III D. Inspired by a sunset festival situation in a windy desert (cf.Fig. 1) we assume a hot ground with colder air blowing above.Subsequently, the impact of these outdoor conditions on sound propagation is discussed.The end of this section deals with the optimization of the linear array in order to reinforce the original target field by compensating the impact of the velocity and the temperature profile.
Using the identical source signals as in Sec.IV B, and the background flow profiles described in Sec.III D significantly alter the sound propagation.At the ground x 2 ¼ 0, there is no flow and the speed of sound is equal to Sec.IV B. However, with increasing wall distance x 2 > 0, both, the incoming wind and the decreasing sound speed delay the sound propagation in the positive x 1 direction further and further.In the time domain the resulting delayed pressure signal at the mic 1 location is depicted as a dashed-dotted red line in Fig. 19, while the original reference signal is colored solid black.The time delay s i between the cases with and without flow of the arrival time at a "mic i" can be estimated by In addition to the time domain representation, the time shift is visible as a spectral phase shift in Fig. 20.A time shift s corresponds to a spectral phase shift of D/=2p ¼ f s.Hence, for the first approximation, we find a linear dependency on the frequency at the microphone locations, originating from the time delays f s 1 ; f s 2 ; and f s 3 as a black, blue, and green solid line, respectively.For example, at 2 kHz the time shift s 2 leads to a phase shift of 13.8%, fully consistent with the CAA value (blue solid line).As expected, the largest phase shift of around 35% occurs for the highest frequency of mic 1.For setups with more speakers and a larger audience region stronger deviations from the desired sound field are expected.Strictly speaking, the simplified explanation of the introduced time delay is only considering the effects of wave propagation in x 1 -direction assuming a single speaker.More generally, superposition effects of all five speakers including fully three-dimensional wave propagation within a boundary layer mean flow leads to more complex phenomena and possible local outliers of the predominantly linear phasing.One example of this non-linear phasing occurs at mic 3 (Fig. 20).
Beside the phase shift, the amplitude error at the examined mic positions is increased between plus and minus 3 dB as visible in Figs.19 and 21.An explanation might be a constructive or destructive interference between the waves of the five speakers, which also explains the spatial dependency of this effect.Because of the locally varying wind and temperature, the acoustic path lengths between a microphone and each speaker differ.Among the three microphones presented here, mic 3 with non-linear phasing is the most sensitive to this interference effect and displays the largest amplitude change in Fig. 21.
After discussing the effects of a non-uniform mean flow on sound propagation, now its optimization is addressed.Using the adjoint-based approach, the previous no-flow speaker signals s are modified in order to reinforce the original target field thereby compensating the effects of the nonuniform background.The resulting optimized sound fields are measured at the same mic positions as before.
In the time domain (Fig. 19) the optimized dashed green curve is clearly shifted towards the black reference curve.In the frequency domain (dashed lines of Fig. 20) the relative phase error (dashed lines) is at least halved.The SPL differences (dashed lines of Fig. 21) display only small or no (frequency dependent) improvement compared to the not optimized curves (solid lines).Meanwhile a SPL decrease over all frequencies of up to 2 dB can be seen, with the only exception of mic 3 above 2.5 kHz.In comparison, the phases are proportionally further restored by the optimization algorithm than the SPL.This is probably caused by the fact that a phase shift of p in Eq. ( 6) has a greater impact on the objective in contrast to a perturbation of the pressure magnitude.
It is worth noting that the deviation between the optimized and the reference solution is only a measure to which extent the adjoint solver is able to reproduce a given target field with the speakers available under the present conditions (wind, temperature effects).It is not a quality criterion of the sound reinforcement system itself.However, there are two possibilities, to include quality criteria in the optimization process.First, most easily, by predefining the target field according to the criteria.Or second, by extending the objective function Eq. ( 6) to meet additional criteria-besides the sound field reproduction alone.Potential criteria among others could be a homogeneous SPL coverage and avoidance of undesired radiation, or a smooth spectral distribution.
In the present paper, the driving signals were optimized for predefined fixed speaker locations.However, in general, the adjoint approach is also capable to identify the optimal speaker locations, possibly suggesting a more efficient setup than the original alignment to reinforce the same reference field, see Lemke et al. (2017) for a preliminary test to identify source locations in two dimensions.

V. CONCLUSION
An adjoint-based framework for determination of optimal driving functions of monopole speakers for sound reinforcement applications was presented.The time-domain approach allows the consideration of wind and other environmental conditions like thermal stratification.It has been shown that the approach is able to determine optimal driving functions, i.e., gain and delay, for different array configurations.
In practice, the obtained driving functions, including environmental conditions, can be pre-computed, e.g., for different wind intensities.Based on on-site measurements the sound field generation setup can then be switched between the different functions.
In a next step, the adjoint-based approach will be analyzed in the context of complex speaker directivities.For this purpose, monopole synthesis models are particularly promising because the numerical effort of the adjoint-based approach is independent of the number of sources.
This section discusses, how to relate s p of the CAA solver with the CDPS pressure signal of the monopole.
If the Euler equations [Eq.( 8)] of the CAA solver are linearized around a homogeneous background flow, the wave equation @ 2 t p 0 À c 2 0 Dp 0 ¼ sðy; tÞ (A1) follows, with the right-hand-side force term sðy; tÞ ¼ c À 1 ð Þ@ t s p ðy; tÞ: (A2) The general solution of this inhomogeneous wave equation in free space is known as (A5) being sðtÞ ¼ t À jx À yj=c 0 the delayed time between source and observer position.Hence, the time derivative of the e ixðtÞt term of the test signal causes a constant phase shift of p (imaginary i) and a time, i.e., frequency dependent amplitude relationship between the CAA source s p and the resulting pressure signal p 0 ðx; tÞ.Alternatively, if both the CDPS signal s CDPS and the CAA forcing s p are available, the frequency-dependent amplitude relation is given by the relation jF ½s CDPS ðxÞj= jF ½s p ðxÞj, being F the discrete Fourier transform.This amplitude relation is universal for all speakers.In the present paper, the average of all speakers is taken into account.Furthermore, one can show, that for low Mach numbers M < 0.1, the conversion curve is independent of the background flow.Hence, as soon as the amplitude conversion curve is calculated for one signal and Gaussian speaker shape without flow, it can also be applied in cases including wind profiles or thermal stratification.
FIG. 1. (Color online) "Mayan Warrior" sound system at the "Burning Man" festival in Nevada's Black Rock desert (with friendly permission of the Mayan Warrior Collective, http://mayanwarrior.mx).Open-air conditions including wind and thermal stratification, such as a hot ground and a cool breeze above considered in Sec.III D, are examples, where sound optimization with non-uniform flow matters.
FIG. 4. Exponential sine sweep with a monotonically increasing frequency in the time domain-the loudspeaker input signal used throughout this work.All other speakers use the same signal sequence varying amplitude and phase.
FIG. 6. (Color online) Exemplary velocity and temperature profiles in a windy desert at sunset [result of Eqs.(18), (19)].The height x 2 of the three microphone positions-examined in Sec.IV C-is marked with magenta crosses.

FIG. 7
FIG. 7. (Color online) CAA solution of the sound field radiated by six speakers in a circular arrangement.A snapshot of pressure fluctuations of the central x 1 x 2 -plane (x 3 ¼ 0.8 m, where all speakers are located) is shown.The locations of the six monopole sources of pressure are marked by black crosses.Isolines of the objective weight function r [Eq.(6)] at values of 0.01, 0.5, and 0.99 are denoted by white, dashed circles (thin lines).The sponge region is indicated by a white, dashed box (thick line) at a distance of 0.3 m to the edge of the computational domain.
FIG. 9. CAA solution of the input phase of the loudest speaker of the circular array.All other speakers show similar behavior.The two vertical black lines define the frequency interval of the sweep between 1.3 and 2.7 kHz.

FIG. 10
FIG. 10. (Color online) Difference between the CAA and the CDPS narrowband input gains of the speakers for the circular array.

FIG. 8
FIG. 8. (Color online) CAA solution of the input gains for all speakers of the circular array.The SPL is normalized at the loudest frequency of the first speaker.
FIG. 13. (Color CAA solution of the input gains for all speakers of the linear array.The two vertical black lines define the sweep frequency interval between 1.3 and 2.7 kHz.

FIG. 19
FIG.19.(Color Comparison of the pressure amplitudes in the time domain at "mic 1" in the case with and without a flow (and temperature) profile, and each case with and without enabled optimization.

FIG. 17
FIG. 17. (Color online) Difference between the SPLs of the CAA and the CDPS solver at three microphone locations described by the legend and illustrated by Fig. 11.
ÀðyÀy 0 Þ 2 r À2 e ixðtÞt ; (A4) instead of monopole source, being x(t) the time-dependent frequency function of the test signal.Strictly speaking, the CAA speakers should be called Gaussian.Throughout the present work, a point-like Gauss in relation to the shortest acoustic wavelength (3 kHz) is used: r < k.Hence the CAA speakers are approximate monopoles.Combining the latter equations and executing the time derivative provides p 0 ðx; tÞ ¼ i