Cross-linguistic vowel variation in trilingual speakers of Saterland Frisian, Low German, and High German

The present study compares the acoustic realization of Saterland Frisian, Low German, and High German vowels by trilingual speakers in the Saterland. The Saterland is a rural municipality in northwestern Germany. It offers the unique opportunity to study trilingualism with languages that differ both by their vowel inventories and by external factors, such as their social status and the autonomy of their speech communities. The objective of the study was to examine whether the trilingual speakers differ in their acoustic realizations of vowel categories shared by the three languages and whether those differences can be interpreted as effects of either the differences in the vowel systems or of external factors. Monophthongs produced in a /hVt/ frame revealed that High German vowels show the most divergent realizations in terms of vowel duration and formant frequencies, whereas Saterland Frisian and Low German vowels show small differences. These findings suggest that vowels of different languages are likely to share the same phonological space when the speech communities largely overlap, as is the case with Saterland Frisian and Low German, but may resist convergence if at least one language is shared with a larger, monolingual speech community, as is the case with High German.


I. INTRODUCTION
Over the past few decades, the acoustic study of vowel realization has developed into one of the most fruitful areas for the study of multilingualism and language contact (Guion, 2003; Bullock and Gerfen, 2004). While the main focus of this field of research is on bilingualism, the subject of this study is the lesser-known phenomenon of trilingualism. Trilingualism is particularly interesting because it offers the possibility of examining the interaction of an L1 with an L2 and L3 which differ not only by language-internal factors such as the size of their vowel inventories, but also by language-external factors such as age of acquisition or social status.
The participants of the present study are drawn from the rural municipality of the Saterland in northwestern Germany. They are trilingual with Saterland Frisian, Saterland Low German, and Saterland High German. Saterland Frisian is the last living variety of East Frisian.
Saterland Low German is the local variety of Low German, and Saterland High German is the local variety of Northern Standard German spoken throughout northern Germany. 1 The three languages of the Saterland differ in the size of their vowel inventories. Whereas Saterland High German has 15 phonemic monophthongs in stressed syllables and 3 phonemic diphthongs, Saterland Low German has 17 monophthongs and 7 diphthongs, and Saterland Frisian has up to 20 monophthongs and 7 diphthongs (cf. Bussmann, 2004; Fort, 2015; Kramer, 1982; Peters, 2017; Schoormann et al., 2017a). Previous research suggests that vowel systems of different sizes may give rise to a reorganization of the vowel space in multilinguals resulting in different locations of shared vowel categories in the F1-F2 space (e.g., Lindblom, 1986; Guion, 2003). The three languages also differ from each other with regard to extra-linguistic factors. Whereas the use of Saterland Frisian and Saterland Low German is largely restricted to communication among local people, Saterland High German is a variety of the national standard language. 2 Accordingly, Saterland speakers of High German do not form an autonomous speech community.
The objective of the present study is to examine whether the three languages differ in the realization of shared vowel categories, and whether such differences can be explained better by language-internal or language-external factors.

A. Background
There is growing evidence that multilingual speakers develop vowel systems that are autonomous but interact with each other such that acoustic realizations of vowel categories in one or more of the languages may differ from those of monolingual speakers. Several language-internal and language-external factors have been identified that determine the direction and extent of this interaction.

Language-internal factors
Among the internal factors are the size of the vowel inventories and the auditory and acoustic similarity of vowel categories in the languages of multilingual speakers. Languages with large vowel inventories face the task of preventing perceptual confusion of distinct vowel categories in a crowded vowel space. According to the Theory of Adaptive Dispersion, vowel inventories are structured so as to maximize perceptual contrast (Liljencrantz and Lindblom, 1972) or to guarantee at least sufficient contrast between the vowel categories (Lindblom, 1986). This may be achieved by increasing the overall size of the vowel space and distributing the vowels evenly in the F1-F2 space. Jongman et al. (1989) found that the vowel space in monolingual speakers of German with 15 monophthongs increased when compared to monolingual speakers of Greek with 5 monophthongs. Similarly, Bradlow (1995) found a larger vowel space for English (11 monophthongs) than for Spanish and Greek (both 5 monophthongs). Al-Tamimi and Ferragne (2005) found a larger vowel space in French (11 vowels) than in Moroccan (5 vowels) and Jordanian Arabic (8 vowels). Livijn (2000), however, found an effect of the size of vowel inventories only in languages with 11 or more monophthongs (see also Becker-Kristal, 2010, Chap. 4).
Another source of variation in vowel realization is a different base-of-articulation in monolinguals who provide all or part of the input to the multilingual child. Bradlow (1995) found that vowels of English monolinguals are articulated with a more fronted tongue position than those of Spanish monolinguals. Similarly, Guion (2003) found an upward shift of Quichua vowel categories relative to Spanish vowels in simultaneous Quichua-Spanish bilinguals.
In bilingual speakers, the phonetic realization of vowels in an L2 depends not only on the phonetic realizations of their L2 input but also on the number and quality of the L1 vowels. According to Flege (1995, 2007), the L1 and L2 phonetic subsystems of bilinguals interact through two distinct mechanisms: phonetic category assimilation and phonetic category dissimilation. Acoustically or perceptually similar sounds in the L1 and L2 are more likely to lead to category assimilation in bilinguals, whereas more distant sounds are more likely to lead to the formation of distinct categories in the L1 and L2 of bilinguals (cf. Bohn and Flege, 1997). According to Flege's speech learning model, phonetic category dissimilation occurs because bilinguals strive to maintain phonetic contrasts between all of the vowels of their combined L1-L2 phonological space (Flege, 1995; Flege et al., 2003; for a similar view of the extended Perceptual Assimilation Model see Best and Tyler, 2007).

Language-external factors
Among the external factors that determine the extent and direction of L1-L2 interaction are the age of acquisition, the amount of L2 experience, language dominance, the communication range of the two languages, prestige, and the autonomy of speech communities. Both the age of acquisition and the amount of L2 experience strongly contribute to the similarity of L2 vowels to those of monolinguals. The earlier and more intense the exposure to both languages, the more likely it is that a bilingual will produce distinct acoustic realizations of L1 and L2 vowels and establish separate vowel categories. For example, Guion (2003) found that Quichua-Spanish bilinguals distinguished L1 and L2 vowel categories if they had acquired Spanish at a younger age. All simultaneous bilinguals, most early bilinguals, and more than half of the mid bilinguals distinguished front and back vowels of both languages, which indicates the formation of language-specific vowel categories. Late bilinguals, on the other hand, tended to produce Quichua-like Spanish vowels. In addition, the age of acquisition was found to affect L1 categories. Whereas simultaneous bilinguals realized both L1 and L2 categories in a native-like manner, early and mid bilinguals who had established separate vowel categories for L1 and L2 were found to realize L1 vowels higher in the vowel space than monolinguals. The age of acquisition was likewise found to have an effect on the size of the vowel space in Quechua-Spanish bilinguals. O'Rourke (2010) found a larger overall vowel space in late Quechua-Spanish bilinguals than in early bilinguals (see also Bohn and Flege, 1992; Munro et al., 1996; Flege et al., 1997; Baker and Trofimovich, 2005; MacLeod et al., 2009; Haimes-Kusomoto, 2010).
The formation of separate vowel categories for L1 and L2 does not imply that all vowels are realized in a monolingual-like manner. The phonetic system of the early English-French bilingual in Mack's (1989) study approximates, but does not match, that of the English monolinguals. Similarly, Mayr and Montanari (2015) found cross-language differentiation of English, Italian, and Spanish monophthongs for early trilingual children but also differences from the realizations of adult speakers who provided the main linguistic input in early childhood. Flege et al. (2003) report that early Italian-English bilinguals tended to produce English (L2) vowels in a more monolingual-like manner than did late bilinguals. In addition, the authors found that the amount of L2 experience affected vowel realization. Speakers who used their L1 less after immigrating to Canada tended to produce L2 vowels that were more monolingual-like than those who used their L1 more frequently. On the other hand, early bilinguals' vowel realizations were often not identical to those of monolinguals, which may be attributed to the mechanism of phonetic category dissimilation. In order to keep categories of acoustically similar vowels in the two languages maximally distinct, bilinguals may exaggerate acoustic features such as the dynamic spectral properties of English /eɪ/, which was kept distinct from Italian /e/ by exaggerating its diphthongal quality.
In many cases, the time of first acquisition of the L2 and the amount of exposure to the L2 appear to be closely correlated so that it is difficult to distinguish between them. Mora et al. (2015), however, illustrate the independence of these two factors in Catalan-Spanish bilinguals living in Barcelona and differing in the frequency of Spanish use. The more frequently Catalan was used, the less Spanish-like were Catalan /ɛ/ and /ɔ/, and the greater was the Euclidean distance between Catalan /e/ and /ɛ/ in the F1-F2 space. Earlier, Mora and Nadeu (2012) had reported a similar influence of frequency of use on both the production and perception of Catalan /e/ and /ɛ/.
In a number of studies, the age of acquisition was found to interact with internal factors such as the similarity of L1 and L2 sounds. Bohn and Flege (1992) found that in native German speakers the effect of L2 experience was smaller in the realization of similar English vowels than in the realization of new vowels. Baker and Trofimovich (2005) found that the degree and direction of L1-L2 influences in early and late Korean-English bilinguals appeared to depend on the degree of acoustic similarity between L1 and L2 vowels but that cross-language similarity was more likely to influence the late, than the early, bilinguals. A comparison of the Korean-English bilinguals with Korean monolinguals revealed that late bilinguals produced acoustic differences only for those L1-L2 vowels that were highly dissimilar, whereas early bilinguals produced acoustic differences between all L1-L2 vowels except for those that completely overlap in the acoustic space.
There are other external factors that affect the interaction between L1 and L2, such as the social status of the two languages and their communication range. In many studies, L2 is the language with the higher prestige and a wider communication range, which in turn is likely to affect the amount of L2 exposure and frequency of use of the L2. Both aspects are usually associated with language dominance. The dominance of L2 makes it likely that the L2 influences the L1 rather than vice versa, as is the case with speakers of Frenchville who are bilingual with Frenchville French and English (Bullock and Gerfen, 2004).
A related issue that has received little attention in previous studies is the size and autonomy of the speech communities of the L1 and L2. When both L1 and L2 are used by large and autonomous speech communities, as is the case with, e.g., English and Italian, there are distinct groups of monolingual speakers whose vowel productions can serve as independent models for vowel acquisition in the L1 and L2. When L1 is a minority language it is often the case that there is no monolingual L1 speech community. An example is Welsh, which has been in contact with English for centuries; strong cross-linguistic convergence has been found between the vowels of the two languages of Welsh-English bilinguals.
When many or all speakers of the L1 are early or late bilinguals with a second language it can be difficult to determine the influence of L2 vowels on L1 vowels as there are no instances of monolingual L1 categories. Even more interesting is the case of balanced bilingual communities where all speakers are bilingual with languages, or dialects, that are used in different domains or social groups within the community but not shared by a larger autonomous speech community of monolingual speakers. The question arises whether language-specific categories persist in such speech communities or whether convergence is inevitable.

B. The linguistic landscape of the Saterland
The Saterland is located in the northwestern corner of the federal state of Lower Saxony in North Germany (see Fig. 1). It was settled by Frisians from East Frisia around 1100 AD (Fort, 2001). Until the 19th century it was an enclave, largely isolated from the surrounding Low and High German-speaking region. Present-day Saterland offers the unique opportunity to study speakers who are trilingual with three West Germanic languages that differ both in the complexity of their sound systems and in language-external factors such as their social status and the autonomy of their speech communities.
Saterland Frisian (SF) (sfrs. Seeltersk) is the last living variety of the East Frisian language, which formerly was spoken along the North Sea coast between the rivers Lauwers in the northeast of the Netherlands and Weser in the northwest of Germany including the Land Wursten south of Cuxhaven. There are three local dialects of SF spoken in and around the communities Ramsloh (Roomelse), Strücklingen (Strukelje), and Scharrel (Skäddel). These dialects are mutually intelligible, and there is no standard variety. SF is an endangered language, with an estimated number of 1500-2500 speakers (Stellmacher, 1998; Fort, 2015, p. XIII).
Saterland Low German (LG) is the local variety of Low German. Low German is a native language in northern Germany and closely related to Dutch Low Saxon in the northeast of the Netherlands (Lenz et al., 2009). It is also found in several linguistic exclaves in and outside Europe. Almost all speakers of Low German in northern Germany are bilingual with Northern Standard German. There is no standard variety and no monolingual speech community of Low German inside or outside the Saterland. In the 15th century, East Frisian Low German began to displace all East Frisian dialects in northwestern Germany except Saterland Frisian (Niebaum, 2001). The Saterland is surrounded by Low German dialects. Saterland Low German is a mixture of Low German dialects spoken in the Oldenburger Münsterland and in Emsland (Fort, 1997).
Saterland High German (HG) is a variety of Northern Standard German, which is the model language in the German educational system and the dominant language of the national media. The frequency of use of HG in the Saterland rose tremendously with the arrival of displaced persons since World War II (Fort, 2004).
When the Saterland was settled by East Frisians around 1100 AD, this region was already inhabited by Low German-speaking settlers. Therefore, there has been long-term contact between Frisian and Low German in the Saterland. High German had been spreading in northern Germany since the 16th century and entered the Saterland much later. It is therefore not surprising that SF is linguistically much closer to LG than to HG. SF has a significant number of loans from LG and the complexity of the consonant and vowel inventories of SF is partially based on such loans. One example is the word-initial opposition between the voiced and voiceless alveolar fricative, which is due to the fact that native SF words with word-initial /s/ coexist with LG loans with word-initial /z/ (Fort, 1997, 2015). The different linguistic character of the three languages in the Saterland is most evident in their vowel systems. Figure 2 shows the inventories of monophthongs of SF, LG, and HG in stressed syllables. HG has 15 phonemic monophthongs in stressed syllables and three phonemic diphthongs, whereas LG has 17 monophthongs and 7 diphthongs, and SF has 20 monophthongs and 7 diphthongs. Both SF and LG differ from HG by having a complete set of long open-mid vowels /ɛː œː ɔː/, which in SF are largely restricted to loans from LG. In Northern High German, /ɛː/ is largely restricted to spelling pronunciation and careful speech. In more informal speech, /ɛː/ tends to be merged with /eː/ (cf. Bohn and Flege, 1992; Jørgensen, 1969; Kohler, 1995, p. 172; Pätzold and Simpson, 1997; Steinlen, 2005, p. 79). Additionally, SF is reported to have a complete set of short tense close vowels /i y u/, which are in opposition with long tense close /iː yː uː/ and short lax close /ɪ ʏ ʊ/ (Sjölin, 1969, p. 67; Kramer, 1982; Fort, 2015, p. XV).
In recent studies, the short tense close vowels were found to be merged with the long tense close vowels except in some older conservative speakers and in careful speech (Tröster-Mutz, 2002; Heeringa et al., 2014; Peters, 2017; Schoormann et al., 2017a).
From a sociolinguistic perspective, SF is also closer to LG than to HG. This difference is noticeable in both individual and societal multilingualism. Most, if not all, speakers of SF born before the 1980s are early trilingual with LG and HG as L2 and L3, respectively (Fort, 2004). SF is usually acquired at home and used as a primary language in communication with other speakers of SF. Accordingly, the use of SF is restricted to communication within the local population.
LG is acquired at home alongside SF or through contact with relatives, neighbors, and friends. Trilingual SF speakers use LG in communication with locals who do not speak SF, whereas LG speakers who do not speak SF use LG in communication with both LG speakers and the trilingual speakers of SF. To some extent, LG is also used in communication with Low German-speaking people from neighboring settlements. Note, however, that until recently there was little social contact with East Frisian Low German because the great majority of the Frisian population of the Saterland is Roman Catholic. Most East Frisians belong to the Lutheran or Dutch Reformed churches.
HG is used by trilingual speakers of SF in communication with non-local people and local people who neither speak SF nor LG. It is acquired at the latest when entering primary school. However, usually SF speakers come into contact with HG earlier through HG monolinguals in the neighborhood and through the media, in which HG is by far the most commonly used language. HG is also the dominant language in the educational institutions of the Saterland. Only recently have there been efforts to establish SF as a second language in day care centers and schools. Most younger speakers of SF born since the 1980s are no longer trilingual but bilingual with HG as their second language.
The difference in the communication range of the three languages conforms to the size and autonomy of the respective speech communities. 3 Whereas the speech communities of SF and LG are largely restricted to the Saterland and the closer environment, Saterland speakers of High German are part of the larger group of the (mostly monolingual) speakers of the national standard language in northern Germany and do not form an autonomous speech community in this respect. It is therefore to be expected that HG is influenced more strongly by speakers from outside the Saterland than by SF or LG.
The results of a survey by Stellmacher (1998) suggest that the difference in the size and autonomy of the speech communities corresponds to a difference in prestige. Among all Saterlanders, HG was most frequently mentioned as the favorite language (45.8%), closely followed by LG (40%), whereas only 13.8% mentioned SF. Even within the Saterland Frisian-speaking community, SF does not seem to be highly valued. Less than 25% of community members reported using SF in conversations with their own children and grandchildren. One source of the higher prestige of HG may be the fact that, as a national language, HG is subject to official regulation and develops at a national level. Furthermore, HG is commonly used at all levels of education and as an official language in more formal situations.

C. Aim of this study
The present study focuses on the variation in the acoustic realization of monophthongs in trilingual speakers of SF, LG, and HG and the role of internal and external factors in this variation. One internal factor to be considered is vowel inventory size. After the merger of short and long tense vowels, SF and LG share the same set of monophthongs but differ from HG in having a complete series of long open-mid vowels /ɛː œː ɔː/, whereas HG has only /ɛː/, which is possibly restricted to reading pronunciation. External factors to be considered are the amount of L2/L3 experience, the social status of the languages, and the autonomy of their speech communities. More specifically, we examine the following questions: (i) Do SF, LG, and HG differ in the phonetic realization of shared monophthongs? (ii) If there are differences in the phonetic realization of shared monophthongs, can these differences be explained better by differences in the vowel inventories, and in particular by the presence or absence of a complete series of long open-mid vowels, or with recourse to the amount of L2/L3 experience, the social status of the three languages, or the autonomy of the respective speech communities?
We will include speakers of different age groups to account for the possible effects of an ongoing change in one or more of the three languages.
Both the language-internal and external differences between SF, LG, and HG suggest that HG vowels will differ more from SF and LG than SF from LG. However, the internal and external factors may lead to different forms of crosslinguistic variation. A possible effect of the presence of the additional series of long open-mid vowels in SF and LG is a different location of the vowels in the lower half of the vowel space when compared to HG. In this case, no specific differences in vowel duration would be expected. Differences in the autonomy of the speech communities, on the other hand, may account for differences in vowel duration and spectral properties, which can be better explained by different articulatory settings of speakers from outside the Saterland than by differences in the vowel inventories.
To examine the possible effects of language-internal and language-external differences, vowel duration and static and dynamic formant patterns will be analyzed. It has been found that vowel duration can be used as an acoustic cue to enhance the distinctiveness of high vowels, especially in varieties with rich vowel inventories such as the North Frisian dialect of Fering. Bohn (2004) notes that /iː/ and /uː/ are close in the F1-F2 space to /eː/ and /oː/ but show a clear difference in acoustic duration that exceeds the duration differences between other degrees of vowel height. The same observation has been reported by Schoormann et al. (2017a) for Saterland Frisian.
Cross-linguistic differences in formant frequencies may reflect a reorganization of a crowded vowel space or the different articulatory settings of monolingual speakers of the languages concerned (Bradlow, 1995). For the three languages of the Saterland, it is conceivable that /øː/ and /oː/ are raised in SF and LG so as to increase the spectral distance to /œː/ and /ɔː/, which are present in SF and LG but missing in HG. To detect spectral differences that affect only parts of these vowels, we will compare formant frequencies at 20%, 50%, and 80% of the vowel duration. Formant dynamics, or vowel inherent spectral change (VISC, Nearey and Assmann, 1986), may play a role in enhancing or maintaining vowel contrasts in a crowded vowel space (cf. Adank et al., 2004; Watson and Harrington, 1999; Clopper et al., 2005; Fox and Jacewicz, 2009). To examine cross-linguistic differences in formant dynamics, we will calculate overall trajectory lengths and the spectral rate of change (Fox and Jacewicz, 2009).
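The two dynamic measures named above can be illustrated with a minimal Python sketch. It assumes that trajectory length is the summed Euclidean distance between consecutive (F1, F2) measurement points and that the rate of change divides this length by the portion of the vowel spanned by the measurements (here 60%, i.e., from 20% to 80% of the duration). The function names, the default span, and the example values are illustrative assumptions, not details reported in this section.

```python
import math

def trajectory_length(points):
    """Sum of Euclidean distances between consecutive (F1, F2) points."""
    return sum(math.dist(a, b) for a, b in zip(points, points[1:]))

def spectral_rate_of_change(points, duration_s, span=0.60):
    """Trajectory length per second over the measured part of the vowel.

    `span` is the fraction of the vowel covered by the measurement points;
    0.60 is an assumption matching points at 20% and 80% of the duration.
    """
    return trajectory_length(points) / (span * duration_s)

# Hypothetical (F1, F2) values in Bark at 20%, 50%, and 80% of a 150 ms vowel.
example = [(4.0, 13.0), (4.3, 13.4), (4.3, 13.9)]
tl = trajectory_length(example)               # two section lengths: 0.5 + 0.5
roc = spectral_rate_of_change(example, 0.150)
```

Normalizing by the measured time span rather than by the raw number of points keeps the rate comparable across vowels of different duration.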

II. METHOD
A. Participants
We recorded 12 older and 7 younger male speakers in Scharrel 4 who were trilingual with SF, LG, and HG. The older speakers were aged between 50 and 75 years (mean age 57.7) and the younger speakers between 21 and 34 years (mean age 26.6). One older and two younger speakers were discarded due to missing data. Note that it was not possible to recruit a larger number of younger trilingual speakers in Scharrel as it turned out that most speakers of SF under 30 years were bilingual rather than trilingual, with HG as their second language. All findings on age effects, or their absence, which are reported in the remainder of this paper, must therefore be treated with caution.
The trilingual speakers had lived in Scharrel all, or the majority of, their lives. All subjects except one younger speaker who acquired SF as a second language after HG, considered SF to be their native and primary language and acquired it at home.
LG was acquired at home or in the neighborhood, except for two speakers, who had acquired LG at work. According to self-reports of our participants, HG was acquired at home, in the neighborhood, or at primary school, but it is very likely that all of the participants had already come into contact with HG before entering primary school, especially through radio and television.
Even though our subjects differed somewhat in their reports on the order and the age of acquisition of LG and HG, all Scharrel subjects except two may be categorized as early sequential trilinguals in the sense of Sundara and Polka (2008) as these speakers were exposed extensively to all three languages from early childhood on. Most subjects reported that they used each of the three languages with more than 30 conversational partners. SF is used in conversations with close relatives, neighbors, and friends, whereas LG and HG are mainly used in conversations with neighbors, friends, and at work.
Although SF was acquired as a first language and is considered by nearly all of our participants to be the family language, it is not clear whether SF can be regarded as a dominant language. If we look at dominance on the basis of usage (Flege et al., 2002), findings suggest that HG is gaining dominance within the community of SF speakers. Only two of our subjects named SF as their most commonly used language. HG, on the other hand, was named by nine participants. These results are supported by the survey of Stellmacher (1998), according to which only one-third of the estimated 2250 SF speakers in the Saterland regard SF as their most commonly used language. In the whole municipality of the Saterland, SF is the most commonly used language of only 9% of the inhabitants, compared to 40% preferring LG and more than 50% preferring HG. However, there are still intact SF-speaking speech communities consisting of several families residing within the three villages of the Saterland and on farms outside the villages.

B. Materials and procedure
We elicited the complete set of monophthongs of each of the three languages. To control for phonetic context effects, we elicited the vowels in a /hVt/ context. As /aː/ is missing in the Scharrel dialect before syllable-final alveolar plosives, there remain 14 shared vowel categories, which are /iː yː uː eː øː oː ɛː ɪ ʏ ʊ ɛ œ ɔ a/. The vowels of each language were obtained in three separate sessions, which were separated by at least two months. To ensure the intended language mode (Grosjean, 2013), a local assistant (who used the recorded language as his or her primary language) guided the participants through the experiment. Hence, the target language of each session corresponded to the language used in everyday conversations between the assistant and the subjects of our study.
The /hVt/ words were elicited via rhymes (cf. Bohn, 2004; Mayr and Davies, 2011). In this context, the informants were first instructed to read aloud a monosyllabic word of the target language displayed on a computer screen and then to produce the rhyming /hVt/ target word. For example, in order to obtain [eː] in Saterland Frisian, first the Saterland Frisian word leet "late" was shown, together with its High German translation. The subject read [leːt]. Subsequently, the frame H_t was added below leet as an aid to form the target word, and the translation disappeared. Since leet is pronounced as [leːt], the subject built the rhyming target word and pronounced H_t as [heːt]. If there was no monosyllabic rhyming real word available for triggering the target word, an intermediate form was shown between the trigger and the target word. For example, in order to obtain Saterland Frisian [œ] from the trigger löskje "extinguish," the intermediate non-real word lött was added in a second step. The intermediate form then led to the production of the rhyming target word [hœt]. The elicitation via rhymes was preferred over a reading task because both SF and LG orthography were unknown to our speakers and the written form may have had a direct influence on the production data.
For each subject, the sequences of trigger and target words were presented in two consecutive blocks on a computer screen. The first block elicited the two-step sequences, the second the three-step sequences. Sequences were presented in a controlled randomized order, which differed for each subject, and which ensured that a vowel was never directly succeeded by the same vowel in the following sequence.
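A constrained randomization of this kind can be sketched as follows. The Python example below uses simple rejection sampling: it reshuffles until no two neighboring sequences target the same vowel. The function name, the retry limit, and the vowel labels are hypothetical, and the procedure actually used to generate the presentation orders is not described here.

```python
import random

def constrained_shuffle(items, key=lambda x: x, rng=None, max_tries=10_000):
    """Return a random order of `items` in which no two adjacent items
    share the same key (e.g., the same target vowel)."""
    rng = rng or random.Random()
    order = list(items)
    for _ in range(max_tries):
        rng.shuffle(order)
        if all(key(a) != key(b) for a, b in zip(order, order[1:])):
            return order
    raise RuntimeError("no valid order found within max_tries")

# Hypothetical block: three repetitions of four target vowels,
# with a subject-specific seed so that each subject gets a different order.
block = ["i", "e", "a", "u"] * 3
order = constrained_shuffle(block, rng=random.Random(7))
```

Rejection sampling is adequate for small blocks like this one; for long lists with few categories, a constructive algorithm would be more efficient.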
Three /hVt/ samples were obtained per speaker and per vowel in two test series, one in which monophthongs were interspersed with diphthongs and one in which the monosyllabic /hVt/ words were interspersed with disyllabic words. 5 Both series of words obtained in two steps and three steps were preceded by six practice sequences, so that informants became familiarized with the test. Preceding the experiment, a vocabulary list consisting of the trigger words mixed with other words of the basic vocabulary was used for a warm up and to familiarize the speakers with the SF and LG triggers and unknown spelling.
Note that despite a number of target words matching real words in the respective target languages, the majority were non-words. The possible influence of real words was reduced by the fact that the target words were not presented completely, but with a gap in place of the vowel (e.g., H_tt for /hɛt/). Moreover, some of the target words matched inflected forms of real words which are not expected to occur in isolation, as in the elicitation task (e.g., LG hat, "have-PP"). Disregarding these cases, there remain four matching words for SF (häd, "hard," heet, "hot," Hoat, "hate," Haat, "heart"), five for LG (hüüt, "today," Huud, "skin," heet, "hot," Hoot, "hat," Haat, "hate") and one for HG (Hut, "hat"). No patterns of acoustic variation were observed that could be attributed to the matching of these target words with real words.
All informants were instructed not to overarticulate but to pronounce the target word in a more habitual style. The productions were monitored for the target pronunciation and intonation, ensuring that they were elicited with a falling contour. Where mistakes occurred, individual sequences were repeated at the end of each recording session. Approximately 4% of the data are missing, most of which originate from mispronunciations or unfamiliarity with the trigger words.
All recordings were made in individual sessions in a quiet room in Scharrel with a Tascam HD P2 digital recorder using a sampling rate of 48 kHz (24 bit) and a head-mounted microphone (DPA 4065 FR).

C. Acoustic measurements
Measurements of vowel duration and of F1 and F2 were done with PRAAT (Boersma and Weenink, 2014). The onset and offset of the vocalic segments were labeled manually for each /hVt/ word. Vowel onset was measured at the negative-to-positive zero-crossing before the first positive peak in the periodic waveform. Vowel offset was set at the last negative-to-positive zero-crossing before the reduction of the intensity and/or end of periodicity in the waveform before the stop closure.
The frequencies of F1 and F2 were estimated automatically by means of a PRAAT script at equidistant points (20%, 50%, and 80%) in the vowel duration. Window length was set to 0.025 s. Formant settings for the LPC analysis were adapted for each realization individually in the script by decreasing or increasing the LPC order in steps of 1 (default order 10) and the maximum frequency in steps of 500 Hz (default 5000 Hz). Upon visual inspection, the LPC settings that best tracked the first three formants were used. Outliers, which derived from measurement errors, were corrected by hand. Formant frequencies were normalized by converting all values to the Bark-scale, using the formula by Traunmüller (1990). As the F3 values of open vowels did not indicate substantial differences in vocal tract length between the older and younger speakers, no further normalization was applied (cf. Guion, 2003).
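As an illustration of the Bark conversion, the following sketch implements the main formula of Traunmüller (1990), z = 26.81 f / (1960 + f) - 0.53. Whether any additional corrections at the extremes of the frequency range were applied in the present analysis is not stated, so they are omitted here; the function name and the example formant values are hypothetical.

```python
def hz_to_bark(f_hz):
    """Convert a frequency in Hz to Bark (Traunmüller, 1990, main formula).

    Corrections for very low or very high frequencies are deliberately
    omitted; this is an assumption, not a statement about the study.
    """
    return 26.81 * f_hz / (1960.0 + f_hz) - 0.53


# Hypothetical formant values (Hz) for one vowel token.
f1_bark = hz_to_bark(350.0)
f2_bark = hz_to_bark(2100.0)

# Sanity check: 1960 Hz maps to 26.81 / 2 - 0.53 = 12.875 Bark.
reference = hz_to_bark(1960.0)
```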
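To analyze formant dynamics, the amount of vowel inherent spectral change was assessed as the trajectory length (TL) per vowel. TL was calculated as the sum of the two vowel section lengths (VSL), i.e., the Euclidean distances between the measurement points at 20% and 50% of vowel duration (= VSL_{50-20}) and at 50% and 80% (= VSL_{80-50}) in the F1-F2 plane, as introduced in Fox and Jacewicz (2009) (cf. Jin and Liu, 2013):

\[ \mathrm{VSL}_{50\text{-}20} = \sqrt{(F1_{50} - F1_{20})^2 + (F2_{50} - F2_{20})^2}, \qquad \mathrm{VSL}_{80\text{-}50} = \sqrt{(F1_{80} - F1_{50})^2 + (F2_{80} - F2_{50})^2} \tag{1} \]

\[ \mathrm{TL} = \mathrm{VSL}_{50\text{-}20} + \mathrm{VSL}_{80\text{-}50} \tag{2} \]

To account for dynamic changes in unnormalized time, the spectral rate of change (roc) was calculated for the overall trajectory length (TL roc) between the 20% and 80% measurement point, i.e., within the central 60% of the vowel's duration, following Fox and Jacewicz (2009):

\[ \mathrm{TL}_{\mathrm{roc}} = \frac{\mathrm{TL}}{0.60 \times V_{\mathrm{dur}}} \tag{3} \]

where V_dur is the duration of the vowel.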
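The trajectory measures described above can be sketched as follows. The sketch assumes that TL is the sum of the two Euclidean section lengths in the F1-F2 plane and that TL roc divides TL by the central 60% of the vowel duration. Function names and the numeric values are hypothetical and chosen only for illustration.

```python
import math


def vsl(p, q):
    """Vowel section length: Euclidean distance between two (F1, F2) points."""
    return math.hypot(q[0] - p[0], q[1] - p[1])


def trajectory_length(p20, p50, p80):
    """TL: sum of the section lengths 20%-50% and 50%-80%."""
    return vsl(p20, p50) + vsl(p50, p80)


def tl_roc(tl, duration_s):
    """Spectral rate of change over the central 60% of the vowel."""
    return tl / (0.60 * duration_s)


# Hypothetical (F1, F2) values in Bark at 20%, 50%, and 80% of the vowel.
p20, p50, p80 = (4.0, 12.0), (4.3, 12.4), (4.3, 12.0)
tl = trajectory_length(p20, p50, p80)

# The same TL yields a lower rate of change in a longer vowel.
roc_short = tl_roc(tl, 0.150)
roc_long = tl_roc(tl, 0.200)
```

The last two lines show why equal trajectory lengths combined with unequal durations produce unequal rates of change.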

D. Statistical analysis
Per vowel category, we performed linear mixed effects analysis in R (version 3.3.1, R Development Core Team, 2016) using the function LMER from the LME4 package (Bates et al., 2015) with LANGUAGE (SF, LG, HG), GENERATION (G1, G2), and REPETITION 6 (1, 2, 3) as fixed factors, random intercepts for SPEAKER, and by-speaker random slopes for the effect of LANGUAGE and REPETITION. 7 The random slope allows the relationship between LANGUAGE (and/or REPETITION) and the dependent variable to be different for each speaker. For example, for one speaker and one dependent variable, HG and LG may be similar and SF more deviant, whereas for another dependent variable, HG may be deviant and SF and LG similar. As dependent variables, we included duration, F1 and F2 at the three measurement points, amount of VISC, and spectral rate of change. For the comparison of vowel space sizes, we calculated the sizes of the vowel spaces per speaker and per language for the sets of shared monophthongs (using CONVEXHULLAREA from the PHONR package, McCloy, 2016). Subsequently, the vowel space sizes were compared in another linear mixed effects analysis with LANGUAGE (SF, LG, HG) and GENERATION (G1, G2) as fixed factors and random intercepts for SPEAKER. For each analysis, the goodness of fit of the final model was measured with the Akaike information criterion (AIC). The lower the AIC, the better the model. Function GLHT from the R package MULTCOMP (Hothorn et al., 2008) and function LSMEANS from the R package LSMEANS (Lenth, 2016) were used for post hoc pairwise comparisons. All p-values were calculated using the Satterthwaite approximation in the LMERTEST package (Kuznetsova et al., 2016).
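To make the vowel space measure concrete, the sketch below computes the area of the convex hull around a set of (F2, F1) points. It is a self-contained stand-in written for illustration (monotone-chain hull plus shoelace formula) and does not reproduce the PHONR implementation; all function names and data points are hypothetical.

```python
def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        # Positive if o -> a -> b makes a counter-clockwise turn.
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def polygon_area(vertices):
    """Shoelace formula for the area of a simple polygon."""
    n = len(vertices)
    total = 0.0
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def vowel_space_area(f2_f1_points):
    """Area of the convex hull spanned by per-vowel (F2, F1) means."""
    return polygon_area(convex_hull(f2_f1_points))


# Hypothetical per-vowel means (F2, F1) in Bark for one speaker and language.
means = [(14.0, 3.0), (13.0, 4.5), (11.0, 7.0), (8.0, 4.5), (7.0, 3.0), (11.0, 4.5)]
area = vowel_space_area(means)
```

One such area per speaker and language would then enter the mixed-effects comparison as the dependent variable.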

III. RESULTS
Acoustic measurements of mean vowel durations and formant frequencies for the /hVt/ words are shown in Table I (older generation) and Table II (younger generation) for the three languages.
Note that Tables I and II contain the full set of monophthongs of the three languages, except /@/, which is restricted to unstressed syllables. Vowels /i/, /y/, /u/, /oe+/, /O+/, and /a+/ are not shared by all three languages in monosyllables and will therefore not be considered in the remainder of this paper. The analyses by Heeringa et al. (2015) and Schoormann et al. (2017a) suggest that the participants of the present study have merged the short tense /i/, /y/, and /u/ with long tense /i+/, /y+/, and /u+/ in both duration and formant frequencies.
A. Duration
Figures 3 and 4 show mean durations of the monophthongs shared by SF, LG, and HG in the older and younger speakers, respectively. Tense close /i+/, /y+/, and /u+/ are shorter than the other tense vowels including /E+/, and lax near-close /I/, /Y/, /U/ are shorter than the other lax vowels. Table III reveals cross-linguistic differences mainly for the long tense vowels. Tense non-open vowels were found to be longer in HG than in SF and LG, except for HG and LG /o+/. No cross-linguistic differences were found for lax vowels, except for /U/, which is longer in HG and LG than in SF. There are no durational differences between the older and younger speakers.

B. Formant frequencies
1. Frequencies at vowel center
Figure 5 shows mean frequencies of F1 and F2 of shared monophthong categories in the older and younger speakers. 8 Cross-linguistic comparisons in Table IV show that most differences in F1 are found in the long tense vowels. All long tense HG vowels are higher than their LG counterparts at one or more measurement points. Long tense HG /i+/, /u+/, /o+/, and /E+/ are also higher than their SF counterparts, and long tense SF /y+/, /e+/, /ø+/, and /o+/ are higher than their LG counterparts. For the lax vowels, fewer differences are attested but all differences found are in the same direction as in the tense vowels. Note that most differences in tense vowels are found at 50% and 80% of vowel duration and in lax vowels at 80%, which suggests that the rarer occurrence of differences in lax vowels could be related to their shorter duration. In fact, in lax vowels there is less time for the three languages to develop different spectral dynamic patterns. The 80% measurement point of the lax vowels corresponds in absolute numbers to a measurement point between 20% and 50% of the tense vowels.
Cross-generational comparisons of F1 values in Table V show that younger speakers raise close or near-close /y+/, /I/, and /Y/ but lower half-open /E/ and open /a/ compared to the older speakers. Comparisons of F2 values suggest that the front vowels /i+/, /I/, /e+/, /E+/, /E/, and /oe/ of the younger speakers are more fronted compared to the older speakers, but this also applies to back /U/.
Finally, we compared the overall vowel spaces of the three languages. Calculations were carried out with the F1 and F2 values of the shared monophthongs. Table VI reveals a larger vowel space for HG than for SF and LG at all measurement points (20%, 50%, and 80%) and a larger vowel space for the younger generation than for the older generation at 50% of vowel duration. No interaction between language and speaker age was found.
TABLE III. Cross-linguistic comparisons of mean durations of shared vowel categories, where ">" indicates that the first language has a larger mean duration than the second.

2. Dynamic spectral features
Statistical results for the trajectory length (TL) and spectral rate of change (TL roc) are shown in Table VII for each vowel category. No difference was found for TL. Differences in TL roc are restricted to close tense /i+/, /y+/, and /u+/, and to close-mid /o+/, which have a lower TL roc in HG than in SF. No effect of the age of the subjects was found.
As no significant differences were found for the amount of spectral change (TL), the higher TL roc must be due to differences in vowel duration. Indeed, the results mirror the differences in vowel duration observed for the close tense vowels. Whereas in HG the longer duration of the close tense vowels leads to a lower TL roc, in SF a shorter duration of these vowels results in an increased TL roc. The same relationship is found for HG and LG, but reaches significance only for the groupwise comparison of TL roc of close vowels (HG > LG*). In sum, the results suggest that spectral dynamics is not an independent dimension of cross-linguistic variation.

IV. DISCUSSION
The acoustic analysis of the trilingual productions of monophthongs shared by SF, LG, and HG has revealed cross-linguistic differences in vowel duration and spectral properties. All long tense monophthongs, except /E+/, were longer in HG than in SF, and all monophthongs except /E+/ and /o+/ were longer in HG than in LG (see Table III). A similar pattern of cross-linguistic variation was found for spectral properties. Most long tense vowels and some short lax vowels were realized higher in HG than in SF and LG, that is, with a lower F1 (see Table IV). In no case was a vowel realized lower in HG than in SF or LG. Additionally, tense and lax front vowels tended to be fronted in HG when compared to SF and LG, and tense back vowels tended to be realized further back. This finding suggests an expansion of the HG vowel space in the F2 dimension, which is in line with the general finding reported in Table VI that the vowel space in HG is larger than in both SF and LG at all measurement points (20%, 50%, 80%). Spectral differences between SF and LG are largely restricted to the F1 dimension, with a more close realization of most monophthongs in SF than in LG. No clear pattern emerges for the F2 dimension, with SF /y+/ being more retracted and SF /e+/ more fronted than their LG counterparts. A number of tense and lax vowels were found to be raised and/or fronted in the younger speakers when compared to the older speakers but no interaction was found between speaker age and language (see Table V). A related finding was that younger speakers had a larger vowel space at 50% of vowel duration than older speakers (see Table VI).
No differences in spectral dynamics were found, except a smaller spectral rate of change (TL roc) in HG than in SF for /o+/ and the long tense close vowels /i+ y+ u+/, which is probably due to their longer duration in HG (see Table VII).
TABLE IV. Cross-linguistic comparisons of mid formant frequencies of shared vowel categories, where ">" indicates that the first language has a greater mean formant frequency than the second. Raised numbers indicate the measurement point (20%, 50%, 80%). All differences are significant at α = 0.05.
The cross-linguistic variation in the phonetic realization of monophthongs of Saterland trilinguals supports the notion that bilinguals strive to maintain phonetic contrasts between all of the vowels in their combined L1-L2 phonological space as adopted by the Speech-Learning Model (Flege, 1995) and the extended Perceptual Assimilation Model (Best and Tyler, 2007). The question arises whether the cross-linguistic differences we have identified are better explained by language-internal or language-external factors. It seems that neither the differences in duration nor the spectral differences can be attributed to language-internal factors. The longer duration of the long vowels in HG increases the ratio between tense and lax vowels, except for /u+/ and /U/. Hence, vowel contrasts are enhanced in the language with the smaller vowel inventory rather than in the two languages with the larger inventories. Also, longer durations of HG vowels are not restricted to regions of the vowel space in which cross-linguistic differences between the vowel inventories are found. The spectral differences do not seem to be attributable to differences in vowel inventory size either. Compared to the vowel spaces of SF and LG, the HG vowel space is expanded rather than contracted, as would be predicted by the Theory of Adaptive Dispersion (Liljencrants and Lindblom, 1972; Lindblom, 1986; see also Becker-Kristal, 2010).
Moreover, differences in the F1 and F2 dimensions are not restricted to the area of the open-mid vowels, in which the vowel spaces of SF and LG are more crowded than that of HG. For the most part, previous studies examining an enlargement of the vowel space due to adaptive dispersion have compared the vowel spaces of monolingual speakers. However, the multilingual vowel space should be governed by the same organizational principles, i.e., sufficient perceptual contrast, as the monolingual vowel space (cf. Guion, 2003). Furthermore, effects of vowel space dispersion due to inventory size have mainly been observed in studies comparing languages with a greater difference in the size of the vowel inventories. Less often have studies found differences in vowel space size when the languages under consideration did not differ considerably in inventory size, which is the case for present-day SF (with mergers of long and short tense vowels), LG, and HG (see also Recasens and Espinosa, 2009).
Turning to the external factors, there are at least three possible sources of the deviant realization of HG vowels. First, according to the self-reports of our subjects, HG is usually acquired somewhat later than SF and LG and, with one exception (cf. Sec. II A), our subjects were overall less exposed to HG than to SF and LG before entering primary school. However, this difference suggests that HG should be closer to SF and LG rather than more distant, as later acquisition and less exposure diminish the potential influence of speakers with HG as L1 and increase interference between HG and both SF and LG (cf. Flege et al., 2003; Guion, 2003; Baker and Trofimovich, 2005; Haimes-Kusomoto, 2010).
Second, the distance between HG vowels and SF and LG vowels could be the result of dissimilation processes, which may be motivated by a perceived difference in the social status of the contact languages as well as their specific domains of use. As a national language, High German is not confined to the Saterland or the region of Northwest Germany but has a larger communication range. It is subject to official regulation and develops at a national level. Furthermore, HG is used in the education system and as an official language in more formal situations and therefore may be pronounced more carefully, while the other languages tend to be used informally. Whereas a more careful pronunciation is compatible with the lengthening of the long vowels and the expansion of the vowel space in the F2 dimension in HG, the overall higher realization of vowels in the F1 dimension suggests the influence of a different base-of-articulation rather than of speech mode. In fact, HG is more likely to diverge in vowel realization due to a different base-of-articulation than LG or SF. Due to the long-term contact of SF with LG in the former linguistic enclave of Saterland, which includes a substantial number of loan words in SF from LG, it is very likely that the phonetic systems of these languages began to converge centuries ago. Similarly, a high degree of convergence has been found in the vowel productions of bilingual speakers of Welsh and English, which can likewise be attributed to long-term contact between the two languages.
The third possible reason is closely linked to the second. As mentioned in the Introduction, the speech communities of SF and LG largely overlap even though there may be some contact with speakers of LG outside the Saterland. Northern Standard German, on the other hand, is spoken by several million people in northern Germany, most of whom are monolinguals. The deviant vowel realization in HG may therefore also be a result of the fact that Saterland Frisians acquire HG in contact with speakers who belong to a speech community that hardly overlaps with the local speech communities of speakers of SF and LG. This larger speech community comprises both HG monolinguals in the Saterland and speakers of Northern Standard German outside the Saterland who are Standard German monolinguals or bilingual with Standard German and a dialect of Low German. The deviant realization of HG vowels by the trilingual speakers may therefore be interpreted as the result of a complete or incomplete assimilation to the vowel categories of Northern Standard German (cf. Grosjean, 1989). This explanation is borne out by a follow-up study by Schoormann et al. (2017b) examining the hypothesized orientation towards the wider speech community of Northern Standard German. The study compared the HG vowel productions of the trilinguals of the present study with vowel productions of monolinguals from Hanover, the capital of the state of Lower Saxony, whose variety of High German can be considered representative of the standard variety of the larger speech community of northern Germany. Schoormann et al. (2017b) provided evidence for an overall approximation of Saterland HG vowels to the vowels of Standard German as spoken by monolinguals outside the Saterland in terms of both durational and spectral features. In addition, the findings illustrate subphonemic influences of the local languages in the organization of the vowel system of the trilingual speakers.

TABLE VI. Cross-linguistic and cross-generation comparisons of vowel spaces filled by shared vowel categories between older (G1) and younger (G2) generation, where ">" indicates that the vowel space of the first language/generation is larger than the second.
20%            50%            80%
HG > SF (a)    HG > SF (a)    HG > SF (b)
HG > LG (c)    HG > LG (a)    HG > LG (b)
               G2 > G1 (c)
(a) p < 0.01. (b) p < 0.001. (c) p < 0.05.
The finding of differences in spectral features between SF and LG, mainly in the F1 dimension, may also be better explained by external than internal factors. A number of SF monophthongs were found to be realized higher than their LG counterparts. Considering that our subjects have merged SF short and long tense vowels and therefore do not have different inventories of monophthongs in SF and LG, vowel raising cannot be explained with reference to differences in the vowel systems. On the other hand, as the Saterland trilinguals share LG with a larger group of speakers of LG as L1 in the Saterland, it is not surprising that shared monophthong categories are realized differently in SF and LG.
The present study does more than provide data on a little-known minority language, Saterland Frisian. It also illustrates the possibilities offered by the study of trilingual speakers, especially when the L1 is in contact with an L2 and L3 that differ in social status or the autonomy of the speech community. Previous studies of vowel production in trilinguals have dealt with individual speakers rather than multilingual speech communities. Mayr and Montanari (2015), for example, investigated vowel productions of two trilingual sisters, aged 6;8 and 8;1. The sisters grew up with English, Italian, and Spanish. The language input was provided primarily by their father, mother, and nanny, respectively. Such scenarios of individual trilingualism provide the opportunity to study complex interactions between three rather than two languages and to relate asymmetries in the formation of distinct categories for the three languages to differences in the individual settings in which the children receive language-specific input. The Saterland, on the other hand, provides the opportunity to study vowel realization in a close-knit multilingual, or mixed speech community, where individuals with three different L1s communicate with each other in one of the shared languages. In contrast to most studies on individual early or simultaneous trilinguals and multilingual individuals who acquired an L2 and L3 as foreign languages at a later age, the particular linguistic situation in the Saterland is shaped by the fact that all three languages are indigenous to the region.
The results highlight the importance of considering language-external factors in the study of vowel productions by multilingual speakers beyond factors such as age of acquisition and language exposure. Particularly interesting is the question of whether the multilingual child receives linguistic input from speakers belonging to a local multilingual speech community, or to an autonomous monolingual speech community, as is the case in most available studies on vowel production in multilingual settings (e.g., Disner, 1983; Mack, 1989; Kehoe, 2002; Flege et al., 2003; Baker and Trofimovich, 2005; Grijalva et al., 2013; Yang et al., 2015). However, there is now a growing number of studies on bilingual subjects whose L1 is not shared with a larger monolingual speech community (e.g., Guion, 2003; Bullock and Gerfen, 2004; Haimes-Kusomoto, 2010; O'Rourke, 2010; Simonet, 2011; Mora et al., 2015). In this context, the study of speakers who are trilingual with languages differing in the autonomy of their speech communities, as is the case with the trilingual Saterland Frisians, is particularly promising.

ACKNOWLEDGMENTS
This research was supported by Research Grant PE 793/ 2-1 from the German Research Foundation (DFG). We are grateful to Darja Appelganz, Michaela Ballin, Romina Bergmann, Dorothee Lenartz, and Nicole Lommel for their assistance. We thank Simone Graetzer for proofreading and two anonymous reviewers and the associate editor for valuable comments.
1 That Saterland Frisian belongs to the Frisian language group is evident from a number of phonological features that can be traced back to Old East Frisian, such as rising of Germanic a to e in closed syllables (Old Fris. ekker and SF Äkker vs LG and HG Acker "farmland") (for an overview see Fort, 2004, p. 78). Low German and High German go back to different historical languages, Old Saxon and Old High German, respectively. The former differs from the latter due to the absence of the High German consonant shift and a number of ingvaeonic features that Old Saxon shares with Old Frisian and Old English (Krogh, 1996). Some scholars have questioned the status of current Low German as a separate language as it has no standard variety of its own (Goossens, 1983; Stellmacher, 2000). However, it is undisputed that current varieties of Low German and High German varieties can be traced back to different historical languages, and in this sense we speak of Saterland Low German and Saterland High German as varieties of different languages.
2 We assume a single national High German standard variety in Germany, as it is present in the national media. By "Saterland High German" we mean the High German variety of Saterland speakers who intend to speak Standard German. Whether Saterland High German matches the northern standard variety, especially in pronunciation, requires further research and is addressed by Schoormann et al. (2017a) (cf. Sec. IV).
3 Note that we use the term "speech community" to refer to a group of speakers who share a set of linguistic practices and linguistic knowledge but do not necessarily use a homogeneous language (see Patrick, 2002 for further discussion).
4 For a detailed account of the dialectal variation of SF, see Schoormann et al. (2017a).
5 Note that the following analysis and discussion only considers the shared monophthongs elicited in the monosyllabic /hVt/-context.
6 REPETITION was included as a third fixed effect to control for interactions between repetitions of the target vowel and the dependent variable (cf. Winter, 2011; Baayen, 2008, p. 270).