Finnish
Proto-Uralic to Finnish sound changes
Thus far mostly based on:
- Lauri Hakulinen: Suomen kielen rakenne ja kehitys • Otava 1979
- Kaisa Häkkinen: Nykysuomen etymologinen sanakirja • WSOY 2004
- Petri Kallio: Kantasuomen konsonanttihistoriaa · Mémoires de la Société Finno-Ougrienne 253 · 2007
- Juha Janhunen: The primary laryngeal in Uralic and beyond · (same)
Currently in the process of reformatting and reordering to include the information from the last two documents.
Technotes
- Here, /@/ is NOT an ASCIIfication of /ə/, but any vowel that assimilates to the preceding vowel. This comes in useful with cases of compensatory lengthening and echo vowels.
- Similarly, /A O U/ are harmonic vowels which will assimilate to either /a o u/ or /æ ø y/ depending on the harmony. /a/ is to be understood as [ɑ].
- /ˣ/ is the assimilatory final, pronounced as lengthening of the next word's initial consonant, or in case of null initial, [ʔː] or hiatus. Very rarely, it occurs within words, too (usually sandwiched between two instances of the same vowel.)
- /C/ represents any consonant; /V/ represents any vowel; and /X/ represents any 2nd mora in a syllable (be it consonantal, difthongal or chronemical).
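The shorthand above can be made concrete with a small Python sketch. It is only an illustration under my own assumptions: harmony is decided by whether the form contains any of /a o u/, forms with only neutral vowels are treated as front-harmonic, and /@/ copies the nearest preceding vowel. The function name `resolve` and the test strings are hypothetical, not taken from the sources.

```python
# Sketch only: expands the archiphonemes /A O U/ and the echo vowel /@/.
# Assumption: a form is back-harmonic if it contains any of a, o, u;
# otherwise it is treated as front-harmonic (i and e count as neutral).
BACK = {"A": "a", "O": "o", "U": "u"}
FRONT = {"A": "æ", "O": "ø", "U": "y"}
VOWELS = set("aouæøyie")


def resolve(form: str) -> str:
    """Return the form with /A O U/ and /@/ replaced by concrete vowels."""
    use_back = any(ch in "aou" for ch in form)
    table = BACK if use_back else FRONT
    out = []
    for ch in form:
        if ch in table:
            out.append(table[ch])
        elif ch == "@":
            # Copy the nearest vowel already emitted; keep "@" if there is none.
            prev = next((c for c in reversed(out) if c in VOWELS), "@")
            out.append(prev)
        else:
            out.append(ch)
    return "".join(out)


print(resolve("talossA"))  # talossa
print(resolve("kylæssA"))  # kylæssæ
print(resolve("ma@"))      # maa
```

A real implementation would need to handle compounds and loanwords with mixed harmony, which this sketch ignores.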
I've grouped similar changes together under sub-headings, so the order of unrelated changes might not be exactly chronological whenever no reference was available. Also, since the document is headed towards Standard Finnish, I've had to cut a few corners anyway when maneuvering around dialectal changes... in a few cases picking the most represented outcome wasn't all that clear.
Proto-Uralic to Pre-Finnic
[Ca. 4000 BCE to 3000 BCE]
PU roots generally had the form (C)V(C)C{A I}, with initial stress. Pronouns and prepositions could also have the form CV, and there were two lone-V roots: the negativ verb e- and the root "self", o-.
Unclear issues:
- the quality of the vowel *ë - [ɯ] or [ɤ]? (Substitution of Indo-Iranian *a by *ë speaks for the latter, unless these particular words are newer loans.)
- the quality of the vowel *a - [ɑ] or [ɒ]?
- was the 2nd-syllable vocalism phonemically /a i/ with fronted & backed allophones, or overtly vowel-harmonic /æ a i ɯ/?
- the quality of the consonant *x - [h], [x], [ɣ], something else?
The existence of "Proto-Finno-Samic" ("-Volgaic", "-Permic", "-Ugric") as distinct from PU is unclear, hence "Pre-Finnic". Changes shared with Samic are in indigo, those also shared with Mordvinic in green, and those with even wider distribution in orange.
Word-final */ŋ/ → k in the lativ ending, → n elsewhere (!dubious, pre-Uralic?)
Dedifthongization (dubious)
- iw → y / _C (distribution? does this feed vowel length?)
- potentially: ow → uː / _C
Introduction of length from loss of preconsonantal *x.
- x → @ / _C (leaves no evidence in Ob-Ugric)
Coda nasal simplification
- n → ∅ / _(t)sʲ (distribution?)
- m → n / _{t tsʲ #} (Finnic, Mordvinic; medially also Permic, Mansi)
Stressed *ë merges with *a
Loss of /w/ before labial vowels (partially also in Mari)
- w → ∅ / #_{o u y}
The consonant may have persisted before long vowels, but since a glide was later added there epenthetically anyway, there's no way to tell.
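As a rough illustration, the rule w → ∅ / #_{o u y} can be written as a single anchored regular expression. The function name `drop_initial_w` is hypothetical, and the input strings are simplified illustrations in this document's notation rather than citations from the sources.

```python
import re


def drop_initial_w(form: str) -> str:
    """Delete word-initial w when the next segment is o, u or y."""
    # "^" restricts the rule to word-initial position;
    # the lookahead leaves the following vowel untouched.
    return re.sub(r"^w(?=[ouy])", "", form)


print(drop_initial_w("wole"))  # ole
print(drop_initial_w("wete"))  # wete (no labial vowel follows)
print(drop_initial_w("kawo"))  # kawo (w is not word-initial)
```

The sketch treats only short vowels, since the behaviour before long vowels cannot be recovered.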
Loss of */j/ before /i/ likely goes around here too, but Samic seems inconsistent (due to é → jé / #_ ?)
Other stressed vowel changes
- aː æː → oː eː (but part of a general a æ → oː eː / [+STR] shift in Samic)
Unstressed vowels
- a → æ / {æ e ê i y}(X)(C)C_ (if not an original distinction)
- aw æw → o (the presence of -w rarely is shared, so this may also be analogical)
- i → e / _C (≠ j, w) (but part of a general i → ɤ shift in Samic)
- ij iw → i u ?
Pre-Finnic to Proto-Finnic
[Ca. 3000 BCE to 2000 BCE] (likely also incomplete; this is the section of changes not shared by other branches of Uralic)
Vowel changes
- V# → Vː (affects most CV words with the exception of me te se.)
- *ê *ô → e o / _(X)Ci (new hypothetical vowels for PU, possibly semi-rounded [ɪ ʊ])
→ y ɯ → y i / _(X)CA
- æ → e / _j (unstressed)
- a → e / {o u}[+STR](X)C_j
→ o / {a e i}[+STR](X)C_j
→ a / elsewhere
(Other instances of unstressed /aj/ shift too, but analogical leveling has rendered it impossible to tell whether the original result was /ej/ or /oj/.)
Loss of remaining /x/
- ixi → øː
- uxi → oː
- xi → @ / elsewhere
(*xA, *x# apparently did not occur)
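The three /x/ rules only work if the specific environments are tried before the "elsewhere" case. Here is a minimal Python sketch of that ordering. The names `lose_x` and `expand_echo` are hypothetical, `C` stands for an arbitrary consonant as in the Technotes, and the inputs are made-up skeletons rather than reconstructions.

```python
import re

# Ordered rules: the two specific environments come before the elsewhere case.
X_RULES = [("ixi", "øː"), ("uxi", "oː"), ("xi", "@")]


def lose_x(form: str) -> str:
    """Apply the /x/-loss rules in order, leaving /@/ for the elsewhere case."""
    for src, dst in X_RULES:
        form = form.replace(src, dst)
    return form


def expand_echo(form: str) -> str:
    """Spell vowel + /@/ as a long vowel (the echo vowel copies its neighbour)."""
    return re.sub(r"([aeiouyæø])@", r"\1ː", form)


for skeleton in ["Cixi", "Cuxi", "Caxi", "Cexi"]:
    print(skeleton, "→", expand_echo(lose_x(skeleton)))
# Cixi → Cøː
# Cuxi → Coː
# Caxi → Caː
# Cexi → Ceː
```

If the elsewhere rule were applied first, *ixi would wrongly come out as a long /iː/, which is why the order in the list matters.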
Loss of /ŋ/
- UŋA → Oː
- ŋi → @ / {A i u}_
- ŋ → j / _Cʲ
→ remains _k
→ w / _U_ _O_, other _C
Loss of medial semivowels in i-stems
- Uwi → Uː
- ewi → øː
- wI → i (medial unstressed reduced /i/!)
- ji → @ / front V_
→ j / A_#, O_ U_
→ @ / A_{l r}(C)V (due to [je]?)
→ i / C_{# C}
- /yje/ → */øː/ → yö
Initial deaffrication. Newer initial affricates are found in loanwords and onomatopoeia.
- ʧ ʦʲ → ʃ sʲ / #_
Depalatalization, commonly attributed to Germanic superstratum influence.
- ʦʲ(ː) sʲ ðʲ lʲ → ʦ(ː) s ð l
- nʲ → ni / #(C)V_V (i.e. before a short stressed vowel), → n (elsewhere)
Loss of /ð/
- ð → t (may be gradation-related, shared with Mordvinic but not Samic. Put here to avoid requiring postulating intermediate *tʲ for the development of *ðʲ)
---
This givs as the phonology of Proto-Finnic:
Consonant inventory (places of articulation: lab, dnt, alv, palv, vel)
- nasals: /m n/
- plosivs/affricates: /p t ts tʃ k/
- sibilants: /s ʃ/
- lateral: /l/
- rhotic: /r/
- semivowels: /v j/
(I'm marking [ʋ] as /v/ for brevity from now on.)
Syllable structure: (C)V(@, i, U, C)(C)
- /N/ didn't occur morpheme-initially.
- Morpheme-finally, only /t k s m n j/ occurred.
- Word-initial /r/ was rare (nonexistent in PU).
- /#ji #je #vu/ did not occur.
Allowed medial clusters included the following (and possibly more, if consonantal root forms were in existence yet by this stage):
- /pː pt tː tk kt tːs tʃk ktʃ kː/ (/tsk kts/?)
- /mp nt nts ntʃ ŋk/
- /ns nʃ/
- /ps ks kʃ kʃt/ (/kst/?)
- /tn km/ (only intermorphemically)
- /sm st sn sl sk ʃm ʃt ʃn ʃl ʃr ʃk/
- just about all approximant + non-approximant combinations
- /lj rj lv rv jv/(/vj/ is forbidden and metathesizes to /jv/ in loans)
- /ntː ŋkː rtː rkː lkː/?
- almost all allowed CC combinations preceded by Vj, VU or V@
Vowel inventory
- /i iː y yː u uː/
- /e eː øː o oː/
- /æ æː a aː/
- /ew æw aw ow/
- /ej æj aj oj uj/
/aː æː/ were rare, only occurring in about half a dozen roots each. (These new instances are of fuzzy origin, apparently loanwords acquired between PU and PF?)
/i e A o u/ could occur in non-initial root syllables (plus /ej oj/ due to suffixal j).
Early Proto-Finnic to Late Proto-Finnic
[Ca. 2000 BCE to 1000 CE]
Loss of /ʧ/
- ʧ ʧː → t tʃ (In South Estonian, → ts / _k)
Difthong paradigm shift
j w → i U / V_{C #}
(not really phonetical; required for pre-difthongal consonants not to gradate)
Birth of consonantal suffixes
- i → ∅ / VC_, ks_ suffix-finally (with /ts tʃ/ counted as clusters, not phonemes)
Consonant gradation. These all occur on the general condition that the folloing syllable is closed.
- pː tː tsː kː → pˑ tˑ tsˑ kˑ → p t ts k / {sonorant}_V (the half-long stage ensures that gradated consonants can still themself trigger gradation; no gradation is found in Veps or Livonian)
- p t ts s k → b d s z ɡ / {sonorant}_V
- b d ɡ → β ð ɣ / except N_ (may be later - not evident in Votic)
(NB: gradation of modern /ht hk/ is analogy-borne)
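The core of the gradation rules can be sketched in Python as two passes: weakening before a closed syllable, then spirantization except after a nasal. This is a heavily simplified illustration under my own assumptions: only /p t k/ and their geminates are covered (no /ts s/), the half-long stage is skipped, and a syllable counts as closed when its vowel is followed by a consonant that is word-final or pre-consonantal. The name `gradate` and the input strings are hypothetical.

```python
import re

V = "aeiouyæø"
SON = V + "mnŋlr"      # segments that may precede a gradating stop
C = "ptkmnŋlrsjvh"     # consonants that can close the following syllable

# Weak grades; geminates are listed first so they are matched before singles.
WEAK = {"pː": "p", "tː": "t", "kː": "k", "p": "b", "t": "d", "k": "ɡ"}
GRADE = re.compile(
    rf"(?<=[{SON}])(pː|tː|kː|p|t|k)(?=[{V}]+ː?[{C}](?:[{C}ː]|$))"
)
SPIRANT = {"b": "β", "d": "ð", "ɡ": "ɣ"}


def gradate(form: str) -> str:
    """Weaken stops before a closed syllable, then spirantize b d ɡ."""
    # One simultaneous pass, so a shortened geminate is not voiced as well.
    form = GRADE.sub(lambda m: WEAK[m.group(1)], form)
    # b d ɡ → β ð ɣ, except after a nasal.
    return re.sub(r"(?<![mnŋ])[bdɡ]", lambda m: SPIRANT[m.group(0)], form)


print(gradate("katan"))   # kaðan
print(gradate("kukːan"))  # kukan
print(gradate("rantan"))  # randan
print(gradate("kata"))    # kata (open syllable, no gradation)
```

Doing the weakening in a single substitution is a shortcut for the half-long stage: it keeps a former geminate from being voiced in the same pass.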
Gradation-related changes (suffixal gradation needs elaboration)
- ɣ → j/v (a possible change involving a -ɣA suffix, found in kataja, jalava, kajava etc.)
- ð ɣ → ∅ / V_V#
- p → U / _# (probably via [β]?)
Around this time there's also a paradigm shift wrt. /f/ in loanwords: the reflex of initial /f/ changes from /p/ to /v/. This could signify a change of [w] to [ʋ] in the position, but also of [ɸ] to [f] in the loaning languages! Medial /f/ does not seem to ever turn to /p/.
Vowel shifts
- oi → o / [-STR] (but reverted back in many, tho not all, cases where the -i was morphological)
- ai → ei / [+STR] (with many exceptions; also, surprizingly, /æi/ stays put)
- Vː → V / _i
Birth of consonantal root forms
- e → ∅ / stem-finally after a coronal
(This change could be much older and is actually more complex, but I don't kno what's the latest understanding)
Assibilation
- t → s / _i
except before a coronal obstruent (/t s ʃ/) or a derivational suffix
Esh-drift
- ʃ → ʂ → x (postdates old Baltic and Germanic loanwords)
Assimilation of many consonant clusters to geminates, etc. All of these require a morpheme boundary somewhere in the cluster. A basically equivalent criterion is requiring a preceding unstressed syllable. Of these, /rn pt kt kx tx/ (/kʃ tʃ/?) occurred root-medially, and the first three were retained (though rn → rː may have occurred in aarre; cf. aarni - and kt → tː is required for tytär, which appears to be the only loan with the cluster around this timeframe. South Estonian has even root-medially pt kt → tː.)
- kt(s) pt(s) → tː(s)
- xk → kː (happens also across word boundaries, precluding the formation of /ʔ/)
- (t)(ː)sn → sː
- kx (tx) → xː
→ @x / _C (vaahtera, jäähty- ?)
- rn ln → rː lː
- wst → st / o_ ? (nouse- ~ nosta-)
- pn tn kn ktn ptn (etc.) → nː
- pm tm km (etc.) → mː
- pst tst kst → st
(The consequent obscuring of many inflected forms due to this and the previous change caused many words to revert back, however. Note especially *pekstä, *pekse- → *peestä, peekse-)
Fricativ collapse, part 3
- ʦ → s / _{# s}_
- ʦ(ː) → θ(ː) (gradational)
- z → h
- x(ː) → h (a spirantic pronunciation can still be found in coda position)
v-epenthesis
- ∅ → ʋ / #_{yː øː oː} (Notable exceptions: yö uoma)
Shifts involving /h/ (unfinished)
- e → @ / h_ in suffixes
- p k → h / _t (With IE loanwords continuing to feed new /pt kt/, this rule remained activ up until the 20th century.)
Late Proto-Finnic to Standard Finnish
[Ca. 1000-1900 CE] These changes are, for the most part, only attested in the Finnish-Karelian continuum.
"Flavor": Voiced prenasal stops become geminate nasals, and (around the same time as in a whole lot of other European languages!) long mid vowels become opening difthongs:
- mb nd ŋg → mː nː ŋː
- eː øː oː → ie yø uo
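These two changes are plain context-free replacements, so a sketch needs nothing more than two ordered substitution tables. The Python below is an illustration only: the name `late_shifts` is hypothetical, the inputs are simplified strings in this document's notation, and the changes are applied in every syllable because the rules above state no stress condition.

```python
# Sketch only: the two "flavor" changes as ordered string replacements.
NASAL_RULES = [("mb", "mː"), ("nd", "nː"), ("ŋg", "ŋː"), ("ŋɡ", "ŋː")]
DIPHTHONG_RULES = [("eː", "ie"), ("øː", "yø"), ("oː", "uo")]


def late_shifts(form: str) -> str:
    """Apply prenasal-stop assimilation, then long mid vowel diphthongization."""
    for src, dst in NASAL_RULES + DIPHTHONG_RULES:
        form = form.replace(src, dst)
    return form


print(late_shifts("randan"))  # ranːan
print(late_shifts("kamban"))  # kamːan
print(late_shifts("teː"))     # tie
print(late_shifts("soː"))     # suo
print(late_shifts("saː"))     # saː (long low vowels are unaffected)
```

Both spellings of the velar stop (ASCII g and IPA ɡ) are accepted, since both appear in this document.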
Changes involving /j/
- j → i / C_ suffix-initially
More shifts with /h/
- Vh → hV / {Vi n l r}_# in eg. vaihe venhe orhi urho alhainen ylhäinen (dialectally regular)
- k h → ˣ / _#
- s → h / _l (kihla pihlaja) (perhaps via *z)
- t → ∅ / h_r (ahrain ihra kehrä ohra) (cf. next)
Pre-sonorant stop vocalization (with an intermediate spirant stage)
Predominantly Germanic loanwords; a few Baltic, and a Uralic etymology exists for *kopra *kotva *kupla *nakris *syklä. By the evidence of other Finnic languages, *Tl in loanwords is initially substituted by *kl (eg. *seeθla → *seekla).
- p → U / _S (hauras kauris koura seura taulu teuras vauras äyräs; also note kupla, from a conservativ dialect)
- t remains _{v, j} (katve ketju kotva latva lotja patja patvi vitja)
→ U / _r{A, O} (aura nöyrä peura puuro uuras)
→ @ / _r{i, e} (teeri)
(any coda examples before i O??)
- k → @ / _j (laaja raaja taaja vaaja)
→ i / {i, e}_S{i, e} (eilen keila leili leiri neilikka peili teili teini tiili) (May have rather occurred in loaning Finland Swedish dialects, except eilen, of unkno'n origin & where Karelian explicitly retains /kl/.)
→ U / {A, O, U}_S (S≠j) (hauli kaula kaura käyrä myyrä mäyrä naula nauris naura- paula vaula väylä vaunu syylä taula uuni); {i e}_Sa (neula seula siula siuna-)
Spirant loss
- β → ∅ / _UC
→ v / other _V
- ið → j / V[-STR]_V
- ð remains V[+STR](X)_
→ l / l_
→ r / r_
→ ∅ / elsewhere
- ɣ → j / C_e
→ v / U_U
→ ? / V1V2_V2 (including the cases of V1=V2; also V2≠U)
→ ∅ / elsewhere
- h → ∅ / V[-STR](X)_V
Subsequent vowel changes in unstressed syllables (unfinished, may need to be meshed with the prev. section)
- AO → Aː, Oː or Uː (seemingly irregularly)
- Ae → Ai
- Ue → eː
- VU → Vː / _#
- iU → Uː
- OU → Oː (kokoontu-; but aitous etc.)
Initial-syllable labialization
- ey → øy
- e i ie → ø y yø / _(X)(C)Cy (if the /y/ is a part of the root)
- i → y / _væ (this one is actually older than the others, but fits here better)
The final stages of interdental loss began after or around the time of the creation of the literary language, seen in spellings such as <tz dh>. By standardization it was however practically complete. The standard outcome is largely a spelling pronunciation based on the example of German and Swedish:
- θ(ː) → ts
- ð → d (commonly alveolar)
Most common dialectal variations for the former are t(ː) and ht~t, for the latter r and ∅.
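The standard and dialectal outcomes can be compared side by side with a small lookup-table sketch in Python. Several things here are my own assumptions: the function and key names are hypothetical, "ht~t" is read as ht for the long interdental and t for the short one, and the two interdentals are resolved independently because the text does not say which outcomes co-occur in which dialects. The input strings are simplified illustrations.

```python
# Sketch only: outcomes of the interdentals as lookup tables.
# THETA maps a variety to (outcome of θː, outcome of θ).
THETA = {"standard": ("ts", "ts"), "t": ("tː", "t"), "ht": ("ht", "t")}
# ETH maps a variety to the outcome of ð.
ETH = {"standard": "d", "r": "r", "zero": ""}


def resolve_interdentals(form: str, theta: str = "standard",
                         eth: str = "standard") -> str:
    """Replace θ(ː) and ð with the outcome chosen for each."""
    long_out, short_out = THETA[theta]
    # The long interdental is replaced first so its length mark is consumed.
    form = form.replace("θː", long_out).replace("θ", short_out)
    return form.replace("ð", ETH[eth])


print(resolve_interdentals("meθːæ"))              # metsæ
print(resolve_interdentals("meθːæ", theta="t"))   # metːæ
print(resolve_interdentals("meθːæ", theta="ht"))  # mehtæ
print(resolve_interdentals("paðan"))              # padan
print(resolve_interdentals("paðan", eth="r"))     # paran
print(resolve_interdentals("paðan", eth="zero"))  # paan
```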