The Monstrous Magyaroid All-ASCII Digraph Device
Latest revision as of 00:21, 10 March 2022
The oh so smart Hungarian digraphs
I'm a great fan of the Hungarian digraphs, where the first letter indicates manner of articulation and the second place of articulation, and have been known to play with such schemes (or the reverse, place + manner):
Graphy | Manner | Place | IPA |
---|---|---|---|
z | voiced sibilant | alveolar | [z] |
s | voiceless sibilant | postalveolar | [ʃ] |
sz | voiceless sibilant | alveolar | [s] |
zs | voiced sibilant | postalveolar | [ʒ] |
c(z) | voiceless affricate | alveolar | [ts] |
cs | voiceless affricate | postalveolar | [tʃ] |
The z in cz has been dropped in modern Hungarian orthography, so that the voiceless alveolar affricate is now c.
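The manner + place reading of the table can be sketched in a few lines of Python. This is only an illustration: the names `MANNER`, `PLACE` and `describe` are made up for the example, and the rule that a graphy without an explicit place letter defaults to alveolar is an assumption that happens to reproduce the six rows above.

```python
# First letter -> manner of articulation (values copied from the table above).
MANNER = {
    "s": "voiceless sibilant",
    "z": "voiced sibilant",
    "c": "voiceless affricate",
}

# Last letter -> place of articulation. For the single letters s and z the
# same letter supplies both manner and place.
PLACE = {"z": "alveolar", "s": "postalveolar"}


def describe(graphy):
    """Return (manner, place) for one of the sibilant/affricate graphies."""
    manner = MANNER[graphy[0]]
    # Assumption: no place letter (modern "c") means alveolar.
    place = PLACE.get(graphy[-1], "alveolar")
    return manner, place


for g in ("z", "s", "sz", "zs", "c", "cz", "cs"):
    print(g, *describe(g))
```

Note that both the old spelling cz and the modern c come out as a voiceless alveolar affricate, which is why dropping the z loses nothing.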
This is all the more amazing as it probably did not originate with someone of unusual phonological acumen for the Middle Ages getting a great idea, but arose by sheer accident: it just so happened that Old High German, the language of those the Hungarians learned the Latin alphabet from, differentiated between an apical voiceless alveolar sibilant /s̺/, spelled s, and a laminal voiceless alveolar sibilant /s̻/, spelled zz. Old High German did not yet have any [ʃ] sound, but of the two OHG sounds the apical sibilant s was perceived as being most similar to Hungarian /ʃ/ (probably because [ʃ] is usually apical), and the laminal sibilant zz was perceived as being more similar to Hungarian /s/ (probably because the latter was laminal). Thus s became the preferred spelling for Hungarian /ʃ/ and /ʒ/, and z became the preferred spelling for Hungarian /s/ and /z/. Voiced and voiceless sibilants weren't distinguished, probably because OHG didn't have any voiced sibilants. (Incidentally the Hungarians downgraded the written representation of their language, since they already had a phonologically adequate writing system, which however was not regarded as proper for writing Christian texts on parchment.) By the time Hungarian writers felt a need to differentiate voiced and voiceless sibilants in a consistent manner they chose to use sz for /s/, probably again because of German influence: in the meantime sz (or rather ſz, usually written as a ligature) had become the preferred spelling for the sound written zz in OHG. So z remained the spelling for /z/ by default.
The 'choice' of s for /ʃ/ was probably simply because that was the usual value of that letter, since /ʃ/ is by far the most frequent of the Hungarian sibilants, while /ʒ/ is the least frequent. The choice of the spelling zs for /ʒ/ was thus probably by default: they already had the spellings sz, z, s assigned to three of the sibilants for the reasons stated, and so there was probably not a lot of reasoning behind the choice of zs for the fourth sibilant, just analogy and parsimony.
Taking it as far as you can
One obvious 'defect' of Hungarian spelling is that voiced affricates are written just as dz, dzs. So my first obvious 'improvement' was to assign x to [dʒ] and xz to [dz], or the reverse x = [dz], xs = [dʒ] (x = [dz] incidentally agreeing with the Albanian mapping for x)!
But I got more ambitious than that (incidentally dropping the x-for-voiced affricate mapping in the process):
Consonants
Place | Voiceless fricative | Voiced fricative | Voiceless stop | Voiced stop | Voiceless affricate | Voiced affricate | Nasal (voiced) | Lateral (voiced) | Tap / flap (voiced) | Trill (voiced) | Approximant (voiced) |
---|---|---|---|---|---|---|---|---|---|---|---|
Bilabial | fw | bw | p | b | pfw | bfw | mw | rw | w | ||
Labiodental | f | fv | pv | bv | pf | bf | m | xv | v | ||
Dental | sz | z | td | d | cz | jz | nz | lz | rz | ||
Apicoalveolar | s | zs | t | dt | cs | js | nr | lr | xr xlr | rr | r |
Laminoalveolar | sc | zc | tc | dc | c | jc | n | l | rl | ||
Palatoalveolar | sj | zj | tj | dj | cj | j | nj | lj | rj | ||
Alveopalatal | sy | zy | ty | dy | cy | jy | ny | ly | ry | ||
Retroflex | sx | zx | tx | dx | cx | jx | nx | lx | x xl | rx | |
Palatal | hy | qy | ky | gy | khy | gqy | ngy | lgy | y | ||
Velar | hk | qg | k | g | khk | gqg | ng | lg | yg | ||
Uvular | hq | q | kq | gq | khq | gqq | nq | lq | rq | yq | |
Pharyngeal | hh | yh | |||||||||
Glottal | h | qh | kh |
Vowels
It's a lot trickier to do a similar scheme with only five vowel letters a, e, i, o, u (since I already had to use y, w as consonants...). However, the five single vowels + all possible combinations of two of them gives 5² = 25 different vowel graphies, which should be enough qualitative distinctions for most languages. So I had a go at distributing those graphies over the vowel space, pretending that the holes in the IPA official vowel chart are really justified, and trying to give the digraphs sensible 'intermediate' values between the single-letter vowels. It's not entirely consistent: I let e and ea swap places so that [ə] would get a single-letter graphy, then based the values of eu and ae on the [ə] value for e...
Height | Front unrounded | Front rounded | Central unrounded | Central rounded | Back unrounded | Back rounded |
---|---|---|---|---|---|---|
High | i | ui | ia | ua | iu | u |
Lower high | ie | ue | uo | |||
High mid | ei | oi | e | eu | io | ou |
Low mid | ea | oe | ae | oa | eo | o |
Low | ai [æ] | a [a] | ao | au |
One will probably have to tweak values to fit specific languages, but that should be OK as long as one picks the nearest suitable graphy.
That may of course be done to cut down on digraphs with consonants also!
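The count of 25 vowel graphies can be checked mechanically. The sketch below assumes that 'combinations of two' means ordered pairs of two different vowel letters (5 singles + 20 pairs), which is what the vowel table actually uses; the variable names are just for illustration.

```python
from itertools import product

VOWELS = "aeiou"

# Five single letters plus every ordered pair of two *different* letters.
singles = list(VOWELS)
digraphs = [a + b for a, b in product(VOWELS, repeat=2) if a != b]
graphies = singles + digraphs

# The graphies as they appear in the vowel table, read row by row.
TABLE = (
    "i ui ia ua iu u "      # high
    "ie ue uo "             # lower high
    "ei oi e eu io ou "     # high mid
    "ea oe ae oa eo o "     # low mid
    "ai a ao au"            # low
).split()

print(len(graphies))                  # size of the inventory
print(set(TABLE) == set(graphies))    # does the table use all of it?
```

Under that reading doubled letters such as aa or ee never occur as quality graphies, so they stay free for other uses.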
Disambiguating
Clearly this scheme needs some means to disambiguate e.g. [lz] from [l̪]. There is a really simple solution: wherever two adjacent letters which could be a digraph belong to different graphies you put a period/full stop between them: lz is [l̪], but [lz] is l.z; kh is /ʔ/, but [kʰ] is k.h. This works because the punctuation character . (the period/full stop) is usually followed by whitespace, another punctuation character or the end-of-text; a period between two letters is then a pretty safe digraph-breaker! Moreover you can, when you are not really restricted to ASCII, use the mid dot (· U+00B7 or decimal 183, thus in Latin-1 and usable even on Yahoo groups... [Yeah, the first version of this page was written that long ago! :-)]) instead of the low period, as Catalan does to distinguish ll [ʎ] from l·l [ll].
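A minimal Python sketch of how a reader (or a program) could apply the digraph-breaker: greedy longest match over the graphies, with a period or mid dot counted as a breaker only when it sits between two letters. The function names and the tiny graphy set are hypothetical; a real inventory would list every graphy from the tables.

```python
BREAKERS = {".", "\u00b7"}  # ASCII period and the mid dot (U+00B7)

# Tiny illustrative subset of the consonant graphies.
GRAPHIES = {"l", "z", "lz", "k", "h", "kh"}


def is_breaker(text, i):
    """A dot only breaks digraphs when it stands between two letters."""
    return (
        text[i] in BREAKERS
        and 0 < i < len(text) - 1
        and text[i - 1].isalpha()
        and text[i + 1].isalpha()
    )


def tokenize(text, graphies=GRAPHIES):
    """Split text into graphies, preferring the longest match."""
    longest = max(len(g) for g in graphies)
    tokens = []
    i = 0
    while i < len(text):
        if is_breaker(text, i):
            i += 1  # drop the breaker; the letters around it stay separate
            continue
        for size in range(longest, 0, -1):
            chunk = text[i:i + size]
            if chunk in graphies:
                tokens.append(chunk)
                i += len(chunk)
                break
        else:
            # Unknown character (e.g. real punctuation): keep it as it is.
            tokens.append(text[i])
            i += 1
    return tokens


print(tokenize("lz"), tokenize("l.z"), tokenize("k\u00b7h"), tokenize("kh."))
```

A sentence-final period is left alone because nothing alphabetic follows it, which is exactly the property that makes the period a safe breaker.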
Secondary articulations
This also comes in handy to provide a means of symbolizing secondary articulations: you can e.g. use an for [ã] but a.n (or a·n) for [an].
Similarly lateral fricatives may be written by putting an l after the graphy for a fricative: sl [ɬ], s.l/s·l [sl], zsl [ɮ], zs.l/zs·l [zl].
Syllabification, diphthongs and hiatus — and quotes
Unfortunately you also need to distinguish not only diphthongs from vowel digraphs, but also diphthongs from vowels in hiatus. A reasonable solution is to use an apostrophe for hiatus: hi'atus is the Latin pronunciation of "hiatus", hi.atus (or hi·atus) is [hi͡atus] with a diphthong, and hiatus is [hɨtus]!
This can come in handy with consonants too: sy is [ɕ], s.y (or s·y) is [sʲ] and s'y is [sj] — if one really needs to distinguish all three.
A good thing I refrained from using the apostrophe as a letter! ;-)
Syllabification and stress
I'm inclined to believe that the difference, if any, between a hiatus and a diphthong always has to do with the relation between prosody and syllabification of a particular language. If there really is a need to distinguish hiatus and stress you can mark stress with ` (backtick/grave) after the vowel or syllable (ka`k, kak`), replaced by letters with grave or acute accents (à, á) when available, increasing the number of backticks for increased levels of stress.
Abusing Accents
Alternatively you can use my scheme for indicating length and stress with acute and grave accent marks. It employs the three most common accent marks, the acute ( ˊ ), the grave ( ˋ ) and the circumflex ( ˆ ) according to the following pattern (exemplified on the letter a):
Stress | Short | Long |
---|---|---|
Unstressed | a | á |
Stressed | à | â |
The impetus for the system comes from the fact that the circumflex graphically looks like a combination of the acute and the grave:
/ + \ = /\
When really restricted to ASCII you can use the ASCII apostrophe, backtick and circumflex after the vowel. You may then use a colon : as hiatus mark and a double colon :: for an actual colon.
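The length/stress pattern is a plain two-by-two lookup, so it is easy to sketch in Python. The function names are hypothetical, and the ASCII fallback assumes that apostrophe, backtick and circumflex stand in for acute, grave and circumflex respectively (the text lists them in that order but does not spell out the pairing).

```python
import unicodedata

# (stressed, long) -> combining accent, following the pattern above.
COMBINING = {
    (False, False): "",        # unstressed short: bare vowel
    (False, True): "\u0301",   # unstressed long: acute
    (True, False): "\u0300",   # stressed short: grave
    (True, True): "\u0302",    # stressed long: circumflex
}

# Assumed ASCII stand-ins, written after the vowel.
ASCII = {
    (False, False): "",
    (False, True): "'",
    (True, False): "`",
    (True, True): "^",
}


def mark(vowel, stressed, long):
    """Return the vowel with the accent for the given stress and length."""
    accented = vowel + COMBINING[(stressed, long)]
    # NFC composes letter + combining mark into one character where possible.
    return unicodedata.normalize("NFC", accented)


def mark_ascii(vowel, stressed, long):
    """ASCII-only variant: the mark follows the vowel."""
    return vowel + ASCII[(stressed, long)]


for stressed in (False, True):
    for long in (False, True):
        print(stressed, long, mark("a", stressed, long), mark_ascii("a", stressed, long))
```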
Tone and pitch
For tone I recommend using tone numbers, either arbitrary language-specific ones or Yuen Ren Chao-style ones where numbers 1–5 indicate relative pitch levels, contours are indicated with juxtaposed digits, and lack of tone is indicated with 0. The numbers should be written after the syllable or word with no intervening space or punctuation, superscript or subscript (to distinguish from footnote markers). If you feel the need to distinguish actual numbers use the hiatus marker before the latter.
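Reading Chao-style tone digits back out of a text is a one-regex job. The sketch below is an illustration only: `parse_tones` is a made-up name, the syllable ma is a placeholder rather than data from any language, and the apostrophe is assumed as the hiatus marker that protects an actual number.

```python
import re

# One or more letters immediately followed by one or more digits 0-5.
# [^\W\d_] matches any Unicode letter (word character minus digits and "_").
SYLLABLE = re.compile(r"([^\W\d_]+)([0-5]+)")


def parse_tones(text):
    """Return (syllable, [pitch levels]) pairs found in the text."""
    return [
        (m.group(1), [int(d) for d in m.group(2)])
        for m in SYLLABLE.finditer(text)
    ]


print(parse_tones("ma55 ma35 ma214 ma0"))
print(parse_tones("ma'5"))  # the marker separates the digit from the syllable
```

Because the digits must directly follow the letters, a number set off by the hiatus marker is not picked up as a tone.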
Apostrophes and quotes
Actual apostrophes may be preceded with a dot or doubled. You may want to use single (") and doubled ("") double-quote characters to indicate two levels of quotes.
Length indication
This is simple: just double the first letter of the digraph: seei is "se" [seː], the Swedish word for 'see', akkya is [acca], and addta is [ad̺d̺a].
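As a quick sanity check, the doubling rule is a one-liner in Python (the name `lengthen` is hypothetical). It reproduces the three examples when given the graphies ei, ky and dt:

```python
def lengthen(graphy):
    """Mark a graphy as long by doubling its first letter."""
    return graphy[0] + graphy


# Rebuild the examples from the text out of their parts.
print("s" + lengthen("ei"))         # long ei after s
print("a" + lengthen("ky") + "a")   # long ky between two a
print("a" + lengthen("dt") + "a")   # long dt between two a
```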
Voiceless sonorants
Voiceless sonorants are written by putting an h before the graphy for the voiced counterpart: hw, hv, hl, hr, hn, hng etc.
It's probably a good idea to use hl, hlz etc. for voiceless lateral fricatives, as voiceless lateral approximants probably never are phonemic in the wild.
Also I decided to use hy rather than hky for [ç], as a distinction between voiceless palatal fricative and voiceless palatal approximant probably doesn't occur in the wild. The trigraph fricative spelling is of course there if you really need it.
Assume homorganicness
Nasals and maybe also laterals may probably be assumed to be homorganic with a following obstruent unless otherwise indicated. Thus nt, n.j, nty, n.g etc. may safely be used instead of nrt, njj, nyty, ngg etc. (although your preferences may vary as to which of njj/n.j and ngg/n.g is better!). To mark a nasal as explicitly laminoalveolar one can use nl. Since the difference between [nˡ] and [lⁿ] is probably philosophical, ln can be used for a nasal lateral/lateral nasal alike.
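The shorthand can be expanded back to the explicit spelling by looking up which row of the consonant table the obstruent belongs to. A rough Python sketch, assuming the input is always n (plus an optional breaker) followed by exactly one obstruent graphy; `ROWS`, `NASAL_FOR` and `explicit` are names invented for the example.

```python
# Nasal graphy -> the six obstruent graphies of the same place of
# articulation, copied row by row from the consonant table.
ROWS = {
    "mw": "fw bw p b pfw bfw",     # bilabial
    "m": "f fv pv bv pf bf",       # labiodental
    "nz": "sz z td d cz jz",       # dental
    "nr": "s zs t dt cs js",       # apicoalveolar
    "n": "sc zc tc dc c jc",       # laminoalveolar
    "nj": "sj zj tj dj cj j",      # palatoalveolar
    "ny": "sy zy ty dy cy jy",     # alveopalatal
    "nx": "sx zx tx dx cx jx",     # retroflex
    "ngy": "hy qy ky gy khy gqy",  # palatal
    "ng": "hk qg k g khk gqg",     # velar
    "nq": "hq q kq gq khq gqq",    # uvular
}

# Invert the table: obstruent graphy -> homorganic nasal graphy.
NASAL_FOR = {obs: nasal for nasal, row in ROWS.items() for obs in row.split()}


def explicit(cluster):
    """Expand shorthand like 'nt' or 'n.j' to the fully specified spelling."""
    if not cluster.startswith("n"):
        raise ValueError("expected a cluster starting with n")
    # Drop the n and any digraph-breaker (period or mid dot) after it.
    obstruent = cluster[1:].lstrip(".\u00b7")
    return NASAL_FOR[obstruent] + obstruent


for short in ("nt", "n.j", "nty", "n.g"):
    print(short, "->", explicit(short))
```

The breaker in n.j and n.g matters on the way in: without it nj and ng would already be read as the palatoalveolar and velar nasals on their own.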
That's all for now... --BPJ 11:18, 23 September 2011 (PDT)