Natlang Uses of Diacritics in the Latin Alphabet: Difference between revisions

From FrathWiki
Revision as of 01:49, 16 August 2012

This page will list different uses of diacritical marks that have natlang precedent. Conlangers can use it to find inspiration for their own conlang romanizations.

Note that in this article combining diacritics are attached to a ◌. Diacritics without a ◌, like ¨ for example, are non-combining. Non-combining diacritics are sometimes called modifier letters in Unicode. When a letter is referred to without regard to case, it is displayed like so: Ťť. This is for clarity's sake, because some diacritics may look different depending on the letter's case.

Caron

Precomposed Letters with Caron
{| class="wikitable"
| ˇ || ◌̌ || Ǎ || ǎ || Č || č || Ď || ď || DŽ || Dž || dž || Ě || ě
|-
| U+02C7 || U+030C || U+01CD || U+01CE || U+010C || U+010D || U+010E || U+010F || U+01C4 || U+01C5 || U+01C6 || U+011A || U+011B
|-
| Caron || Combining Caron || Latin Capital Letter A With Caron || Latin Small Letter A With Caron || Latin Capital Letter C With Caron || Latin Small Letter C With Caron || Latin Capital Letter D With Caron || Latin Small Letter D With Caron || Latin Capital Letter Dz With Caron || Latin Capital Letter D With Small Letter Z With Caron || Latin Small Letter Dz With Caron || Latin Capital Letter E With Caron || Latin Small Letter E With Caron
|-
| '''Note:''' May be confused with Modifier Letter Down Arrowhead ˅ (U+02C5). || || || || || || || '''Note:''' The caron actually looks like an apostrophe placed to the right of the ascender of the d. || || || || ||
|-
| Ǧ || ǧ || Ȟ || ȟ || Ǐ || ǐ || ǰ || Ǩ || ǩ || Ľ || ľ || Ň || ň
|-
| U+01E6 || U+01E7 || U+021E || U+021F || U+01CF || U+01D0 || U+01F0 || U+01E8 || U+01E9 || U+013D || U+013E || U+0147 || U+0148
|-
| Latin Capital Letter G With Caron || Latin Small Letter G With Caron || Latin Capital Letter H With Caron || Latin Small Letter H With Caron || Latin Capital Letter I With Caron || Latin Small Letter I With Caron || Latin Small Letter J With Caron || Latin Capital Letter K With Caron || Latin Small Letter K With Caron || Latin Capital Letter L With Caron || Latin Small Letter L With Caron || Latin Capital Letter N With Caron || Latin Small Letter N With Caron
|-
| || || || || || || || || || colspan="2" | '''Note:''' The caron actually looks like an apostrophe placed to the right of the ascender of the Ll. || ||
|-
| Ǒ || ǒ || Ř || ř || Š || š || Ṧ || ṧ || Ť || ť || Ǔ || ǔ || Ǚ
|-
| U+01D1 || U+01D2 || U+0158 || U+0159 || U+0160 || U+0161 || U+1E66 || U+1E67 || U+0164 || U+0165 || U+01D3 || U+01D4 || U+01D9
|-
| Latin Capital Letter O With Caron || Latin Small Letter O With Caron || Latin Capital Letter R With Caron || Latin Small Letter R With Caron || Latin Capital Letter S With Caron || Latin Small Letter S With Caron || Latin Capital Letter S With Caron And Dot Above || Latin Small Letter S With Caron And Dot Above || Latin Capital Letter T With Caron || Latin Small Letter T With Caron || Latin Capital Letter U With Caron || Latin Small Letter U With Caron || Latin Capital Letter U With Diaeresis And Caron
|-
| || || || || || || || || || '''Note:''' The caron actually looks like an apostrophe placed to the right of the ascender of the t. || || ||
|-
| ǚ || Ž || ž || Ǯ || ǯ
|-
| U+01DA || U+017D || U+017E || U+01EE || U+01EF
|-
| Latin Small Letter U With Diaeresis And Caron || Latin Capital Letter Z With Caron || Latin Small Letter Z With Caron || Latin Capital Letter Ezh With Caron || Latin Small Letter Ezh With Caron
|}
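The three rows used for each letter in the table above (glyph, code point, character name) can be regenerated from the Unicode character database, which is a convenient way to double-check an entry. This is only a sketch; table_row is a hypothetical helper name, and it relies on Python's standard unicodedata module.

```python
import unicodedata

def table_row(ch):
    """Return (glyph, code point, name) for one precomposed letter."""
    code = "U+{:04X}".format(ord(ch))
    # Unicode names are all upper case; title-case them to match the tables here.
    name = unicodedata.name(ch).title()
    return (ch, code, name)

# Print the entries for Ǎ, ǎ, Č and č.
for ch in "\u01cd\u01ce\u010c\u010d":
    print(*table_row(ch), sep=" | ")
```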

The caron is also known as háček or haček. It originated from the dot above in Czech orthography.[1] Note that the caron is easily confused with the similar-looking breve ˘, especially at small font sizes.

Uses of Caron
{| class="wikitable"
! Usage !! Language !! Letters !! Notes
|-
| rowspan="2" | Postalveolar consonant || Latgalian, Latvian || Čč /tʃ/, Šš /ʃ/, Žž /ʒ/ || Unaccented Cc stands for /ts/ in Latvian and Latgalian.
|-
| Livonian || Šš /ʃ/, Žž /ʒ/ ||
|}

Cedilla

Precomposed Letters with Cedilla
{| class="wikitable"
| ¸ || ◌̧ || Ç || ç || Ḉ || ḉ || Ḑ || ḑ || Ȩ || ȩ || Ḝ || ḝ || Ģ
|-
| U+00B8 || U+0327 || U+00C7 || U+00E7 || U+1E08 || U+1E09 || U+1E10 || U+1E11 || U+0228 || U+0229 || U+1E1C || U+1E1D || U+0122
|-
| Cedilla || Combining Cedilla || Latin Capital Letter C With Cedilla || Latin Small Letter C With Cedilla || Latin Capital Letter C With Cedilla And Acute || Latin Small Letter C With Cedilla And Acute || Latin Capital Letter D With Cedilla || Latin Small Letter D With Cedilla || Latin Capital Letter E With Cedilla || Latin Small Letter E With Cedilla || Latin Capital Letter E With Cedilla And Breve || Latin Small Letter E With Cedilla And Breve || Latin Capital Letter G With Cedilla
|-
| ģ || Ḩ || ḩ || Ķ || ķ || Ļ || ļ || Ņ || ņ || Ŗ || ŗ || Ş || ş
|-
| U+0123 || U+1E28 || U+1E29 || U+0136 || U+0137 || U+013B || U+013C || U+0145 || U+0146 || U+0156 || U+0157 || U+015E || U+015F
|-
| Latin Small Letter G With Cedilla || Latin Capital Letter H With Cedilla || Latin Small Letter H With Cedilla || Latin Capital Letter K With Cedilla || Latin Small Letter K With Cedilla || Latin Capital Letter L With Cedilla || Latin Small Letter L With Cedilla || Latin Capital Letter N With Cedilla || Latin Small Letter N With Cedilla || Latin Capital Letter R With Cedilla || Latin Small Letter R With Cedilla || Latin Capital Letter S With Cedilla || Latin Small Letter S With Cedilla
|-
| '''Note:''' The diacritic is placed on top of the letter to avoid the descender of the g. || || || || || || || || || || || '''Note:''' May be confused with Latin Capital Letter S With Comma Below Ș (U+0218). || '''Note:''' May be confused with Latin Small Letter S With Comma Below ș (U+0219).
|-
| Ţ || ţ
|-
| U+0162 || U+0163
|-
| Latin Capital Letter T With Cedilla || Latin Small Letter T With Cedilla
|-
| '''Note:''' May be confused with Latin Capital Letter T With Comma Below Ț (U+021A). || '''Note:''' May be confused with Latin Small Letter T With Comma Below ț (U+021B).
|}

The cedilla originates from a cursive form of Z.[2] Note that the cedilla may be confused with the ogonek ˛ or the comma below ◌̦. In some fonts, the cedilla on certain letters may look identical to a comma. In Romanian, the letters Șș and Țț are supposed to take a comma below rather than a cedilla, while in most other languages Şş and Ţţ are supposed to take a cedilla.
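Because the cedilla and comma-below letters are separate code points, the two can be told apart in software even when a font draws them alike. The sketch below uses Python's standard unicodedata module to show the differing decompositions; decompose and to_comma_below are hypothetical helper names, and the mapping is only meant for text already known to be Romanian.

```python
import unicodedata

def decompose(ch):
    """Return the names of the code points in the canonical decomposition of ch."""
    return [unicodedata.name(c) for c in unicodedata.normalize("NFD", ch)]

# ş (U+015F) carries a cedilla, ș (U+0219) carries a comma below.
print(decompose("\u015f"))
print(decompose("\u0219"))

# Hypothetical clean-up table: Ş ş Ţ ţ (cedilla) to Ș ș Ț ț (comma below).
CEDILLA_TO_COMMA = str.maketrans("\u015e\u015f\u0162\u0163",
                                 "\u0218\u0219\u021a\u021b")

def to_comma_below(text):
    """Replace cedilla forms of S and T with their comma-below counterparts."""
    return text.translate(CEDILLA_TO_COMMA)

print(to_comma_below("\u015ftiin\u0163\u0103"))
```

The replacement should not be applied blindly to other languages, since Şş with a cedilla is the intended letter there.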

Uses of Cedilla
{| class="wikitable"
! Usage !! Language !! Letters !! Notes
|-
| rowspan="2" | Palatal consonant || Latgalian, Latvian || Ģģ /ɟ/, Ķķ /c/, Ļļ /ʎ/, Ņņ /ɲ/ ||
|-
| Livonian || Ḑḑ /ɟ/, Ļļ /ʎ/, Ņņ /ɲ/, Ţţ /c/ ||
|-
| Palatalized consonant || Livonian || Ŗŗ /rʲ/ ||
|}

Diaeresis/Umlaut

Precomposed Letters with Diaeresis/Umlaut
{| class="wikitable"
| ¨ || ◌̈ || Ä || ä || Ǟ || ǟ || Ë || ë || Ḧ || ḧ || Ï || ï || Ḯ
|-
| U+00A8 || U+0308 || U+00C4 || U+00E4 || U+01DE || U+01DF || U+00CB || U+00EB || U+1E26 || U+1E27 || U+00CF || U+00EF || U+1E2E
|-
| Diaeresis || Combining Diaeresis || Latin Capital Letter A With Diaeresis || Latin Small Letter A With Diaeresis || Latin Capital Letter A With Diaeresis And Macron || Latin Small Letter A With Diaeresis And Macron || Latin Capital Letter E With Diaeresis || Latin Small Letter E With Diaeresis || Latin Capital Letter H With Diaeresis || Latin Small Letter H With Diaeresis || Latin Capital Letter I With Diaeresis || Latin Small Letter I With Diaeresis || Latin Capital Letter I With Diaeresis And Acute
|-
| ḯ || Ö || ö || Ȫ || ȫ || Ṏ || ṏ || ẗ || Ü || ü || Ǖ || ǖ || Ǘ
|-
| U+1E2F || U+00D6 || U+00F6 || U+022A || U+022B || U+1E4E || U+1E4F || U+1E97 || U+00DC || U+00FC || U+01D5 || U+01D6 || U+01D7
|-
| Latin Small Letter I With Diaeresis And Acute || Latin Capital Letter O With Diaeresis || Latin Small Letter O With Diaeresis || Latin Capital Letter O With Diaeresis And Macron || Latin Small Letter O With Diaeresis And Macron || Latin Capital Letter O With Tilde And Diaeresis || Latin Small Letter O With Tilde And Diaeresis || Latin Small Letter T With Diaeresis || Latin Capital Letter U With Diaeresis || Latin Small Letter U With Diaeresis || Latin Capital Letter U With Diaeresis And Macron || Latin Small Letter U With Diaeresis And Macron || Latin Capital Letter U With Diaeresis And Acute
|-
| ǘ || Ǚ || ǚ || Ǜ || ǜ || Ṻ || ṻ || Ẅ || ẅ || Ẍ || ẍ || Ÿ || ÿ
|-
| U+01D8 || U+01D9 || U+01DA || U+01DB || U+01DC || U+1E7A || U+1E7B || U+1E84 || U+1E85 || U+1E8C || U+1E8D || U+0178 || U+00FF
|-
| Latin Small Letter U With Diaeresis And Acute || Latin Capital Letter U With Diaeresis And Caron || Latin Small Letter U With Diaeresis And Caron || Latin Capital Letter U With Diaeresis And Grave || Latin Small Letter U With Diaeresis And Grave || Latin Capital Letter U With Macron And Diaeresis || Latin Small Letter U With Macron And Diaeresis || Latin Capital Letter W With Diaeresis || Latin Small Letter W With Diaeresis || Latin Capital Letter X With Diaeresis || Latin Small Letter X With Diaeresis || Latin Capital Letter Y With Diaeresis || Latin Small Letter Y With Diaeresis
|}

Diaeresis (known as tréma in French) and umlaut both use the same character, but they differ in function. A letter with an umlaut stands for a completely different sound than its unaccented counterpart. For example, in Swedish <o> represents /u/ while <ö> represents /ø/. A diaeresis, on the other hand, does not change the sound value of a letter; it marks that a vowel is not part of a diphthong or digraph.

Uses of Diaeresis or Umlaut
{| class="wikitable"
! Usage !! Language !! Letters !! Notes
|-
| rowspan="4" | Front version of back vowel || Estonian || Ää /æ/, Öö /ø/, Üü /y/ ||
|-
| Finnish || Ää /æ/, Öö /ø/ || Usage borrowed from Swedish.
|-
| Livonian || Ää /æ/, Ǟǟ /æː/ ||
|-
| Swedish || Ää /ɛ/, Öö /ø/ || The umlaut evolved from the letter e in the digraphs ae[3] and oe[4].
|-
| Syllable break. When two vowels follow each other, a diaeresis on the second vowel indicates that the vowels are in two different syllables instead of forming a diphthong. || French || Ëë, Ïï, Üü, Ÿÿ ||
|}

Dot Above

Precomposed Letters with Dot Above
{| class="wikitable"
| ˙ || ◌̇ || Ȧ || ȧ || Ǡ || ǡ || Ḃ || ḃ || Ċ || ċ || Ḋ || ḋ || Ė
|-
| U+02D9 || U+0307 || U+0226 || U+0227 || U+01E0 || U+01E1 || U+1E02 || U+1E03 || U+010A || U+010B || U+1E0A || U+1E0B || U+0116
|-
| Dot Above || Combining Dot Above || Latin Capital Letter A With Dot Above || Latin Small Letter A With Dot Above || Latin Capital Letter A With Dot Above And Macron || Latin Small Letter A With Dot Above And Macron || Latin Capital Letter B With Dot Above || Latin Small Letter B With Dot Above || Latin Capital Letter C With Dot Above || Latin Small Letter C With Dot Above || Latin Capital Letter D With Dot Above || Latin Small Letter D With Dot Above || Latin Capital Letter E With Dot Above
|-
| ė || Ḟ || ḟ || Ġ || ġ || Ḣ || ḣ || İ || i || Ṁ || ṁ || Ṅ || ṅ
|-
| U+0117 || U+1E1E || U+1E1F || U+0120 || U+0121 || U+1E22 || U+1E23 || U+0130 || U+0069 || U+1E40 || U+1E41 || U+1E44 || U+1E45
|-
| Latin Small Letter E With Dot Above || Latin Capital Letter F With Dot Above || Latin Small Letter F With Dot Above || Latin Capital Letter G With Dot Above || Latin Small Letter G With Dot Above || Latin Capital Letter H With Dot Above || Latin Small Letter H With Dot Above || Latin Capital Letter I With Dot Above || Latin Small Letter I || Latin Capital Letter M With Dot Above || Latin Small Letter M With Dot Above || Latin Capital Letter N With Dot Above || Latin Small Letter N With Dot Above
|-
| || || || || || || || colspan="2" | '''Note:''' In most languages i is the lowercase version of I, but in Turkish İ pairs with i and I pairs with ı. If Turkish casing is used, make sure that the software involved handles it correctly. For example, dictionaries need to sort the letters in the right order. || || || ||
|-
| Ȯ || ȯ || Ȱ || ȱ || Ṗ || ṗ || Ṙ || ṙ || Ṡ || ṡ || ẛ || Ṥ || ṥ
|-
| U+022E || U+022F || U+0230 || U+0231 || U+1E56 || U+1E57 || U+1E58 || U+1E59 || U+1E60 || U+1E61 || U+1E9B || U+1E64 || U+1E65
|-
| Latin Capital Letter O With Dot Above || Latin Small Letter O With Dot Above || Latin Capital Letter O With Dot Above And Macron || Latin Small Letter O With Dot Above And Macron || Latin Capital Letter P With Dot Above || Latin Small Letter P With Dot Above || Latin Capital Letter R With Dot Above || Latin Small Letter R With Dot Above || Latin Capital Letter S With Dot Above || Latin Small Letter S With Dot Above || Latin Small Letter Long S With Dot Above || Latin Capital Letter S With Acute And Dot Above || Latin Small Letter S With Acute And Dot Above
|-
| Ṧ || ṧ || Ṩ || ṩ || Ṫ || ṫ || Ẇ || ẇ || Ẋ || ẋ || Ẏ || ẏ || Ż
|-
| U+1E66 || U+1E67 || U+1E68 || U+1E69 || U+1E6A || U+1E6B || U+1E86 || U+1E87 || U+1E8A || U+1E8B || U+1E8E || U+1E8F || U+017B
|-
| Latin Capital Letter S With Caron And Dot Above || Latin Small Letter S With Caron And Dot Above || Latin Capital Letter S With Dot Below And Dot Above || Latin Small Letter S With Dot Below And Dot Above || Latin Capital Letter T With Dot Above || Latin Small Letter T With Dot Above || Latin Capital Letter W With Dot Above || Latin Small Letter W With Dot Above || Latin Capital Letter X With Dot Above || Latin Small Letter X With Dot Above || Latin Capital Letter Y With Dot Above || Latin Small Letter Y With Dot Above || Latin Capital Letter Z With Dot Above
|-
| ż
|-
| U+017C
|-
| Latin Small Letter Z With Dot Above
|}
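The Turkish casing issue mentioned in the note on İ and i can be seen directly in Python, whose built-in case conversion is locale-independent. The following is only a sketch under that assumption; tr_lower and tr_upper are hypothetical helper names, not library functions, and sorting (collation) is not covered here.

```python
# Hypothetical helpers: the Turkish pairs İ/i and I/ı are remapped by hand
# before the locale-independent built-in case conversion is applied.
_TR_LOWER = str.maketrans({"I": "\u0131", "\u0130": "i"})
_TR_UPPER = str.maketrans({"i": "\u0130", "\u0131": "I"})

def tr_lower(text):
    """Lowercase text using the Turkish dotted/dotless i pairing."""
    return text.translate(_TR_LOWER).lower()

def tr_upper(text):
    """Uppercase text using the Turkish dotted/dotless i pairing."""
    return text.translate(_TR_UPPER).upper()

# The default mapping turns İ into i followed by a combining dot above (U+0307).
print("\u0130".lower() == "i\u0307")
# With the Turkish-aware helpers, I and ı as well as İ and i stay paired.
print(tr_lower("I\u0130"), tr_upper("i\u0131"))
```

A conlang romanization that borrows the dotted and dotless i would likely run into the same problem in any software that assumes the usual I/i pairing.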
Uses of Dot Above
{| class="wikitable"
! Usage !! Language !! Letters !! Notes
|-
| Raised vowel || Livonian || Ȯȯ /ʊ/, Ȱȱ /ʊː/ ||
|}

Macron

Precomposed Letters with Macron
{| class="wikitable"
| ¯ || ˉ || ◌̄ || Ā || ā || Ǟ || ǟ || Ǡ || ǡ || Ǣ || ǣ || Ē || ē
|-
| U+00AF || U+02C9 || U+0304 || U+0100 || U+0101 || U+01DE || U+01DF || U+01E0 || U+01E1 || U+01E2 || U+01E3 || U+0112 || U+0113
|-
| Macron || Modifier Letter Macron || Combining Macron || Latin Capital Letter A With Macron || Latin Small Letter A With Macron || Latin Capital Letter A With Diaeresis And Macron || Latin Small Letter A With Diaeresis And Macron || Latin Capital Letter A With Dot Above And Macron || Latin Small Letter A With Dot Above And Macron || Latin Capital Letter Ae With Macron || Latin Small Letter Ae With Macron || Latin Capital Letter E With Macron || Latin Small Letter E With Macron
|-
| '''Note:''' May be confused with Overline ‾ (U+203E), Combining Double Macron ◌͞ (U+035E) or Superscript Minus ⁻ (U+207B). || || || || || || || || || || || ||
|-
| Ḕ || ḕ || Ḗ || ḗ || Ḡ || ḡ || Ī || ī || Ḹ || ḹ || Ō || ō || Ǭ
|-
| U+1E14 || U+1E15 || U+1E16 || U+1E17 || U+1E20 || U+1E21 || U+012A || U+012B || U+1E38 || U+1E39 || U+014C || U+014D || U+01EC
|-
| Latin Capital Letter E With Macron And Grave || Latin Small Letter E With Macron And Grave || Latin Capital Letter E With Macron And Acute || Latin Small Letter E With Macron And Acute || Latin Capital Letter G With Macron || Latin Small Letter G With Macron || Latin Capital Letter I With Macron || Latin Small Letter I With Macron || Latin Capital Letter L With Dot Below And Macron || Latin Small Letter L With Dot Below And Macron || Latin Capital Letter O With Macron || Latin Small Letter O With Macron || Latin Capital Letter O With Ogonek And Macron
|-
| ǭ || Ṑ || ṑ || Ṓ || ṓ || Ȫ || ȫ || Ȭ || ȭ || Ȱ || ȱ || Ṝ || ṝ
|-
| U+01ED || U+1E50 || U+1E51 || U+1E52 || U+1E53 || U+022A || U+022B || U+022C || U+022D || U+0230 || U+0231 || U+1E5C || U+1E5D
|-
| Latin Small Letter O With Ogonek And Macron || Latin Capital Letter O With Macron And Grave || Latin Small Letter O With Macron And Grave || Latin Capital Letter O With Macron And Acute || Latin Small Letter O With Macron And Acute || Latin Capital Letter O With Diaeresis And Macron || Latin Small Letter O With Diaeresis And Macron || Latin Capital Letter O With Tilde And Macron || Latin Small Letter O With Tilde And Macron || Latin Capital Letter O With Dot Above And Macron || Latin Small Letter O With Dot Above And Macron || Latin Capital Letter R With Dot Below And Macron || Latin Small Letter R With Dot Below And Macron
|-
| Ū || ū || Ṻ || ṻ || Ǖ || ǖ || Ȳ || ȳ
|-
| U+016A || U+016B || U+1E7A || U+1E7B || U+01D5 || U+01D6 || U+0232 || U+0233
|-
| Latin Capital Letter U With Macron || Latin Small Letter U With Macron || Latin Capital Letter U With Macron And Diaeresis || Latin Small Letter U With Macron And Diaeresis || Latin Capital Letter U With Diaeresis And Macron || Latin Small Letter U With Diaeresis And Macron || Latin Capital Letter Y With Macron || Latin Small Letter Y With Macron
|}
Uses of Macron
{| class="wikitable"
! Usage !! Language !! Letters !! Notes
|-
| rowspan="3" | Long vowel || Latgalian || Āā /ɑː/, Ēē /eː/, Īī /iː/, Ōō /oː/, Ūū /uː/ ||
|-
| Latvian || Āā /ɑː/, Ēē /eː/ and /æː/, Īī /iː/, Ūū /uː/ ||
|-
| Livonian || Āā /ɑː/, Ǟǟ /æː/, Ēē /ɛː/, Īī /iː/, Ōō /oː/, Ȱȱ /ʊː/, Ȭȭ /ɨː/, Ūū /uː/ ||
|}

Ring Above

Precomposed Letters with Ring Above
{| class="wikitable"
| ˚ || ◌̊ || Å || å || Ǻ || ǻ || Ů || ů || ẘ || ẙ
|-
| U+02DA || U+030A || U+00C5 || U+00E5 || U+01FA || U+01FB || U+016E || U+016F || U+1E98 || U+1E99
|-
| Ring Above || Combining Ring Above || Latin Capital Letter A With Ring Above || Latin Small Letter A With Ring Above || Latin Capital Letter A With Ring Above And Acute || Latin Small Letter A With Ring Above And Acute || Latin Capital Letter U With Ring Above || Latin Small Letter U With Ring Above || Latin Small Letter W With Ring Above || Latin Small Letter Y With Ring Above
|-
| '''Note:''' May be confused with the Degree Sign ° (U+00B0). || || '''Note:''' May be confused with the Ångström Sign Å (U+212B). || || || || || || ||
|}
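The confusion between Å and the Ångström sign can be removed in software, because Unicode normalization maps the sign to the ordinary letter. The degree sign and the ring above, by contrast, stay distinct. Below is a small sketch with Python's standard unicodedata module; the constant names and the nfc helper are chosen for this illustration only.

```python
import unicodedata

ANGSTROM_SIGN = "\u212b"  # Å as a unit symbol
A_WITH_RING = "\u00c5"    # Å as a letter
DEGREE_SIGN = "\u00b0"
RING_ABOVE = "\u02da"

def nfc(text):
    """Return text in Unicode Normalization Form C."""
    return unicodedata.normalize("NFC", text)

# The two characters differ as raw code points but are equal after NFC.
print(ANGSTROM_SIGN == A_WITH_RING, nfc(ANGSTROM_SIGN) == A_WITH_RING)
# Normalization does not merge the degree sign with the spacing ring above.
print(nfc(DEGREE_SIGN) == nfc(RING_ABOVE))
```

Normalizing text to NFC before comparing or searching is therefore enough for the Ångström case, while the degree sign has to be caught by eye or by an explicit check.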
Uses of Ring Above
{| class="wikitable"
! Usage !! Language !! Letters !! Notes
|-
| rowspan="3" | Back version of front vowel. Often also rounded. || Chamorro || Åå /ɑ/ ||
|-
| Danish, Norwegian || Åå /ɔ/ || From an earlier digraph aa representing /ɔ/, which in turn came from /aː/.[5]
|-
| Swedish || Åå /o/ || From an earlier digraph aa representing /ɔ/, which in turn came from /aː/.[6]
|}

Tilde

Precomposed Letters with Tilde
{| class="wikitable"
| ~ || ˜ || ◌̃ || Ã || ã || Ẫ || ẫ || Ẵ || ẵ || Ẽ || ẽ || Ễ || ễ
|-
| U+007E || U+02DC || U+0303 || U+00C3 || U+00E3 || U+1EAA || U+1EAB || U+1EB4 || U+1EB5 || U+1EBC || U+1EBD || U+1EC4 || U+1EC5
|-
| Tilde || Small Tilde || Combining Tilde || Latin Capital Letter A With Tilde || Latin Small Letter A With Tilde || Latin Capital Letter A With Circumflex And Tilde || Latin Small Letter A With Circumflex And Tilde || Latin Capital Letter A With Breve And Tilde || Latin Small Letter A With Breve And Tilde || Latin Capital Letter E With Tilde || Latin Small Letter E With Tilde || Latin Capital Letter E With Circumflex And Tilde || Latin Small Letter E With Circumflex And Tilde
|-
| Ĩ || ĩ || Ñ || ñ || Õ || õ || Ȭ || ȭ || Ṍ || ṍ || Ṏ || ṏ || Ỗ
|-
| U+0128 || U+0129 || U+00D1 || U+00F1 || U+00D5 || U+00F5 || U+022C || U+022D || U+1E4C || U+1E4D || U+1E4E || U+1E4F || U+1ED6
|-
| Latin Capital Letter I With Tilde || Latin Small Letter I With Tilde || Latin Capital Letter N With Tilde || Latin Small Letter N With Tilde || Latin Capital Letter O With Tilde || Latin Small Letter O With Tilde || Latin Capital Letter O With Tilde And Macron || Latin Small Letter O With Tilde And Macron || Latin Capital Letter O With Tilde And Acute || Latin Small Letter O With Tilde And Acute || Latin Capital Letter O With Tilde And Diaeresis || Latin Small Letter O With Tilde And Diaeresis || Latin Capital Letter O With Circumflex And Tilde
|-
| ỗ || Ỡ || ỡ || Ũ || ũ || Ṹ || ṹ || Ữ || ữ || Ṽ || ṽ || Ỹ || ỹ
|-
| U+1ED7 || U+1EE0 || U+1EE1 || U+0168 || U+0169 || U+1E78 || U+1E79 || U+1EEE || U+1EEF || U+1E7C || U+1E7D || U+1EF8 || U+1EF9
|-
| Latin Small Letter O With Circumflex And Tilde || Latin Capital Letter O With Horn And Tilde || Latin Small Letter O With Horn And Tilde || Latin Capital Letter U With Tilde || Latin Small Letter U With Tilde || Latin Capital Letter U With Tilde And Acute || Latin Small Letter U With Tilde And Acute || Latin Capital Letter U With Horn And Tilde || Latin Small Letter U With Horn And Tilde || Latin Capital Letter V With Tilde || Latin Small Letter V With Tilde || Latin Capital Letter Y With Tilde || Latin Small Letter Y With Tilde
|}
Uses of Tilde
{| class="wikitable"
! Usage !! Language !! Letters !! Notes
|-
| Unrounded vowel || Estonian || Õõ /ɤ/ ||
|-
| Other || Livonian || Õõ /ɨ/, Ȭȭ /ɨː/ ||
|}