Natlang Uses of Diacritics in the Latin Alphabet: Difference between revisions

From FrathWiki
(214 intermediate revisions by 4 users not shown)
Line 1: Line 1:
{{WIP}}
This is a collection of articles that list different uses of diacritical marks that have natlang precedent. Conlangers can use this to find inspiration for their own conlang's orthography or transliteration. These articles can also serve as a reference for those designing a keyboard layout.<br>
This page will list different uses of diacritical marks that have natlang precedent. Conlangers can use this to find inspiration for their own conlang romanizations.
<br>
:Note that in this article combining diacritics are attached to a ◌. Diacritics without a ◌, like ¨ for example, are non-combining. Non-combining diacritics are sometimes called modifier letters in Unicode.
Conlangs and transcription systems are also included in these articles. If you want to contribute examples from your own conlangs, or natlang examples, please first read the design guidelines on the [[Talk:Natlang_Uses_of_Diacritics_in_the_Latin_Alphabet|talk page]].


== Caron ==
{| class="wikitable"
{| class="wikitable"
|+ Precomposed Letters with Caron
|+ List of Articles on Natlang Usage of Latin Alphabet Diacritics
| style="font-size:180%" | ˇ || style="font-size:180%" | ◌̌ || style="font-size:180%" | Ǎ || style="font-size:180%" | ǎ || style="font-size:180%" | Č || style="font-size:180%" | č || style="font-size:180%" | Ď || style="font-size:180%" | ď || style="font-size:180%" | DŽ || style="font-size:180%" | Dž || style="font-size:180%" | dž || style="font-size:180%" | Ě || style="font-size:180%" | ě
! Diacritic name !! Other names !! Character !! Notes
|-
|-
| U+02C7 || U+030C || U+01CD || U+01CE || U+010C || U+010D || U+010E || U+010F || U+01C4 || U+01C5 || U+01C6 || U+011A || U+011B
| [[Acute_Accent|Acute accent]] || Kreska || style="font-size:180%" | ˊ || Although Łł is considered to be an Ll with kreska in Polish typography, this letter is listed under [[Stroke|stroke]].
|-
|-
| Caron || Combining Caron || Latin Capital Letter A With Caron || Latin Small Letter A With Caron || Latin Capital Letter C With Caron || Latin Small Letter C With Caron || Latin Capital Letter D With Caron || Latin Small Letter D With Caron || Latin Capital Letter Dz With Caron || Latin Capital Letter D With Small Letter Z With Caron || Latin Small Letter Dz With Caron || Latin Capital Letter E With Caron || Latin Small Letter E With Caron
| [[Acute_Accent_Below|Acute accent below]] || || style="font-size:180%" | ˏ ||
|-
|-
| colspan="2" | '''Note:''' May be confused with Modifier Letter Down Arrowhead ˅ (U+02C5). || || || || || || '''Note:''' The caron actually looks like an apostrophe placed to the right of the ascender of the d. || || || || ||
| [[Bar]] || Stroke, horizontal bar, middle tilde || style="font-size:180%" | ◌̵ || Eth (Ðð) and capital African D (Ɖ) are listed here. See also [[Stroke|stroke]].  
|-
|-
| style="font-size:180%" | Ǧ || style="font-size:180%" | ǧ || style="font-size:180%" | Ȟ || style="font-size:180%" | ȟ || style="font-size:180%" | Ǐ || style="font-size:180%" | ǐ || style="font-size:180%" | ǰ || style="font-size:180%" | Ǩ || style="font-size:180%" | ǩ || style="font-size:180%" | Ľ || style="font-size:180%" | ľ || style="font-size:180%" | Ň || style="font-size:180%" | ň
| [[Breve]] || || style="font-size:180%" | ˘ ||
|-
|-
| U+01E6 || U+01E7 || U+021E || U+021F || U+01CF || U+01D0 || U+01F0 ||​ U+01E8 || U+01E9 || U+013D || U+013E || U+0147 || U+0148
| [[Breve_Below|Breve below]] || || style="font-size:180%" | ◌̮ ||
|-
|-
| Latin Capital Letter G With Caron || Latin Small Letter G With Caron || Latin Capital Letter H With Caron || Latin Small Letter H With Caron || Latin Capital Letter I With Caron || Latin Small Letter I With Caron || Latin Small Letter J With Caron || Latin Capital Letter K With Caron || Latin Small Letter K With Caron || Latin Capital Letter L With Caron || Latin Small Letter L With Caron || Latin Capital Letter N With Caron || Latin Small Letter N With Caron
| [[Candrabindu]] || Chandrabindu, chandravindu, candravindu, chôndrobindu || style="font-size:180%" | ◌̐ ||
|-
|-
| || || || || || || || || || colspan="2" | '''Note:''' The caron actually looks like an apostrophe placed to the right of the ascender of the Ll. || ||
| [[Caron]] || Háček, haček || style="font-size:180%" | ˇ ||
|-
|-
| style="font-size:180%" | Ǒ || style="font-size:180%" | ǒ || style="font-size:180%" | Ř || style="font-size:180%" | ř || style="font-size:180%" | Š || style="font-size:180%" | š || style="font-size:180%" | Ṧ || style="font-size:180%" | ṧ || style="font-size:180%" | Ť || style="font-size:180%" | ť || style="font-size:180%" | Ǔ || style="font-size:180%" | ǔ || style="font-size:180%" | Ǚ
| [[Cedilla]] || || style="font-size:180%" | ¸ || Some of the letters included here have in practice comma below, but Ss and Tt with comma below are listed under [[Comma_Below|comma below]].
|-
|-
| U+01D1 || U+01D2 || U+0158 || U+0159 || U+0160 || U+0161 || U+1E66 || U+1E67 || U+0164 || U+0165 || U+01D3 || U+01D4 || U+01D9
| [[Circumflex]] || || style="font-size:180%" | ˆ ||
|-
|-
| Latin Capital Letter O With Caron || Latin Small Letter O With Caron || Latin Capital Letter R With Caron || Latin Small Letter R With Caron || Latin Capital Letter S With Caron || Latin Small Letter S With Caron || Latin Capital Letter S With Caron And Dot Above || Latin Small Letter S With Caron And Dot Above || Latin Capital Letter T With Caron || Latin Small Letter T With Caron || Latin Capital Letter U With Caron || Latin Small Letter U With Caron || Latin Capital Letter U With Diaeresis And Caron
| [[Circumflex_Below|Circumflex below]] || || style="font-size:180%" | ◌̭ ||
|-
|-
| || || || || || || || || || '''Note:''' The caron actually looks like an apostrophe placed to the right of the ascender of the t. || || ||
| [[Comma_Above|Comma above]] || Psilòn pneûma, psilon pneuma, psilí, psili, spīritus lēnis, spiritus lenis || style="font-size:180%" | ᾿ || Actually a Greek-script diacritic, but used in some Latin alphabets.
|-
|-
| style="font-size:180%" | ǚ || style="font-size:180%" | Ž || style="font-size:180%" | ž || style="font-size:180%" | Ǯ || style="font-size:180%" | ǯ
| [[Comma_Above_Right|Comma above right]] || || style="font-size:180%" | ◌̕ ||
|-
|-
| U+01DA || U+017D || U+017E || U+01EE || U+01EF
| [[Comma_Below|Comma below]] || || style="font-size:180%" | ◌̦ || This article includes Ss and Tt with comma below, but not other letters with a comma-like diacritic. For those, see [[Cedilla|cedilla]].
|-
|-
| Latin Small Letter U With Diaeresis And Caron || Latin Capital Letter Z With Caron || Latin Small Letter Z With Caron || Latin Capital Letter Ezh With Caron || Latin Small Letter Ezh With Caron
| [[Descender]] || || style="font-size:180%" | ||
|}
The caron is also known as háček or haček. Note that it is easily confused with the similar-looking breve ˘, especially at small font sizes.
 
{| class="wikitable"
|+ Uses of Caron
! Usage
! Language
! Letters
! Notes
|-
| rowspan=2 | Postalveolar consonant
| [[Wikipedia:Latgalian_language|Latgalian]], [[Wikipedia:Latvian_language|Latvian]]
| Čč /tʃ/, Šš /ʃ/, Žž /ʒ/
| Unaccented Cc stands for /ts/ in Latvian and Latgalian.
|-
| [[Wikipedia:Livonian_language|Livonian]]
| Šš /ʃ/, Žž /ʒ/
|
|}
 
== Cedilla ==
{| class="wikitable"
|+ Precomposed Letters with Cedilla
| style="font-size:180%" | ¸ || style="font-size:180%" | ◌̧ || style="font-size:180%" | Ç || style="font-size:180%" | ç || style="font-size:180%" | Ḉ || style="font-size:180%" | ḉ || style="font-size:180%" | Ḑ || style="font-size:180%" | ḑ || style="font-size:180%" | Ȩ || style="font-size:180%" | ȩ || style="font-size:180%" | Ḝ || style="font-size:180%" | ḝ || style="font-size:180%" | Ģ
|-
| U+00B8 || U+0327 || U+00C7 || U+00E7 || U+1E08 || U+1E09 || U+1E10 || U+1E11 || U+0228 || U+0229 || U+1E1C || U+1E1D || U+0122
|-
| Cedilla || Combining Cedilla || Latin Capital Letter C With Cedilla || Latin Small Letter C With Cedilla || Latin Capital Letter C With Cedilla And Acute || Latin Small Letter C With Cedilla And Acute || Latin Capital Letter D With Cedilla || Latin Small Letter D With Cedilla || Latin Capital Letter E With Cedilla || Latin Small Letter E With Cedilla || Latin Capital Letter E With Cedilla And Breve || Latin Small Letter E With Cedilla And Breve || Latin Capital Letter G With Cedilla
|-
| style="font-size:180%" | ģ || style="font-size:180%" | Ḩ || style="font-size:180%" | ḩ || style="font-size:180%" | Ķ || style="font-size:180%" | ķ || style="font-size:180%" | Ļ || style="font-size:180%" | ļ || style="font-size:180%" | Ņ || style="font-size:180%" | ņ || style="font-size:180%" | Ŗ || style="font-size:180%" | ŗ || style="font-size:180%" | Ş || style="font-size:180%" | ş
|-
| U+0123 || U+1E28 || U+1E29 || U+0136 || U+0137 || U+013B || U+013C || U+0145 || U+0146 || U+0156 || U+0157 || U+015E || U+015F
|-
| Latin Small Letter G With Cedilla || Latin Capital Letter H With Cedilla || Latin Small Letter H With Cedilla || Latin Capital Letter K With Cedilla || Latin Small Letter K With Cedilla || Latin Capital Letter L With Cedilla || Latin Small Letter L With Cedilla || Latin Capital Letter N With Cedilla || Latin Small Letter N With Cedilla || Latin Capital Letter R With Cedilla || Latin Small Letter R With Cedilla || Latin Capital Letter S With Cedilla || Latin Small Letter S With Cedilla
|-
|-
| '''Note:''' The diacritic is placed on top of the letter to avoid the descender of the g. || || || || || || || ||​ || || || || ||
| [[Diaeresis_and_Umlaut|Diaeresis/umlaut]] || Tréma, trema || style="font-size:180%" | ¨ ||
|-
|-
| style="font-size:180%" | Ţ || style="font-size:180%" | ţ
| [[Diaeresis_Below|Diaeresis below]] || || style="font-size:180%" | ◌̤ ||
|-
|-
| U+0162 || U+0163
| [[Dot_Above|Dot above]] || Overdot, anusvāra, anusvara || style="font-size:180%" | ˙ ||
|-
|-
| Latin Capital Letter T With Cedilla || Latin Small Letter T With Cedilla
| [[Dot_Above_Right|Dot above right]] || || style="font-size:180%" | ◌͘ ||
|}
 
{| class="wikitable"
|+ Uses of Cedilla
! Usage
! Language
! Letters
! Notes
|-
| rowspan=2 | Palatal consonant
| [[Wikipedia:Latgalian_language|Latgalian]], [[Wikipedia:Latvian_language|Latvian]]
| Ģģ /ɟ/, Ķķ /c/, Ļļ /ʎ/, Ņņ /ɲ/
|
|-
| [[Wikipedia:Livonian_language|Livonian]]
| Ḑḑ /ɟ/, Ļļ /ʎ/, Ņņ /ɲ/, Ţţ /c/
|
|-
| rowspan=1 | Palatalized consonant
| [[Wikipedia:Livonian_language|Livonian]]
| Ŗŗ /rʲ/
|
|}
 
== Diaeresis/Umlaut ==
{| class="wikitable"
|+ Precomposed Letters with Diaeresis/Umlaut
| style="font-size:180%" | ¨ || style="font-size:180%" | ◌̈ || style="font-size:180%" | Ä || style="font-size:180%" | ä || style="font-size:180%" | Ǟ || style="font-size:180%" | ǟ || style="font-size:180%" | Ë || style="font-size:180%" | ë || style="font-size:180%" | Ḧ || style="font-size:180%" | ḧ || style="font-size:180%" | Ï || style="font-size:180%" | ï || style="font-size:180%" | Ḯ
|-
|-
| U+00A8 || U+0308 || U+00C4 || U+00E4 || U+01DE || U+01DF || U+00CB || U+00EB || U+1E26 || U+1E27 || U+00CF ||​ U+00EF || U+1E2E
| [[Dot_Below|Dot below]] || Underdot || style="font-size:180%" | ◌̣ ||
|-
|-
| Diaeresis || Combining Diaeresis || Latin Capital Letter A With Diaeresis || Latin Small Letter A With Diaeresis || Latin Capital Letter A With Diaeresis And Macron || Latin Small Letter A With Diaeresis And Macron || Latin Capital Letter E With Diaeresis || Latin Small Letter E With Diaeresis || Latin Capital Letter H With Diaeresis || Latin Small Letter H With Diaeresis || Latin Capital Letter I With Diaeresis || Latin Small Letter I With Diaeresis || Latin Capital Letter I With Diaeresis And Acute
| [[Double_Acute_Accent|Double acute accent]] || Hungarumlaut || style="font-size:180%" | ˝ ||
|-
|-
| style="font-size:180%" | ḯ || style="font-size:180%" | Ö || style="font-size:180%" | ö || style="font-size:180%" | Ȫ || style="font-size:180%" | ȫ || style="font-size:180%" | Ṏ || style="font-size:180%" | ṏ || style="font-size:180%" | ẗ || style="font-size:180%" | Ü || style="font-size:180%" | ü || style="font-size:180%" | Ǖ || style="font-size:180%" | ǖ || style="font-size:180%" | Ǘ
| [[Double_Grave_Accent|Double grave accent]] || || style="font-size:180%" |​ ◌̏ ||
|-
|-
| U+1E2F || U+00D6 || U+00F6 ||​ U+022A || U+022B || U+1E4E || U+1E4F || U+1E97 || U+00DC ||​ U+00FC || U+01D5 || U+01D6 || U+01D7
| [[Double_Macron_Below|Double macron below]] || || style="font-size:180%" | ◌͟◌ || This diacritic is very similar to [[Low_Line|low line]].
|-
|-
| Latin Small Letter I With Diaeresis And Acute ||​ Latin Capital Letter O With Diaeresis || Latin Small Letter O With Diaeresis ||​ Latin Capital Letter O With Diaeresis And Macron || Latin Small Letter O With Diaeresis And Macron || Latin Capital Letter O With Tilde And Diaeresis || Latin Small Letter O With Tilde And Diaeresis || Latin Small Letter T With Diaeresis || Latin Capital Letter U With Diaeresis || Latin Small Letter U With Diaeresis || Latin Capital Letter U With Diaeresis And Macron || Latin Small Letter U With Diaeresis And Macron ||​ Latin Capital Letter U With Diaeresis And Acute
| [[Double_Ring_Below|Double ring below]] || || style="font-size:180%" |​ ◌͚ ||
|-
|-
| style="font-size:180%" | ǘ || style="font-size:180%" | Ǚ || style="font-size:180%" | ǚ || style="font-size:180%" | Ǜ || style="font-size:180%" | ǜ || style="font-size:180%" | Ṻ || style="font-size:180%" | ṻ || style="font-size:180%" | Ẅ || style="font-size:180%" | ẅ || style="font-size:180%" | Ẍ || style="font-size:180%" | ẍ || style="font-size:180%" | Ÿ || style="font-size:180%" | ÿ
| [[Double_Vertical_Line_Above|Double vertical line above]] || || style="font-size:180%" |​ ◌̎ ||
|-
|-
| U+01D8 || U+01D9 || U+01DA || U+01DB || U+01DC || U+1E7A || U+1E7B || U+1E84 || U+1E85 || U+1E8C || U+1E8D || U+0178 || U+00FF
| [[Grave_Accent|Grave accent]] || || style="font-size:180%" | ˋ ||
|-
|-
| Latin Small Letter U With Diaeresis And Acute || Latin Capital Letter U With Diaeresis And Caron || Latin Small Letter U With Diaeresis And Caron || Latin Capital Letter U With Diaeresis And Grave || Latin Small Letter U With Diaeresis And Grave || Latin Capital Letter U With Macron And Diaeresis || Latin Small Letter U With Macron And Diaeresis || Latin Capital Letter W With Diaeresis || Latin Small Letter W With Diaeresis || Latin Capital Letter X With Diaeresis || Latin Small Letter X With Diaeresis || Latin Capital Letter Y With Diaeresis || Latin Small Letter Y With Diaeresis
| [[Grave_Accent_Below|Grave accent below]] || || style="font-size:180%" | ˎ ||
|}
Diaeresis (known as tréma in French) and umlaut use the same character but differ in function. A letter with an umlaut stands for a different sound from its unaccented counterpart: in Swedish, for example, <o> represents /u/ while <ö> represents /ø/. A diaeresis, on the other hand, does not change the sound value of a letter; it marks that a vowel is not part of a diphthong or digraph.
 
{| class="wikitable"
|+ Uses of Diaeresis or Umlaut
! Usage
! Language
! Letters
! Notes
|-
|-
| rowspan=3 | Front version of back vowel
| [[Hook_Above|Hook above]] || Dấu hỏi || style="font-size:180%" | ◌̉ ||
| [[Wikipedia:Finnish_language|Finnish]]
| Ää /æ/, Öö /ø/
| Usage borrowed from Swedish.
|-
|-
| [[Wikipedia:Livonian_language|Livonian]]
| [[Horn]] || Dấu móc || style="font-size:180%" | ◌̛ ||
| Ää /æ/
|
|-
|-
| [[Wikipedia:Swedish_language|Swedish]]
| [[Inverted_Breve|Inverted breve]] || Arch || style="font-size:180%" | ◌̑ ||
| Ää /ɛ/, Öö /ø/
| The umlaut evolved from the letter e in the digraphs ae[http://en.wikipedia.org/wiki/%C3%84] and oe[http://en.wikipedia.org/wiki/%C3%96].
|-
|-
| Syllable break. When two vowels follow each other, a diaeresis on the second vowel indicates that the vowels are in two different syllables instead of forming a diphthong.
| [[Low_Line|Low line]] || Underline, underscore || style="font-size:180%" | ◌̲ || This diacritic is very similar to [[Macron_Below|macron below]] and [[Double_Macron_Below|double macron below]].
| [[Wikipedia:French_language|French]]
| Ëë, Ïï, Üü, Ÿÿ
|
|}
 
== Dot Above ==
{| class="wikitable"
|+ Precomposed Letters with Dot Above
| style="font-size:180%" | ˙ || style="font-size:180%" | ◌̇ || style="font-size:180%" | Ȧ || style="font-size:180%" | ȧ || style="font-size:180%" | Ǡ || style="font-size:180%" | ǡ || style="font-size:180%" | Ḃ || style="font-size:180%" | ḃ || style="font-size:180%" | Ċ || style="font-size:180%" | ċ || style="font-size:180%" | Ḋ || style="font-size:180%" | ḋ || style="font-size:180%" | Ė
|-
|-
| U+02D9 || U+0307 || U+0226 || U+0227 || U+01E0 || U+01E1 || U+1E02 || U+1E03 || U+010A || U+010B || U+1E0A || U+1E0B || U+0116
| [[Macron]] || || style="font-size:180%" | ˉ ||
|-
|-
| Dot Above || Combining Dot Above || Latin Capital Letter A With Dot Above || Latin Small Letter A With Dot Above || Latin Capital Letter A With Dot Above And Macron || Latin Small Letter A With Dot Above And Macron || Latin Capital Letter B With Dot Above || Latin Small Letter B With Dot Above || Latin Capital Letter C With Dot Above || Latin Small Letter C With Dot Above || Latin Capital Letter D With Dot Above || Latin Small Letter D With Dot Above || Latin Capital Letter E With Dot Above
| [[Macron_Below|Macron below]] || Line below, low macron || style="font-size:180%" | ˍ || See also [[Double_Macron_Below|double macron below]].
|-
|-
| style="font-size:180%" | ė || style="font-size:180%" | Ḟ || style="font-size:180%" | ḟ || style="font-size:180%" | Ġ || style="font-size:180%" | ġ || style="font-size:180%" | Ḣ || style="font-size:180%" | ḣ || style="font-size:180%" | İ || style="font-size:180%" | i || style="font-size:180%" | Ṁ || style="font-size:180%" | ṁ || style="font-size:180%" | Ṅ || style="font-size:180%" | ṅ
| [[Middle_Dot|Middle dot]] || Interpunct, interpoint, centered dot, centred dot, space dot || style="font-size:180%" | · ||
|-
|-
| U+0117 || U+1E1E || U+1E1F || U+0120 || U+0121 || U+1E22 || U+1E23 || U+0130 || U+0069 ||​ U+1E40 || U+1E41 || U+1E44 || U+1E45
| [[Ogonek]] || || style="font-size:180%" | ˛ ||
|-
|-
| Latin Small Letter E With Dot Above || Latin Capital Letter F With Dot Above || Latin Small Letter F With Dot Above || Latin Capital Letter G With Dot Above || Latin Small Letter G With Dot Above || Latin Capital Letter H With Dot Above || Latin Small Letter H With Dot Above || Latin Capital Letter I With Dot Above || Latin Small Letter I || Latin Capital Letter M With Dot Above || Latin Small Letter M With Dot Above ||​ Latin Capital Letter N With Dot Above || Latin Small Letter N With Dot Above
| [[Palatalized_Hook|Palatalized hook]] || Palatal hook || style="font-size:180%" | ◌̡ ||
|-
|-
| || || || || || || || colspan="2" | '''Note:''' In most languages i is the lower-case form of I, but in Turkish the case pairs are İ/i and I/ı. If Turkish casing is used, make sure that the software involved handles it correctly. For example, dictionaries need to sort the letters in the right order. || || || ||
| [[Retroflex_Hook|Retroflex hook]] || Hook, tail || style="font-size:180%" | ◌̢ ||
|-
|-
| style="font-size:180%" | Ȯ || style="font-size:180%" | ȯ || style="font-size:180%" | Ȱ || style="font-size:180%" | ȱ || style="font-size:180%" | Ṗ || style="font-size:180%" | ṗ || style="font-size:180%" | Ṙ || style="font-size:180%" | ṙ || style="font-size:180%" | Ṡ || style="font-size:180%" | ṡ || style="font-size:180%" | ẛ || style="font-size:180%" | Ṥ || style="font-size:180%" | ṥ
| [[Right_Half_Ring|Right half ring]] || || style="font-size:180%" | ʾ ||
|-
|-
| U+022E || U+022F || U+0230 || U+0231 || U+1E56 || U+1E57 || U+1E58 || U+1E59 || U+1E60 || U+1E61 || U+1E9B || U+1E64 || U+1E65
| [[Ring_Above|Ring above]] || || style="font-size:180%" | ˚ ||
|-
|-
| Latin Capital Letter O With Dot Above || Latin Small Letter O With Dot Above || Latin Capital Letter O With Dot Above And Macron || Latin Small Letter O With Dot Above And Macron || Latin Capital Letter P With Dot Above || Latin Small Letter P With Dot Above || Latin Capital Letter R With Dot Above || Latin Small Letter R With Dot Above || Latin Capital Letter S With Dot Above || Latin Small Letter S With Dot Above || Latin Small Letter Long S With Dot Above || Latin Capital Letter S With Acute And Dot Above || Latin Small Letter S With Acute And Dot Above
| [[Ring_Below|Ring below]] || || style="font-size:180%" | ˳ ||
|-
|-
| style="font-size:180%" | Ṧ || style="font-size:180%" | ṧ || style="font-size:180%" | Ṩ || style="font-size:180%" | ṩ || style="font-size:180%" | Ṫ || style="font-size:180%" | ṫ || style="font-size:180%" | Ẇ || style="font-size:180%" | ẇ || style="font-size:180%" | Ẋ || style="font-size:180%" | ẋ || style="font-size:180%" | Ẏ || style="font-size:180%" | ẏ || style="font-size:180%" | Ż
| [[Stroke]] || Diagonal stroke, solidus, strikethrough || style="font-size:180%" | ◌̷ || [[Bar]] may also be called stroke. Eth (Ðð) is not listed here, but under [[Bar|bar]].
|-
|-
| U+1E66 || U+1E67 || U+1E68 || U+1E69 || U+1E6A || U+1E6B || U+1E86 || U+1E87 || U+1E8A || U+1E8B || U+1E8E || U+1E8F || U+017B
| [[Tilde]] || || style="font-size:180%" | ˜ ||
|-
|-
| Latin Capital Letter S With Caron And Dot Above || Latin Small Letter S With Caron And Dot Above || Latin Capital Letter S With Dot Below And Dot Above || Latin Small Letter S With Dot Below And Dot Above || Latin Capital Letter T With Dot Above || Latin Small Letter T With Dot Above || Latin Capital Letter W With Dot Above || Latin Small Letter W With Dot Above || Latin Capital Letter X With Dot Above || Latin Small Letter X With Dot Above || Latin Capital Letter Y With Dot Above || Latin Small Letter Y With Dot Above || Latin Capital Letter Z With Dot Above
| [[Tilde_Below|Tilde below]] || || style="font-size:180%" | ˷ ||
|-
|-
| style="font-size:180%" | ż
| [[Tilde_Overlay|Tilde overlay]] || || style="font-size:180%" | ◌̴ ||
|-
|-
| U+017C
| [[Vertical_Line_Above|Vertical line above]] || || style="font-size:180%" | ˈ ||
|-
|-
| Latin Small Letter Z With Dot Above
| [[Vertical_Line_Below|Vertical line below]] || || style="font-size:180%" | ˌ ||
|}
 
{| class="wikitable"
|+ Uses of Dot Above
! Use
! Language
! Letters
! Notes
|-
|-
| Raised vowel
| [[Vertical_Tilde|Vertical tilde]] || || style="font-size:180%" | ◌̾ ||
| [[Wikipedia:Livonian_language|Livonian]]
| Ȯȯ /ʊ/
|
|}
|}


== Ring Above ==
== Layout Overview ==
{| class="wikitable"
[[File:Caron_article.png|thumb|1138px|An example of a Unicode table, from the article [[Natlang_Uses_of_Caron|Natlang Uses of Caron]]. Notice character similarity warnings both in the article text above, and as a note in the table itself.]]
|+ Precomposed Letters with Ring Above
In these articles, combining (non-spacing) diacritics are shown attached to a ◌. Diacritics without a ◌, such as ¨, are non-combining (spacing). Non-combining diacritics are sometimes called modifier letters in Unicode. The non-combining forms are useful, for example, when writing about a conlang's orthography and referring to a diacritic without any base letter. Some natlangs even use some diacritics as standalone characters!<br>
| style="font-size:180%" | ˚ || style="font-size:180%" | ◌̊ || style="font-size:180%" | Å || style="font-size:180%" | å || style="font-size:180%" | Ǻ || style="font-size:180%" | ǻ || style="font-size:180%" | Ů || style="font-size:180%" | ů || style="font-size:180%" | ẘ || style="font-size:180%" | ẙ
<br>
|-
When a letter is referred to without regard to case, it is displayed like so: Ťť. This is for clarity's sake, because some diacritics look different depending on the letter's case, as in this example. When only an upper-case or only a lower-case letter is used in an article, it usually refers to that specific case variant, but it can also refer to a character that has only one case.<br>
| U+02DA || U+030A || U+00C5 || U+00E5 || U+01FA || U+01FB || U+016E || U+016F || U+1E98 || U+1E99
<br>
|-
Sometimes it is necessary to refer to a digraph, for example Ŀl in Catalan. When a digraph is referred to without regard to case, it is written like this: Ŀl ŀl, with a space between the two forms. Orthographies differ in how they capitalize a digraph at the start of a word: in most languages only the first letter of the digraph is capitalized, but in some languages both letters are. Which rule an orthography exemplified in these articles follows can thus be discerned from how the article writes the digraph.<br>
| Ring Above || Combining Ring Above || Latin Capital Letter A With Ring Above || Latin Small Letter A With Ring Above || Latin Capital Letter A With Ring Above And Acute || Latin Small Letter A With Ring Above And Acute || Latin Capital Letter U With Ring Above || Latin Small Letter U With Ring Above || Latin Small Letter W With Ring Above || Latin Small Letter Y With Ring Above
<br>
|-
These articles first show which precomposed combinations of letter and diacritic exist in Unicode, together with their code points and Unicode names. The different forms of the standalone diacritic are also shown. For example, the [[Tilde|tilde]] has three forms: an "ASCII form" ~, which is used in programming among other things, where the tilde is centered; a non-combining diacritic form ˜, where the tilde has the same position it would have when combined with a base letter; and a combining form ◌̃.<br>
| colspan="2" | '''Note:''' May be confused with the Degree Sign ° (U+00B0) || '''Note:''' May be confused with the Ångström Sign Å (U+212B). || || || || || || ||
<br>
|}
Many diacritics or accented letters look very similar to other characters, for example [[Caron|caron]] ˇ and [[Breve|breve]] ˘. Such cases are flagged either in the text at the beginning of the article or in notes in the table that lists the precomposed characters. All the diacritics in one orthography should be easy to tell apart, so conlangers devising new orthographies should take care here. A conlanger may also mistakenly copy and paste a similar-looking but wrong character into a conlang project; the articles therefore also list characters that would otherwise be unlikely to appear in the same orthography, such as Latin Capital Letter O With Stroke, Ø (U+00D8), and Empty Set, ∅ (U+2205). Cases such as caron ˇ and breve ˘, which concern essentially all characters with the accent, are noted in the text at the beginning of the article. Cases such as Ø and ∅, which concern only an individual pair of characters, are noted in the Unicode table.


{| class="wikitable"
{| class="wikitable"
Line 230: Line 131:
| Åå /o/
| Åå /o/
| From an earlier digraph aa representing /ɔ/, which in turn came from /aː/.[http://en.wikipedia.org/wiki/%C3%85]
| From an earlier digraph aa representing /ɔ/, which in turn came from /aː/.[http://en.wikipedia.org/wiki/%C3%85]
|-
| Long vowel
| [[Wikipedia:Czech_language|Czech]]
| Ůů /uː/
| This comes from a diphthong /uo/, where the o was sometimes written as a ring above the u. A sound change then turned /uo/ into /uː/.[http://en.wikipedia.org/wiki/Czech_orthography#Letter_.C5.AE]
|}
|}
After the precomposed characters have been presented come examples of how natlangs use the diacritic. (Natromanizations of other scripts are also included.) For each language, the letters and the phonemes they represent are listed. The notes may contain a short history of why the characters are used in this way in the given language. These explanations are very short, though, so one can often read more by following the reference link ([1], [2] or [3] above). The notes may also give more information about a character's usage when it is not quite straightforward, or when it differs a little from the other characters in the same group.
==Further reading==
* [http://www.phon.ucl.ac.uk/home/wells/dia/diacritics-revised.htm J.C. Wells: ''Orthographic diacritics and multilingual computing'']
[[Category:Natscripts]]

Latest revision as of 01:42, 6 July 2021

This is a collection of articles that list different uses of diacritical marks that have natlang precedent. Conlangers can use this to find inspiration for their own conlang's orthography or transliteration. These articles can also serve as a reference for those designing a keyboard layout.

Conlangs and transcription systems are also included in these articles. If you want to contribute examples from your own conlangs, or natlang examples, please first read the design guidelines on the talk page.

{| class="wikitable"
|+ List of Articles on Natlang Usage of Latin Alphabet Diacritics
! Diacritic name !! Other names !! Character !! Notes
|-
| Acute accent || Kreska || ˊ || Although Łł is considered to be an Ll with kreska in Polish typography, this letter is listed under stroke.
|-
| Acute accent below || || ˏ ||
|-
| Bar || Stroke, horizontal bar, middle tilde || ◌̵ || Eth (Ðð) and capital African D (Ɖ) are listed here. See also stroke.
|-
| Breve || || ˘ ||
|-
| Breve below || || ◌̮ ||
|-
| Candrabindu || Chandrabindu, chandravindu, candravindu, chôndrobindu || ◌̐ ||
|-
| Caron || Háček, haček || ˇ ||
|-
| Cedilla || || ¸ || Some of the letters included here have in practice comma below, but Ss and Tt with comma below are listed under comma below.
|-
| Circumflex || || ˆ ||
|-
| Circumflex below || || ◌̭ ||
|-
| Comma above || Psilòn pneûma, psilon pneuma, psilí, psili, spīritus lēnis, spiritus lenis || ᾿ || Actually a Greek-script diacritic, but used in some Latin alphabets.
|-
| Comma above right || || ◌̕ ||
|-
| Comma below || || ◌̦ || This article includes Ss and Tt with comma below, but not other letters with a comma-like diacritic. For those, see cedilla.
|-
| Descender || || ||
|-
| Diaeresis/umlaut || Tréma, trema || ¨ ||
|-
| Diaeresis below || || ◌̤ ||
|-
| Dot above || Overdot, anusvāra, anusvara || ˙ ||
|-
| Dot above right || || ◌͘ ||
|-
| Dot below || Underdot || ◌̣ ||
|-
| Double acute accent || Hungarumlaut || ˝ ||
|-
| Double grave accent || || ◌̏ ||
|-
| Double macron below || || ◌͟◌ || This diacritic is very similar to low line.
|-
| Double ring below || || ◌͚ ||
|-
| Double vertical line above || || ◌̎ ||
|-
| Grave accent || || ˋ ||
|-
| Grave accent below || || ˎ ||
|-
| Hook above || Dấu hỏi || ◌̉ ||
|-
| Horn || Dấu móc || ◌̛ ||
|-
| Inverted breve || Arch || ◌̑ ||
|-
| Low line || Underline, underscore || ◌̲ || This diacritic is very similar to macron below and double macron below.
|-
| Macron || || ˉ ||
|-
| Macron below || Line below, low macron || ˍ || See also double macron below.
|-
| Middle dot || Interpunct, interpoint, centered dot, centred dot, space dot || · ||
|-
| Ogonek || || ˛ ||
|-
| Palatalized hook || Palatal hook || ◌̡ ||
|-
| Retroflex hook || Hook, tail || ◌̢ ||
|-
| Right half ring || || ʾ ||
|-
| Ring above || || ˚ ||
|-
| Ring below || || ˳ ||
|-
| Stroke || Diagonal stroke, solidus, strikethrough || ◌̷ || Bar may also be called stroke. Eth (Ðð) is not listed here, but under bar.
|-
| Tilde || || ˜ ||
|-
| Tilde below || || ˷ ||
|-
| Tilde overlay || || ◌̴ ||
|-
| Vertical line above || || ˈ ||
|-
| Vertical line below || || ˌ ||
|-
| Vertical tilde || || ◌̾ ||
|}

Layout Overview

An example of a Unicode table, from the article Natlang Uses of Caron. Notice character similarity warnings both in the article text above, and as a note in the table itself.

In these articles, combining (non-spacing) diacritics are shown attached to a ◌. Diacritics without a ◌, such as ¨, are non-combining (spacing). Non-combining diacritics are sometimes called modifier letters in Unicode. The non-combining forms are useful, for example, when writing about a conlang's orthography and referring to a diacritic without any base letter. Some natlangs even use some diacritics as standalone characters!
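
The difference between combining and spacing forms can also be checked programmatically. The following is a minimal sketch in Python that uses only the standard `unicodedata` module; the helper name `is_combining` is an illustrative choice and not something these articles define.

```python
import unicodedata


def is_combining(ch: str) -> bool:
    """Return True if ch is a combining mark (general category Mn, Mc or Me)."""
    return unicodedata.category(ch).startswith("M")


# Spacing and combining forms of the diaeresis and the caron.
for ch in ("\u00A8", "\u0308", "\u02C7", "\u030C"):
    kind = "combining" if is_combining(ch) else "spacing"
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}: {kind}, "
          f"category {unicodedata.category(ch)}")

# A base letter followed by a combining mark composes under NFC
# whenever a precomposed letter exists in Unicode.
composed = unicodedata.normalize("NFC", "a\u0308")
print(composed, f"U+{ord(composed):04X}")
```

In the output, the spacing caron U+02C7 has the general category Lm (modifier letter), while the spacing diaeresis U+00A8 has the category Sk (modifier symbol), so only some of the spacing forms are modifier letters in the strict Unicode sense.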

When a letter is referred to without regard to case, it is displayed like so: Ťť. This is for clarity's sake, because some diacritics look different depending on the letter's case, as in this example. When only an upper-case or only a lower-case letter is used in an article, it usually refers to that specific case variant, but it can also refer to a character that has only one case.
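
Whether a precomposed letter has a precomposed counterpart in the other case can be tested with Python's built-in case mapping. This is a small sketch under that assumption; `has_single_code_point_uppercase` is a hypothetical helper name. It compares ť (U+0165), which has the capital Ť, with ǰ (U+01F0), for which Unicode has no precomposed capital.

```python
def has_single_code_point_uppercase(ch: str) -> bool:
    """Return True if ch upper-cases to a different, single code point."""
    upper = ch.upper()
    return len(upper) == 1 and upper != ch


for ch in ("\u0165", "\u01F0"):  # ť and ǰ
    upper = ch.upper()
    code_points = " ".join(f"U+{ord(c):04X}" for c in upper)
    print(ch, "->", upper, code_points, has_single_code_point_uppercase(ch))
```

For ǰ the upper-case result is a plain J followed by a combining caron, which is why such letters appear in only one case in the precomposed tables.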

Sometimes it is necessary to refer to a digraph, for example Ŀl in Catalan. When a digraph is referred to without regard to case, it is written like this: Ŀl ŀl, with a space between the two forms. Orthographies differ in how they capitalize a digraph at the start of a word: in most languages only the first letter of the digraph is capitalized, but in some languages both letters are. Which rule an orthography exemplified in these articles follows can thus be discerned from how the article writes the digraph.
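
The two capitalization rules correspond to what Unicode calls titlecase (first letter only) and uppercase (all letters). As a rough illustration, the sketch below prints both forms for a two-letter digraph and for the single-code-point digraph dž (U+01C6); `digraph_forms` is a hypothetical helper, and it shows only Python's default Unicode mappings, not the rule of any particular orthography.

```python
def digraph_forms(digraph: str) -> tuple[str, str, str]:
    """Return the (uppercase, titlecase, lowercase) forms of a digraph."""
    return digraph.upper(), digraph.title(), digraph.lower()


# A two-letter digraph and a digraph encoded as one code point.
for digraph in ("\u0140l", "\u01C6"):  # ŀl and dž
    upper, title, lower = digraph_forms(digraph)
    print(f"upper: {upper}  title: {title}  lower: {lower}")
```

An orthography that capitalizes both letters would use the first form at the start of a word; one that capitalizes only the first letter would use the second.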

These articles first show which precomposed combinations of letter and diacritic exist in Unicode, together with their code points and Unicode names. The different forms of the standalone diacritic are also shown. For example, the tilde has three forms: an "ASCII form" ~, which is used in programming among other things, where the tilde is centered; a non-combining diacritic form ˜, where the tilde has the same position it would have when combined with a base letter; and a combining form ◌̃.
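
The code points and names in such tables can be looked up with Python's `unicodedata` module. The sketch below is one possible way to do it, not the method used to build the articles: `table_row` and `precomposed_latin_with` are hypothetical helpers, and the scanned range 0x00A0 to 0x1EFF is an assumption that covers the main Latin blocks but not every block in Unicode.

```python
import unicodedata


def table_row(ch: str) -> tuple[str, str]:
    """Return the code point and the Unicode name in the articles' style."""
    return f"U+{ord(ch):04X}", unicodedata.name(ch).title()


# The three forms of the tilde: ASCII, spacing and combining.
for ch in ("~", "\u02DC", "\u0303"):
    print(table_row(ch))


def precomposed_latin_with(mark: str, start: int = 0x00A0, stop: int = 0x1F00) -> list[str]:
    """List Latin letters whose canonical decomposition contains mark."""
    found = []
    for cp in range(start, stop):
        ch = chr(cp)
        name = unicodedata.name(ch, "")
        if name.startswith("LATIN") and mark in unicodedata.normalize("NFD", ch):
            found.append(ch)
    return found


# Precomposed Latin letters that carry a ring above (U+030A).
ring_letters = precomposed_latin_with("\u030A")
for ch in ring_letters:
    print(ch, *table_row(ch))
```

The exact list depends on the Unicode version bundled with the Python interpreter, so the output should be treated as a starting point rather than a definitive inventory.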

Many diacritics or accented letters look very similar to other characters, for example caron ˇ and breve ˘. Such cases are flagged either in the text at the beginning of the article or in notes in the table that lists the precomposed characters. All the diacritics in one orthography should be easy to tell apart, so conlangers devising new orthographies should take care here. A conlanger may also mistakenly copy and paste a similar-looking but wrong character into a conlang project; the articles therefore also list characters that would otherwise be unlikely to appear in the same orthography, such as Latin Capital Letter O With Stroke, Ø (U+00D8), and Empty Set, ∅ (U+2205). Cases such as caron ˇ and breve ˘, which concern essentially all characters with the accent, are noted in the text at the beginning of the article. Cases such as Ø and ∅, which concern only an individual pair of characters, are noted in the Unicode table.
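
One way to catch a wrongly pasted lookalike is to list every non-ASCII character in a text together with its code point and Unicode name. The sketch below assumes Python's standard `unicodedata` module; `audit` and `nfc_equal` are hypothetical helper names. It also shows that normalization does not help with most lookalikes: NFC unifies only canonically equivalent characters, such as the Ångström sign (U+212B) and Å (U+00C5), and leaves pairs like Ø and ∅ distinct.

```python
import unicodedata


def nfc_equal(a: str, b: str) -> bool:
    """Return True if a and b are identical after NFC normalization."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


def audit(text: str) -> list[tuple[str, str, str]]:
    """List each non-ASCII character with its code point and Unicode name."""
    return [
        (ch, f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>"))
        for ch in text
        if ord(ch) > 0x7F
    ]


print(nfc_equal("\u212B", "\u00C5"))  # Ångström sign vs Å
print(nfc_equal("\u00D8", "\u2205"))  # Ø vs ∅
print(nfc_equal("\u02C7", "\u02D8"))  # caron vs breve

# Two strings that look alike but contain different characters.
for row in audit("s\u00D8 s\u2205"):
    print(row)
```

Running `audit` over the sample text of an orthography makes a stray character such as Empty Set stand out by name, even when it is visually indistinguishable in the font being used.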

{| class="wikitable"
|+ Uses of Ring Above
! Use !! Language !! Letters !! Notes
|-
| rowspan=3 | Back version of front vowel. Often also rounded.
| Chamorro || Åå /ɑ/ ||
|-
| Danish, Norwegian || Åå /ɔ/ || From an earlier digraph aa representing /ɔ/, which in turn came from /aː/.[1]
|-
| Swedish || Åå /o/ || From an earlier digraph aa representing /ɔ/, which in turn came from /aː/.[2]
|-
| Long vowel || Czech || Ůů /uː/ || This comes from a diphthong /uo/, where the o was sometimes written as a ring above the u. A sound change then turned /uo/ into /uː/.[3]
|}

After the precomposed characters have been presented come examples of how natlangs use the diacritic. (Natromanizations of other scripts are also included.) For each language, the letters and the phonemes they represent are listed. The notes may contain a short history of why the characters are used in this way in the given language. These explanations are very short, though, so one can often read more by following the reference link ([1], [2] or [3] above). The notes may also give more information about a character's usage when it is not quite straightforward, or when it differs a little from the other characters in the same group.

Further reading