Talk:Natlang Uses of Diacritics in the Latin Alphabet
I am gonna invite people to add more languages to this article once I have everything set up, and I'm grateful to everyone who wants to help. But please, people, don't make any changes while the WIP sign is still up.
Qwynegold 14:55, 18 January 2013 (PST)
I moved all the sections into their own articles instead. Please refrain from editing them until I have removed the WIP sign from this page.
Qwynegold 14:50, 19 January 2013 (PST)
If anyone knows how to put a table in a frame, please help. Go to [1], press Ctrl+F and enter "The final table would display like this:". The table you see under these words is placed in a white box with grey outlines, that is, the box that also encircles the line "The table's caption". I want a box like that around the natlang examples table on this page.
Qwynegold 13:05, 20 January 2013 (PST)
Layout Guidelines
This article should only list diacritics in the Latin alphabet that are used in present or obsolete natlang orthographies, or in transliterations of orthographies, such as Pinyin for Chinese or Hepburn for Japanese. Links to every diacritic should have been made, although some of those pages have not yet been written. I have included acute accent below ˏ, grave accent below ˎ, and comma above right ◌̕, although I'm not sure if they are used in any natlangs or romanizations. I am also unsure about double low line ◌̳, double overline ◌̿, double macron ◌͞◌, double macron below ◌͟◌, and double tilde ◌͠◌, but I have not included those. Keep in mind that it can be surprising which diacritics are actually used in natlangs. For example, low line below ◌̲ is allegedly used in some African and Native American languages,[2] even though it and related characters are mostly only used in typesetting.[3]
Introductory Text
An article should preferably begin with a short history of how the diacritic came to be. The introductory text should also tell if there are any other diacritics that this diacritic could be confused with, see below. The reason for bringing attention to this is so that a conlanger does not end up with two similar-looking diacritics in the same orthography, which is usually undesirable. A conlanger may also copy-paste characters from some source into his or her project, and may end up copy-pasting a similar-looking but wrong character unless attention is paid.
Any factual claim should have a reference, even if it's only to Wikipedia. When warning about similarities between different diacritics, there should be links to those diacritics' Natlang Uses of... articles. See the three layout principles dealing with links for a more thorough explanation and for examples of how to make the links.
The Unicode Table
{| class="wikitable"
|+ Precomposed Letters with ***
| style="font-size:180%" | || style="font-size:180%" | || style="font-size:180%" | || style="font-size:180%" | || style="font-size:180%" | || style="font-size:180%" | || style="font-size:180%" | || style="font-size:180%" | || style="font-size:180%" | || style="font-size:180%" | || style="font-size:180%" | || style="font-size:180%" | || style="font-size:180%" |
|-
| U+ || U+ || U+ || U+ || U+ || U+ || U+ || U+ || U+ || U+ || U+ || U+ || U+
|-
| Latin Capital Letter || Latin Small Letter || Latin Capital Letter || Latin Small Letter || Latin Capital Letter || Latin Small Letter || Latin Capital Letter || Latin Small Letter || Latin Capital Letter || Latin Small Letter || Latin Capital Letter || Latin Small Letter || Latin Capital Letter
|-
| style="font-size:180%" | || style="font-size:180%" | || style="font-size:180%" | || style="font-size:180%" | || style="font-size:180%" | || style="font-size:180%" | || style="font-size:180%" | || style="font-size:180%" | || style="font-size:180%" | || style="font-size:180%" | || style="font-size:180%" | || style="font-size:180%" | || style="font-size:180%" |
|-
| U+ || U+ || U+ || U+ || U+ || U+ || U+ || U+ || U+ || U+ || U+ || U+ || U+
|-
| Latin Small Letter || Latin Capital Letter || Latin Small Letter || Latin Capital Letter || Latin Small Letter || Latin Capital Letter || Latin Small Letter || Latin Capital Letter || Latin Small Letter || Latin Capital Letter || Latin Small Letter || Latin Capital Letter || Latin Small Letter
|-
|}
Above is a template for the Unicode table. I intend to do all of these tables by myself, while others can contribute by filling in natlang data once I have set everything up. But the table code here is copy-pasteable anyway, and I will explain the layout principles behind it.
- The first row contains the letters themselves. These are in a bigger font so that they will be more clearly visible. There's a problem though with letters with stacked diacritics: some of these diacritics are cut off by the table. If anyone can add more cell spacing to the table, I'd be thankful because I can't figure out how to do it myself. None of the code I have tried for it has any effect.
- Anyhow, first come the standalone diacritics, in the order: ASCII form, modifier letter, combining accent, combining tone mark. To exemplify with acute accent, it would be in the order: Acute Accent, ´ (U+00B4); Modifier Letter Acute Accent, ˊ (U+02CA); Combining Acute Accent, ◌́ (U+0301); Combining Acute Tone Mark, ◌́ (U+0341). Not all diacritics have all four of these forms of course, so the ones that are missing are simply skipped.
- After the standalone diacritics come the actual letters in alphabetic order. Upper case letters precede their lower case variants. When it comes to letters with stacked diacritics, the letter with just one diacritic comes before the letters with several diacritics. For example Áá precedes Ǻǻ.
- It's a little unclear in which order the letters should be placed when there are several letters with stacked diacritics. For example, how are Ǻǻ, Ấấ, Ắắ ordered with respect to each other? So far in these cases, they have mostly been ordered by their Unicode numbers.
- When it comes to Ææ and ſ, Ææ is placed after Aa but before Bb, and ſ after Ss but before Tt. So for example Ǽǽ comes after all the accented Aa letters. Natlang Uses of Acute Accent is a good example of how letters should be ordered, because it contains all the ordering issues brought up in this and the previous two paragraphs.
- The next row contains the Unicode numbers.
- After that come the letters' Unicode names. The names and numbers are shown in Character Map when you highlight or hover over a letter.
- Lastly comes the row with notes. All notes begin with the word "Note:" in bold.
- When a character is awfully similar to some other character, the note should say that. An example of such a note:
Note: May be confused with Apostrophe, ' (U+0027); Modifier Letter Prime, ʹ (U+02B9); Modifier Letter Turned Comma, ʻ (U+02BB); Modifier Letter Apostrophe, ʼ (U+02BC); Modifier Letter Vertical Line, ˈ (U+02C8); Right Single Quotation Mark, ’ (U+2019); or Prime, ′ (U+2032). |
So first comes the Unicode name of the character that is similar. This is followed by a comma, the character itself, the Unicode number in parentheses, and lastly a semicolon if more similar characters are listed. So far warnings have only been given about similarities with other Latin characters, but characters from other scripts should probably also be included.
- When there are two diacritics that are similar to each other, and that are both used in precomposed letters (like breve ˘ and caron ˇ for example), the similarity warning should be in the text at the beginning of the article. This is because it concerns all the characters that employ this diacritic, so it would be awkward to write warnings under all letters, like Ă is similar to Ǎ, ă is similar to ǎ, Ĕ is similar to Ě, and so on.
- When a specific character is only used in some phonetic transcription, the note should say that. It should preferably also tell in which transcription it is used. An example: "Note: Phonetic character used by Russianists.[4] Not used in any orthography."
- Other kinds of notes may also exist, for example directing attention towards some typographical issues concerning particular letters. One example is the note "Note: The caron looks actually like an apostrophe placed to the right of the ascender of the d." about the letter ď.
- Sometimes the same note may concern several letters in a row, like in the following case:
Ɵ | ɵ |
U+019F | U+0275 |
Latin Capital Letter O With Middle Tilde | Latin Small Letter Barred O |
Note: Despite their different names, these two letters are case variants of each other. |
In such a case colspan is used for making the note stretch across several columns.
- Notes should preferably have citations, such as the Russianist phonetics example above. These kinds of links consist of a single set of [ ] with a URL inside.
- If a note refers to some other letter or diacritic for which there is an article in this Natlang Uses of... series, that should have a link. In these cases we make a link out of a keyword like this: [[here is the name of the article, with _ replacing spaces|here is the keyword]]. An example:
Note: The capital versions of these letters may be confused with each other. Latin Small Letter D With Stroke and Latin Small Letter Eth may be confused with each other. The lower case version of Latin Capital Letter African D is Latin Small Letter D With Tail, ɖ (U+0256). |
The letter ɖ here links to the article Natlang Uses of Retroflex Hook.
- Keyword links may also be made to other sites, when deemed useful. For example:
Note: Phonetic character; not used in any orthography. Its use in IPA is non-standard.[5] |
- If none of the characters on one row need a note of any kind, then the whole note row is deleted so as to not have unnecessary empty space in the table.
- The table consists of 13 columns. The last set of rows often has fewer than 13 entries, however; see for example Natlang Uses of Double Grave Accent. In such a case, the unused cells at the end of the table are deleted. Sometimes a table may have fewer than 13 entries in total, in which case the whole table will of course have fewer than 13 columns.
The tables in the above list should be centered, but I just can't understand how to do it, so help is appreciated.
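The number and name rows, the 13-column split, and the similarity note format described in the list above can also be produced mechanically instead of being copied from Character Map. What follows is only a rough sketch in Python, assuming its standard unicodedata module; the function names describe, table_rows and similarity_note are made up for this illustration and are not part of the wiki or of any existing tool. Note that unicodedata.name raises ValueError for characters without a name.

```python
import unicodedata

COLUMNS = 13  # table width given in the layout principles above


def describe(ch):
    """Return (glyph, 'U+XXXX', Title Case Name) for one character."""
    # Combining marks are shown on a dotted circle, as in the examples above.
    glyph = "\u25cc" + ch if unicodedata.combining(ch) else ch
    return glyph, f"U+{ord(ch):04X}", unicodedata.name(ch).title()


def table_rows(chars):
    """Yield (glyphs, numbers, names) for each group of at most 13 characters."""
    for start in range(0, len(chars), COLUMNS):
        group = [describe(ch) for ch in chars[start:start + COLUMNS]]
        # zip(*group) turns a list of triples into three parallel rows.
        yield tuple(list(row) for row in zip(*group))


def similarity_note(chars):
    """Build a note in the 'Name, glyph (U+XXXX); ... or ...' format."""
    parts = [f"{name}, {glyph} ({code})"
             for glyph, code, name in map(describe, chars)]
    if len(parts) > 1:
        parts[-1] = "or " + parts[-1]
    return "Note: May be confused with " + "; ".join(parts) + "."


if __name__ == "__main__":
    for glyphs, numbers, names in table_rows("\u019f\u0275"):
        print(" | ".join(glyphs))
        print(" | ".join(numbers))
        print(" | ".join(names))
    print(similarity_note("'\u2032"))
```

For the stacked-diacritic letters that are ordered by Unicode number, Python's built-in sorted() on a string of such letters already sorts by code point. The alphabetic ordering rules (upper case before lower case, Ææ after Aa, ſ after Ss) are not covered by this sketch and still have to be applied by hand.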
The Natlang Example Table
{| class="wikitable"
|+ Uses of ***
! Usage
! Language
! Letters
! Notes
|-
|
|
|
|
|}
- Categories so far in use:
- Affrication
- Back version of front vowel. Often also rounded.
- Central vowel
- Change of manner of articulation
- Change of place of articulation
- Creaky voice
- Digraph disambiguation
- Diphthong
- Disambiguation of letter with several uses
- Falling tone
- Falling-rising (dipping) tone
- Following glottal stop
- Front version of back vowel (this includes Ää even though its unaccented version is not a back vowel in all of these languages)
- Glottalized vowel
- Hiatus
- Implosive consonant
- Lax vowel
- Letter extension
- Long vowel
- Long vowel with low pitch
- Long vowel with pitch accent
- Lowered vowel
- Lowered vowel with retracted tongue root
- Nasalized vowel
- Non-silent vowel
- Palatal consonant
- Palatal phoneme
- Palatalized consonant
- Postalveolar consonant
- Raised vowel
- Retroflex consonant
- Rising tone
- Short vowel
- Short vowel with pitch accent or tone
- Stress
- Syllabic consonant
- Unrounded central vowel
- Unrounded vowel
- Other
Phonetic transcription; naming
These would be OK to add I presume? Americanist, Uralicist and many smaller provincial transcription schemes tend to use diacritics somewhat more freely than the IPA does. Adding examples of conlang uses would also be good I think, so as to not just make this Wikipedia Deluxe.
And your naming scheme feels overwrought tbh. Why not put all the stuff about foos in just "Foo" instead of "Natlang Uses of Foo Above"? It's hard to imagine much else than a list of Unicode glyphs + natlang usage being included in a "base" article anyway.
(Also, is there any reason to still keep this entire project in lockdown? WIP warnings are primarily meant for articles currently under an edit, not just ones you plan on doing something with at some point. I don't see a reason new data could not be added, at least.)