Conlang Recognition Chart: Difference between revisions

Latest revision as of 16:10, 21 April 2014

This article lists simple clues for identifying, with high accuracy, which conlang a document is written in.
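The "Unused ASCII" and "Unused letters" clues in this chart lend themselves to a quick elimination test. The following is a minimal Python sketch, not part of the original chart: the function name `ruled_out` and the table `UNUSED_ASCII` are illustrative, and the table copies only a subset of the entries listed here.

```python
# Unused-letter sets copied from a subset of the chart's entries.
UNUSED_ASCII = {
    "Minza": set("qwx"),
    "Qþyn|gài": set("bpmfvweoczj"),
    "Tatari Faran": set("cglqvwxyz"),
    "Þrjótrunn": set("cqz"),
    "Tauro-Piscean": set("cqxy"),
}


def ruled_out(text):
    """Map each language to the 'unused' letters that nevertheless occur in text."""
    # Only plain ASCII letters are compared; accented letters are ignored here.
    seen = {ch for ch in text.lower() if ch.isascii() and ch.isalpha()}
    hits = {}
    for language, unused in UNUSED_ASCII.items():
        offending = unused & seen
        if offending:
            hits[language] = offending
    return hits
```

Calling `ruled_out("ka kei ko sa sei so na nei no")` on the Tatari Faran common words eliminates only Qþyn|gài (because of e and o), so this test narrows the field rather than deciding it. Foreign names may also contain "unused" letters, so a single hit is weak evidence on its own.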

Ayeri

Orthography: ptkcbdgjmnvshrly aāeēiīoōu (older transcriptions use c for k, newer transcription re-uses c for [ʧ] and j for [ʤ])

Non-ASCII: āēīōū (ạẹịọụ ạ̄ẹ̄ị̄ọ̄ụ̄ have also been used in older transcriptions)
Unused ASCII: fqwxz

Diphthongs: au, ay, ey, oy (depending on transcription, 'iy' is also frequent, but it is not a diphthong)
Common words: ang, sa, eng, le, si, yam, ya
Common morphemes: -ang, -as, -reng, -ley, -yam, -ea/-ya, -iya, -ara
Other common features: Words can get quite long because the language is agglutinative.

Calénnawn

Non-ASCII: áéíóú àèìòù ë ðñ and $ or š
Unused ASCII: jk
Diphthongs: aw, iw, ow, ay, ey, oy, uy
Digraph: ii
x and q are common
Words starting with f- or s- (like f-qúba)
Words of more than one syllable contain at least one acute accent
Common one- and two-letter words: a, e, i, h, o, on, so, se, fh, el, en, iw, fa

Ebisédian

ASCII orthography:

Uses w, y, 3 and 0 as vowel letters
Upper- and lowercase consonants are distinct (e.g., K vs. k)
Use of double vowel letters to indicate length: 00, ww.
Use of apostrophe after vowels to indicate stress: 00', yy'.

LaTeX orthography:

Use of ø and ɜ as vowel letters
Multiple diacritics over single vowel letters, up to 4 (macron, acute, tear-drop accent, subscript tilde).
Subscript tilde to indicate nasality.
Tear-drop accent in vowel-initial words (looks like a superscript opening left single quote)

General:

Common single-word sentences with i in the last syllable.
Common words: Ke, ve, ke, je, re (always clause-final), keve, tømø, tɜmɜ, timi, tama, tumu.

Fínlǣsk

Non-Latin: ð, ȝ, æ, œ, ᵫ, ᛫, ƿ
Unused Latin: c, v, w
Doubled consonants are common
Diacritics: x́, x̄ (found with all vowels), x̨ (found with e, o, œ, u)

In ISO Latin-1 and other "plain text" formats, ä, ö, and ü can be found for æ, œ, and ᵫ respectively.

The earliest texts are often found in Runes, and texts between around 1200 and 1600 in an Insular Uncial style script. Scribal notae and abbreviations are common, especially in the earliest post-runic texts, with ideographic use (based on the Latin meanings of the sounds nominally represented by the marks) as well as straightforward phonetic use.

Germanech

Non-ASCII: Ää Öö Üü Éé
Digraphs: ch cj dj gj tj tz
Common words: a de ez ést héz il la las los

Klingon

Letters D H I S are always capitalised; letters a b ch e gh j l m n o p r t tlh u v w y are always lower-case. Letter q Q may appear in either case. No non-ASCII characters are used.
Unused ASCII: f k x z
c only appears in ch; g only appears in ng and gh; (lower-case) h only appears in ch, gh and tlh.
Fairly frequent use of the apostrophe
Unusual trigraph tlh
Common affixes: vI- yI-; -be' -'a' -moH -laH -mey -taHvIS -wI'
Common words: 'oH 'ej 'ach je neH 'e'
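The case and digraph rules above can be folded into a single pattern. This is a rough Python sketch under stated assumptions: the names `KLINGON_WORD` and `klingon_fraction` are made up for illustration, and the letter inventory is taken only from the clues listed in this section.

```python
import re

# One Klingon "letter": the trigraph and digraphs first, then single letters.
# D H I S are upper-case only; q and Q are both allowed; the apostrophe
# counts as a letter. c, g and lower-case h occur only inside ch, gh, ng, tlh.
KLINGON_WORD = re.compile(r"(?:tlh|ch|gh|ng|[abDeHIjlmnopqQrStuvwy'])+")


def klingon_fraction(text):
    """Return the share of words spelled entirely with Klingon letters."""
    words = re.findall(r"[A-Za-z']+", text)
    if not words:
        return 0.0
    matching = sum(1 for w in words if KLINGON_WORD.fullmatch(w))
    return matching / len(words)
```

A score near 1.0 on a text of reasonable length is a strong hint. Individual English words such as "brown" happen to pass, so the fraction only becomes meaningful over many words.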

Minza

Non-ASCII: ċ č ł ŋ ö ř š ż ž
Unused ASCII: q w x
Digraphs: ch, gh
Combinations: ië, yö, uö, öy, -h after vowels, łř, nř
Common words: ai, ba, ċi, die, en, fi, ida, ja, kam, keh, ła, łu, min, nu, öych, ři, šei, šö, vö, yn, zmi

Qþyn|gài

Non-ASCII: Þþ|ǂáíúýàìùỳ
Unused ASCII: bpmfvweoczj
Combinations: nq qþ rq ql tl hh nǂg n!g n||g ǂk ái áu úi íu ài àu ùi ìu
All words start with a consonant and end with a vowel
Very long words

Regimonti

Regimonti is a Romance language with vocabulary based on classical Latin rather than Vulgar Latin.

Its name is "Rumanşa" in Regimonti
Latin alphabet with three additional characters, è, ņ and ş, which represent the sounds /E/, /J/ and /S/ respectively.
Diphthongs: ai, au, oi, ua
Common words: unu, una, lu, la. First person singular pronoun: O
Listen to the Babel Text in mp3 format

Sasxsek

7-bit ASCII characters only.
All upper case or all lower case letters, no mixed case.
Unused punctuation symbols: ; " ? !
Unused letters: C, Y.
No doubled letters.
Epenthetic X (= /@/) used in compounds.
Single bracket quotes: < >
Apostrophe to break up numbers or long words to make them more readable: 1'000'000
Colon used for abbreviations: k:m: (=kilxmitros)
Proper name marker "li".

http://www.nutter.net/sasxsek
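Several of the Sasxsek clues are purely mechanical and can be checked in a few lines. The sketch below is illustrative only: `sasxsek_violations` is a hypothetical helper, and it covers just the character-level clues (ASCII only, no mixed case, no C or Y, no doubled letters, unused punctuation).

```python
import re


def sasxsek_violations(text):
    """List which of the chart's character-level Sasxsek clues the text breaks."""
    problems = []
    if not text.isascii():
        problems.append("non-ASCII character")
    if any(c.islower() for c in text) and any(c.isupper() for c in text):
        problems.append("mixed case")
    if re.search(r"[cy]", text, re.IGNORECASE):
        problems.append("uses C or Y")
    # Only letters count here; digit groups such as 1'000'000 are fine.
    if re.search(r"([a-z])\1", text, re.IGNORECASE):
        problems.append("doubled letter")
    if re.search(r'[;"?!]', text):
        problems.append("unused punctuation")
    return problems
```

An empty list means the text is compatible with these clues, not that it is certainly Sasxsek. Any non-empty result makes Sasxsek unlikely.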

Tatari Faran

Uses subset of Latin alphabet: a, b, d, e, f, h, i, j, k, m, n, o, p, r, s, t, u.
Unused letters: c, g, l, q, v, w, x, y, z.
No capitalization, not even in proper names.
Glottal stop in words, indicated by apostrophe (').
ts used as a digraph.
d is always word-initial, and r is always medial.
The only consonant clusters are double consonants beginning with m or n.
Common words: ka, kei, ko, sa, sei, so, na, nei, no, ei (never at the beginning of a sentence); e (never at the end of a sentence); da (always follows a word ending in -n).
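The letter-inventory and position clues for Tatari Faran can be sketched as a per-word check. This is an assumption-laden illustration: `tatari_faran_word_ok` is a made-up name, "r is always medial" is read as "r is never the first or last letter", and the consonant-cluster clue is left out.

```python
# Letters listed in the chart, plus the apostrophe used for the glottal stop.
ALLOWED = set("abdefhijkmnoprstu'")


def tatari_faran_word_ok(word):
    """Check one word against the letter and position clues in the chart."""
    if not word or any(ch not in ALLOWED for ch in word):
        return False  # covers unused letters and any capital letter
    if "d" in word[1:]:
        return False  # d must be word-initial
    if word[0] == "r" or word[-1] == "r":
        return False  # r read as "never first or last" (an assumption)
    return True
```

For example, every entry in the common-words list above passes, while a capitalised word or a word such as "red" fails.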

Terzemian

Latin script

Non-ASCII: Ää Åå Čč Ǧǧ Ġġ G̐g̐ Ḳḳ Łł Ł̣ł̣ Ňň Öö Šš Üü Žž ʼ ˚ (G̐g̐ and Ł̣ł̣, bolded in the wiki source, are apparently unique)
Unused ASCII: Jj Qq (except in foreign names)

Þrjótrunn

Non-ASCII: ÁÐÉÍÓÚÝÞÆÖáðéíóúýþæö
Unused ASCII: cqz
Combinations: pp tt kk gj ggj kj kkj
Frequent words: ún únn á í eð er þiss þissi þissa

Tauro-Piscean

Latin: a b d e f g h i j k l m n o p r s t u v w z
Unused Latin: c q x y
Diacritics: ẍ x́ x̀ x̆ x̄ (found with all vowels), x̂ (found with e g s)
There are no digraphs
Doubled consonants used often, doubled vowels never used
Common words: an, tet, habb, zï, heonan, tonan, te, jo, Mann

@@ Line 1: / Line 1: @@
 This article describes a variety of simple clues one can use to determine what conlang a document is written in with high accuracy.
-==[http://www.beckerscarsten.de/conlang/ayeri/ Ayeri]==
+==[http://benung.nfshost.com Ayeri]==
-*Orthography: ''ptkbdgmnvshrly aāeēiīoōuū'' (older transcriptions use ''c'' for ''k'')
+*Orthography: ''ptkcbdgjmnvshrly aāeēiīoōu'' (older transcriptions use ''c'' for ''k'', newer transcription re-uses ''c'' for [ʧ] and ''j'' for [ʤ])
 :*Non-ASCII: ''āēīōū'' (''ạẹịọụ ạ̄ẹ̄ị̄ọ̄ụ̄'' have also been used in older transcriptions)
-:*Unused ASCII: ''cfjqwxyz''
+:*Unused ASCII: ''fqwxyz''
-*Dipthongs: ''au, ay, ey, oy'' (also uses 'iy' frequently, but that is no diphthong)
+*Dipthongs: ''au, ay, ey, oy'' (also, depending on transcription, uses 'iy' frequently, but that is no diphthong)
-*Common words: ''ang, sira, eng, le, si, yam, ya''
+*Common words: ''ang, sa, eng, le, si, yam, ya''
-*Common morphemes: ''-ang, -aris, -reng, -ley, -yam, -ea, -iya, -ara, -in, -on''
+*Common morphemes: ''-ang, -as, -reng, -ley, -yam, -ea/-ya, -iya, -ara''
 *Other common features: Words can get quite long due to its agglutinativeness.
@@ Line 34: / Line 34: @@
 :* Common single-word sentences with ''i'' in the last syllable.
 :* Common words: ''Ke'', ''ve'', ''ke'', ''je'', ''re'' (always clause-final), ''keve'', ''tømø'', ''tɜmɜ'', ''timi'', ''tama'', ''tumu''.
+==[[Fínlǣsk]]==
+* Non-Latin: ð, ȝ, æ, œ, ᵫ, ᛫, ƿ
+* Unused Latin: c, v, w
+* Doubled consonants are common
+* Diacritics: x́, x̄ (found with all vowels), x̨ (found with e, o, œ, u)
+In ISO Latin-1 and other "plain text" formats, ä, ö, and ü can be found for æ, œ, and ᵫ respectively.
+The earliest texts are often found in Runes, and texts between around 1200 and 1600 in an Insular Uncial style script. Scribal notae and abbreviations are common, especially in the earliest post-runic texts, with ideographic use (based on the Latin meanings of the sounds nominally represented by the marks) as well as straightforward phonetic use.
 ==[[Germanech]]==
@@ Line 39: / Line 50: @@
 * Digraphs: ''ch cj dj gj tj tz''
 * Common words: ''a de ez ést héz il la las los''
+==Klingon==
+* Letters D H I S are always capitalised; letters a b ch e gh j l m n o p r t tlh u v w y are always lower-case. Letter q Q may appear in either case. No non-ASCII characters are used.
+* Unused ASCII: f k x z
+* ''c'' only appears in ''ch''; ''g'' only appears in ''ng'' and ''gh''; (lower-case) ''h'' only appears in ''ch'' and ''gh''.
+* Fairly frequent use of the apostrophe
+* Unusual trigraph ''tlh''
+* Common affixes: vI- yI-; -be' -'a' -moH -laH -mey -taHvIS -wI'
+* Common words: 'oH 'ej 'ach je neH 'e'
 ==Minza==
@@ Line 47: / Line 67: @@
 * Common words: ai, ba, ċi, die, en, fi, ida, ja, kam, keh, ła, łu, min, nu, öych, ři, šei, šö, vö, yn, zmi
-==Qþyn|gài==
+==[http://www.kunstsprachen.de/s7 Qþyn|gài]==
 * Non-ASCII: Þþ|ǂáíúýàìùỳ
 * Unused ASCII: bpmfvweoczj
@@ Line 76: / Line 96: @@
 http://www.nutter.net/sasxsek
-==[[Senjecas]]==
-===Latin script===
-*Non-ASCII consonants: <font color=blue>ɱ; þ; ð; ł; ß; к; ħ; ʒ</font> (yogh)
-**Until yogh is made available on Wiki, I am using <font color=blue>ʒ</font> ezh.
-**<font color=blue>к</font> = Kalaallisut (Greenlandic) [[wikipedia:kra (letter)|kra]]
-**On the conlang list: <font color=blue>ɱ = mh, ł = lh, ß = dz, к = k, ħ = jh, yogh = j, r = rh</font>
-*Breve under or over to indicate labialization: <font color=blue>ğ, ð̬</font>
-**On the conlang list labialization is indicated by <font color=blue>ü</font>
-*Cedilla under or apostrophe over to indicate palatalization: <font color=blue>ç, g̓</font>
-**On the conlang list palatalization is indicated by <font color=blue>ï</font>
-*Non-ASCII vowel: <font color=blue>ø</font>; all vowels with acute accent: <font color=blue>í, é, á, ǿ, ó, ú</font>; all vowels with double acute accent: <font color=blue>i̋, e̋, a̋, ø̋, ő, ű</font>
-**On the conlang list the double acute accent is replaced with a circumflex
-*Non-ASCII weak vowels: <font color=blue>ı, ɶ, æ</font>
-**On the conlang list </font color=blue>ı</font> = <font color=blue>ï</font>
-===Cyrillic script===
-*Non-Russian consonants
-**My own invention: <font color=blue>м̀</font>
-**Serbian: <font color=blue>ђ</font>
-**Macedonian: <font color=blue>љ; s; j
-**Tajik: <font color=blue>ғ</font>
-*Obsolete Russian consonants: <font color=blue>o̴</font>
-**<font color=blue>o̴</font> is the old Cyrillic letter fita, an "o" with a tilde through it, derived from the Greek <font color=blue>θ</font>.[http://www.omniglot.com/writing/cyrillic.htm] On the conlang list <font color=blue>θ</font> will be used.
-*Breve under or over to indicate labialization: <font color=blue>г̆, б̬</font>
-*Obsolete Russian vowel: <font color=blue>ѫ</font>; with acute accent: <font color=blue>и́, é, á, ѫ́, ó, ý</font>; with double acute accent: <font color=blue>и̋, e̋, a̋, ѫ̋, ő, y̋</font>
-**On the conlang list the double acute accent is replaced with a circumflex.
-**On the conlang list <font color=blue>ø</font> will be used for <font color=blue>ѫ</font> yus.
-*Palatalization is indicated with the iotacized vowels: <font color=blue>i, є, я, ё, ю</font> (Palatalization does not occur before <font color=blue>ѫ</font>. This is not a rule, it just so happens.).
-*Weak vowels
-**Russian: <font color=blue>ь</font>
-**Belarusian: <font color=blue>ў
-**BulgarianL <font color=blue>ъ</font>
-===Greek script===
-*Consonants: π/? - φ/β - μ̀/μ; τ/δ - θ/? - ?/λ; ?/? - σ/ζ - ρ/ν; к/γ - χ/? - ?/?
-*Vowels: ι, η, α, o, ω, υ; with acute accent: ί, ή, ά, ό, ώ, ύ; with double acute accent: ι̋, η̋, α̋, ο̋, ω̋, υ̋
-**On the conlang list the double acute accent is replaced with a circumflex
-*Weak vowels: ϊ, ϋ, ε
-y
-α β γ δ ε ζ η θ ι κ λ μ ν ξ ο π ρ ς σ τ υ φ χ ψ ω ϊ ϋ ό ύ ώ
- ϐ ϑ h ϙ ϡ
-===Other Scripts===
-The Armenian, Devanagari, Georgian, Hebrew and Tengwar alphabets have also been adapted for Senjecas.
 ==Tatari Faran==
-* Uses subset of Latin alphabet: a, b, d, e, f, h, i, j, k, m, n, o, p, r, s, t, u.
+* Uses subset of Latin alphabet: ''a, b, d, e, f, h, i, j, k, m, n, o, p, r, s, t, u.''
-* Unused letters: c, g, l, q, v, w, x, y, z.
+* Unused letters: ''c, g, l, q, v, w, x, y, z.''
 * No capitalization, not even in proper names.
 * Glottal stop in words, indicated by apostrophe (').
@@ Line 135: / Line 108: @@
 ==[[Terzemian]]==
 ===Latin script===
-* Non-ASCII: Åå Čč Ǧǧ Ňň Öö Šš Üü Žž
+* Non-ASCII: Ää Åå Čč Ǧǧ  Ġġ '''G̐g̐''' Ḳḳ Łł '''Ł̣ł̣''' Ňň Öö Šš Üü Žž ʼ  ˚ (bolded letters are apparently unique)
 * Unused ASCII: Jj Qq (except in foreign names)
-* Vowel Harmony groups:
-** Aa Ee Ii Öö Üü
-** Åå Oo Öö Uu Üü
-** Aa Åå Ee Oo Öö
-* Sentences generally start with a word (the verb) beginning with a multi-consonant cluster
-* Verb may have ''a'', ''e'', or ''ö'' prefixed to the initial cluster
-===Cyrillic script===
-* Non-Russian: Ғғ Ңң Өө Ўў Үү Һһ Ωω
-* Unused Russian: Ее Щщ Ъъ Ыы Ьь Юю Яя (Ыы sometimes used for non-harmonic or non-Terzemian vowels in foreign words)
-* Foreign names not originally written in Cyrillic may occur in Latin orthography
-* Vowel Harmony groups:
-** Аа Ээ Ии Өө Үү
-** Ωω Оо Өө Уу Үү
-** Аа Ωω Ээ Оо Өө
-* Sentences generally start with a word (the verb) beginning with a multi-consonant cluster
-* Verb may have ''а'', ''э'', or ''ө'' prefixed to the initial cluster
-===Arabic script===
-* Not in Standard Arabic: اً څ چ اِ گ ڽ اَ اَِ هَِ ۆ ژ
-* Standard but not used: ة ث ج ح ذ ص ض ط ظ ع ق ى
-* Vowel Harmony groups:
-** ا اِ هِ اَِ هَِ
-** اً اَ هَ هَِ
-** ا اً اِ اَ اَِ
-* Sentences generally start with a word (the verb) beginning with a multi-consonant cluster
-* Verb may have اِِ ,ا, or اَِ prefixed to the initial cluster
-==Þrjótrunn==
+==[http://www.kunstsprachen.de/s17/ Þrjótrunn]==
 * Non-ASCII: ÁÐÉÍÓÚÝÞÆÖáðéíóúýþæö
 * Unused ASCII: cqz
 * Combinations: pp tt kk gj ggj kj kkj
 * Frequent words: ún únn á í eð er þiss þissi þissa
+==[[Tauro-Piscean_language|Tauro-Piscean]]==
+*Latin: a b d e f g h i j k l m n o p r s t u v w z
+*Unused Latin: c q x y
+*Diacritics: ẍ x́ x̀ x̆ x̄ (found with all vowels), x̂ (found with e g s)
+*There are no digraphs
+*Doubled consonants used often, doubled vowels never used
+*Common words: an, tet, habb, zï, heonan, tonan, te, jo, Mann