Language identification

The goal of language identification is to determine the language of a given text, typically with statistical methods. It is also called language guessing (1).

Klingon has several distinctive characteristics that help to identify it (2):
  • Capitalization is significant: q and Q are different letters; D, I and S appear only capitalized; ch, tlh and v appear only in lower case.
  • The apostrophe appears relatively often, usually at the beginning or the end of a syllable (see Phonology).
  • The apostrophe also appears in the middle of a word, sometimes doubled.
  • -be' and -'a' appear frequently as suffixes.
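
The characteristics above can be counted mechanically. The following Python sketch is only an illustration: the function name klingon_feature_counts and the choice of counters are assumptions made for this example, not part of any existing tool. High counts hint at Klingon, but they do not prove it.

```python
import re


def klingon_feature_counts(text: str) -> dict:
    """Count surface features that are typical of written Klingon."""
    # Treat the typographic apostrophe like the plain one.
    text = text.replace("\u2019", "'")
    # Strip common punctuation, but keep apostrophes: they are letters in Klingon.
    words = [w.strip(".,!?;:\"()") for w in text.split()]
    return {
        # Every apostrophe, wherever it stands.
        "apostrophes": text.count("'"),
        # Two apostrophes in a row (non-overlapping occurrences).
        "doubled_apostrophes": text.count("''"),
        # A capital D, I, Q or S directly after a lower-case letter or apostrophe.
        "mid_word_capitals": len(re.findall(r"[a-z'][DIQS]", text)),
        # Words ending in the suffixes -be' or -'a'.
        "be_or_a_suffixes": sum(w.endswith(("be'", "'a'")) for w in words),
    }


if __name__ == "__main__":
    print(klingon_feature_counts("tlhIngan Hol Dajatlh'a'"))
```

An English sentence normally scores zero on the last three counters, while even a short Klingon phrase usually scores on several of them.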

What Klingon does NOT have:
  • There are no letters f, k, x, z.
  • No word starts with a vowel.
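
These two exclusion rules can serve as a quick negative test. The sketch below is a hypothetical helper (the name offending_words is invented for this example) that lists the words breaking either rule. It assumes the text is written in the usual romanization, and foreign names inside a Klingon text may trigger false alarms.

```python
FORBIDDEN_LETTERS = set("fkxz")
VOWELS = set("aeiou")


def offending_words(text: str) -> list:
    """Return the words that cannot be Klingon according to the two rules."""
    offenders = []
    for raw in text.split():
        # Strip punctuation but keep apostrophes: a leading apostrophe is a
        # consonant in Klingon, so such a word does not start with a vowel.
        word = raw.strip(".,!?;:\"()")
        if not word:
            continue
        has_forbidden_letter = bool(FORBIDDEN_LETTERS & set(word.lower()))
        starts_with_vowel = word[0].lower() in VOWELS
        if has_forbidden_letter or starts_with_vowel:
            offenders.append(word)
    return offenders


if __name__ == "__main__":
    print(offending_words("tlhIngan Hol Dajatlh'a'"))
    print(offending_words("an example of fox"))
```

An empty result does not confirm that a text is Klingon. It only means that neither rule was violated.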

guesslanguage.js (3) is an open source JavaScript project that recognizes Klingon and other languages. It works with statistical data based on n-grams.
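
To illustrate the general idea of n-gram statistics, the following Python sketch compares character trigram profiles using cosine similarity. It is not the code or the API of guesslanguage.js. The sample texts, the names trigram_profile and guess_language, and the choice of cosine similarity are all assumptions made for this example, and real tools train on far larger text collections.

```python
from collections import Counter
from math import sqrt


def trigram_profile(text: str) -> Counter:
    """Count overlapping character trigrams; case is kept because q and Q differ."""
    padded = " " + " ".join(text.split()) + " "
    return Counter(padded[i:i + 3] for i in range(len(padded) - 2))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two trigram count vectors."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm_a = sqrt(sum(c * c for c in a.values()))
    norm_b = sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Tiny illustrative samples; far too small for real use.
SAMPLES = {
    "klingon": "Heghlu'meH QaQ jajvam nuqneH Qapla' tlhIngan Hol Dajatlh'a' "
               "bortaS bIr jablu'DI' reH QaQqu' nay'",
    "english": "the quick brown fox jumps over the lazy dog and today is "
               "a good day to learn a language",
}
PROFILES = {lang: trigram_profile(sample) for lang, sample in SAMPLES.items()}


def guess_language(text: str) -> str:
    """Return the sample language whose trigram profile is most similar."""
    profile = trigram_profile(text)
    return max(PROFILES, key=lambda lang: cosine_similarity(profile, PROFILES[lang]))


if __name__ == "__main__":
    print(guess_language("tlhIngan maH"))
    print(guess_language("the lazy dog"))
```

Keeping upper and lower case apart is a deliberate choice here, because capitalization carries meaning in Klingon.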

References

1 : Language identification on Wikipedia, retrieved 21 October 2016

2 : Wikipedia:Language recognition chart, retrieved 21 October 2016

3 : Project page on GitHub, retrieved 21 October 2016

Category: General    Latest edit: 24 Jul 2017, by KlingonTeacher    Created: 24 Feb 2017 by DirkSchlSser
 
The Klingon Language Wiki is a private fan project to promote the Klingon language. See Copyright notice for details.