Language Families


Introduction to the more important language families including
Indo-European, Uralic, Altaic, Afro-Asiatic, Sino-Tibetan, Malayo-Polynesian and others


What are Language Families?

It appears that the use of language came about independently in a number of places.

All languages change with time. A comparison of Chaucer's English, Shakespeare's English and Modern English shows how a language can change over several hundred years. Modern English as spoken in Britain, North America and Australia uses different words and grammar.

If two groups of people speaking the same language are separated, then their languages will change along different paths. First they develop different accents; next some of the vocabulary will change (either due to influences of other languages or by natural processes). When this happens a different dialect is created; the two groups can still understand each other. If the dialects continue to diverge there will come a time when they are mutually unintelligible (in other words, the people are speaking different languages).

When the Western Roman Empire collapsed in the 5th Century AD, the Latin speakers in different parts of Europe (Italian Peninsula, Gaul, Iberian Peninsula, Carpathia) became isolated from each other. Their languages evolved along independent paths to give us the modern languages of Italian, French, Spanish, Portuguese and Romanian.

In time, with enough migrations, a single language can evolve into an entire family of languages.

Each language family described below is a group of related languages with a common ancestor. Languages in the same branch are sister languages that diverged within the last 1000 to 2000 years (Latin, for example, gave rise to the Latin Branch languages in the Indo-European Family).

Languages in different branches of the same family can be referred to as cousin languages. For most families these languages would have diverged more than 2000 years ago.

Languages in the same family share many common grammatical features, and many of the key words, especially older words, show their common origin. I'll show that with the word month in several Indo-European languages:

English month
Dutch maand
German Monat
Swedish månad
Welsh mis
Gaelic mí
French mois
Spanish mes
Portuguese mês
Italian mese
Polish miesiąc
Russian myesyats
Lithuanian mėnuo
Albanian muaj
Greek minas
Farsi mâh
Hindi mahina

Compare that with the word for month in languages that are not Indo-European.

Arabic shahr
Finnish kuukausi
Basque hilabethe
Turkish ay

The difference between a language and a dialect can be political rather than linguistic. For example, Croatian and Serbian are linguistically closely related dialects of the same language. However, they are written in different scripts and are spoken by people of different religions living in Catholic Croatia and Orthodox Serbia respectively. As such they are called different languages for political reasons.

Macedonian is considered by Bulgarians as a dialect of their language while Macedonians themselves consider it a separate language. Since Bulgaria has long claimed Macedonia as part of its territory, the reasons for each view are obvious!

Low German (spoken in Northern Germany) and Dutch (Netherlands) are linguistically dialects but politically separate languages. Low German and Swiss German are mutually unintelligible but are both considered to be German. Similarly, the Arabic of Iraq and the Arabic of Morocco are both called Arabic but they differ greatly. The Mandarin-speaking government of China considers China's other languages to be dialects. These political elements will be generally ignored in this essay.

The study of languages and their relationships gives us information about how people have migrated during historical times. It also helps with the dating of developments like plant domestication and the development of tools.

For the rest of this essay, an atlas would be handy. We can now look at a selection of language families.


The Indo-European Family

The most widely studied language family in the world is the Indo-European. There are a number of reasons for this:

The Indo-European languages tend to be inflected (ie verbs and nouns have different endings depending on their part in a sentence). Some languages (eg English) have lost many of the inflections during their evolution.

The Indo-European languages stretch from the Americas through Europe all the way to North India. The Family is thought to have originated in the forests north of the Black Sea (in what is now Ukraine).

The Family is divided into twelve branches. I will describe each of these separately.

The Celtic Branch

This is now the smallest branch. The languages originated in Central Europe and once dominated Western Europe. The people migrated across to the British Isles over 2000 years ago. Later, when the Germanic speaking Anglo Saxons arrived, the Celtic speakers were pushed into Wales (Welsh), Ireland (Irish Gaelic) and Scotland (Scottish Gaelic).

One group of Celts moved back to France. Their language became Breton spoken in the Brittany region of France. Breton is closer to Welsh than to French.

Other Celtic languages have become extinct. These include Cornish (Cornwall in England), Gaulish (France), Cumbric (northern England and southern Scotland), Manx (Isle of Man), Pictish (Scotland) and Galatian (spoken in Anatolia by the Galatians mentioned in the Christian New Testament).

Welsh has the word order Verb-Subject-Object in a sentence. Irish has the third oldest literature in Europe (after Greek and Latin).

The Germanic Branch

These languages originate from Old Norse and Anglo Saxon. Due to the influence of early Christian missionaries, the vast majority of the Celtic and Germanic languages use the Latin Alphabet.

They include English (the second most spoken language in the world, the most widespread, the language of technology, and the language with the largest vocabulary). It is a useful language to have as your mother tongue.

Dutch and German are the closest major languages related to English. An even closer relative is Frisian. Flemish and Afrikaans are varieties of Dutch while Yiddish is a variety of German. Yiddish is written using the Hebrew script.

Three of the four Scandinavian languages belong to this branch: Danish, Norwegian and Swedish. Swedish has tones, which is unusual in European languages. The fourth Scandinavian language, Finnish, belongs to a different family.

Icelandic is the least changed of the Germanic Languages - being close to Old Norse. Another old language is Faroese.

Gothic (Central Europe), Frankish (France), Lombardic (Danube region), Visigothic (Iberian Peninsula) and Vandalic (North Africa) are extinct languages from this branch.

German has a system of four cases and three genders for its nouns. Case is the property where a noun takes a different ending depending on its role in a sentence. An example in English would be the forms: lady, lady's, ladies and ladies'. The genders are masculine, feminine and neuter.

English has lost gender and case. Only a few words form their plurals like German (ox, oxen and man, men). Most now add an s, having been influenced by Norman French.

The Latin Branch

Also called the Italic or Romance Languages.

These languages are all derived from Latin. Latin is one of the most important classical languages. Its alphabet (derived from the Greek alphabet) is used by many languages of the world. Latin was long used by the scientific establishment and the Catholic Church as their means of communication.

Italian and Spanish are the closest modern major languages to Latin. French has moved farthest from Latin in pronunciation; only its spelling gives a clue to its origins. French has many Germanic and Celtic influences. Romanian has picked up Slavic influences because it is a Latin Language surrounded by a sea of Slavic speakers. Portuguese has been separate from Spanish for over 1000 years. The most important of these languages is Spanish, spoken in most of Latin America (apart from Brazil, which is Portuguese speaking, and a few small states like Belize and Guyana).

Romansh is a minority language in Switzerland. Ladino was the language spoken by Spain's Jewish population when they were expelled in 1492. Most of them now live in Turkey and Israel. Provençal and Catalan are closely related languages spoken in the south of France and the north of Spain, respectively.

Note that Basque (spoken in parts of Spain and France) is not an Indo-European language - in fact it is totally unrelated to any other language of the world.

Galician is a Portuguese dialect with Celtic influences spoken in the north west of Spain. Finally, Moldavian is a dialect of Romanian spoken in Moldova. Under the Soviets the Moldavians had to use the Cyrillic alphabet. Now they have reverted to the Latin alphabet.

Apart from Latin, other extinct languages include Dalmatian, Oscan, Sabine and Umbrian.

Latin had three genders and at least six cases for its nouns and a Subject-Object-Verb sentence structure. Most modern Romance languages have only two genders, no cases and a Subject-Verb-Object structure.

The Slavic Branch

These languages are confined to Eastern Europe.

In general, the Catholic peoples use the Latin alphabet while the Orthodox use the Cyrillic alphabet, which is derived from the Greek. Indeed, some of the languages are very similar, differing only in the script used (Croatian and Serbian are virtually the same language).

One of the oldest of these languages is Bulgarian. The most important is Russian. Others include Polish, Kashubian (spoken in parts of Poland), Sorbian (spoken in parts of eastern Germany), Czech, Slovak, Slovene, Macedonian, Bosnian, Ukrainian and Byelorussian.

The Slavic languages are famed for their consonant clusters and large number of cases for nouns (up to seven). Macedonian has three definite articles indicating distance; all are suffixes: VOL (ox), VOLOT (the ox), VOLOV (the ox here), VOLON (the ox there).

The Baltic Branch

Three Baltic states but only two Baltic Languages (Estonian is related to Finnish).

Lithuanian is one of the oldest of the Indo-European languages. Its study is important in determining the origins and evolution of the family. Lithuanian and Latvian both use the Latin script and have tones. Lithuanian has three numbers: singular, dual and plural.

Prussian is an extinct language from this branch.

The Hellenic Branch

The only extant language in this branch is Modern Greek.

Greek is one of the oldest Indo-European languages. Mycenaean dates from 1300BC. The Ancient Greek of Homer was written from around 700BC. The major forms were Doric (Sparta), Ionic (Cos), Aeolic (Lesbos), and Attic (Athens). The latter is Classical Greek.

The New Testament of the Christian Bible was written in a form of 1st Century AD Greek called Koine. This developed into the Greek of the Byzantine Empire. Modern Greek has developed from this.

Greek has three genders and four cases for nouns but no form of the verb infinitive. The language has its own script, derived from Phoenician with the addition of symbols for vowels. It is one of the oldest alphabets in the world and has led to the Latin and Cyrillic alphabets. The Greek Alphabet is still used in science and mathematics.

The Illyric Branch

Another single language branch. Only Albanian (strongly influenced by the Slavic languages) belongs to this branch. It has been written in the Latin script since 1908; this replaced the Arabic script. Albanian has many avoidance words. Instead of saying wolf, the phrase may God close its mouth is used.

There are two dialects so different that they could be considered separate languages. Geg is spoken in the north of Albania and Kosovo. Tosk is spoken in southern Albania and north west Greece.

The Anatolian Branch

This branch includes the language of the Hittite civilisation which once ruled central Anatolia, fought the Ancient Egyptians and was mentioned in the Christian Bible's Old Testament. Other languages were Lydian (who ruled the south coast of Anatolia), Lycian (a Hellenic culture along the western coastal regions), Luwian and Palaic.

All languages in this branch are extinct.

The Armenian Branch

This is represented by a single language, Armenian. It has its own script.

Armenian is spoken in Armenia and Nagorno-Karabakh (an enclave in Azerbaijan). The language is rich in consonants and has borrowed much of its vocabulary from Farsi (Iranian). Nouns have 7 cases and the past tense of verbs take an E prefix like Greek.

The Iranian Branch

These languages are descended from Ancient Persian, the literary language of the Persian Empire and one of the great classical languages. Avestan is the extinct language of the Zoroastrian religion.

The main language of this branch is Farsi (also called Iranian), the language of Iran and much of Afghanistan. Kurdish is a close relation. Kurdish is spoken in Turkey, Syria, Iran and Iraq by the Kurds. It is the second largest of the Iranian languages after Farsi. In Turkey it was banned until recently.

Pashto is spoken in Afghanistan and parts of north west Pakistan. Baluchi is spoken in the desert regions between Iran, Afghanistan and Pakistan. These languages are written in the Nastaliq script, a derivative of Arabic writing. It is interesting that you cannot tell which family a language belongs to by the way it is written.

Ossetian is found in the Caucasus mountains, north of Georgia. Tadzhik is a close relative of Farsi, written in Cyrillic and spoken in Tadzhikistan (of the former USSR).

The Indic Branch

This branch has the most languages. Most are found in North India. They are derived from Sanskrit (the classical language of Hinduism dating from 1000BC). This gave rise to Pali (the language of Buddhism), Ardhamagadhi (the language of Jainism) and the ancestors of the modern North Indian languages.

Of the modern North Indian languages, Hindi and Urdu are very similar but differ in the script. The Hindi speakers are Hindus and use the Sanskrit writing system called Devanagari (writing of the Gods). Urdu is spoken by the Muslims so uses the Arabic Nastaliq script. These two languages are found in north and central India and Pakistan. Nepali is closely related to Hindi.

In India most of the states have their own language. These languages either use Devanagari script or a derivation (if the people are Hindus) or the Arabic Nastaliq script (if the people are Muslims).

These languages include Bengali (West Bengal as well as Bangladesh), Oriya (in Orissa), Marathi (in Maharashtra), Assamese (in Assam), Punjabi (from the Punjab), Kashmiri (Kashmir), Sindhi (the Pakistan province of Sindh - written in Nastaliq), Gujerati (Gujerat in western India), Konkani (in Goa, an ex Portuguese colony, uses the Latin script), Sinhalese (Sri Lanka - uses its own script derived from Pali) and Maldivian (Maldives - with its own script).

The most surprising language in this branch is Romany, the language of the Gypsies. Gypsies migrated to Europe from India.

Sanskrit had three genders as has Marathi; most modern Indic languages have two genders; Bengali has none.

The fascinating point about India is that the south Indian languages (like Tamil) are not Indo-European. In other words, Hindi is related to English, Greek and French but is totally unrelated to Tamil. North Indians visiting Madras (in the south) are as baffled by Tamil as a foreigner would be.

The Tokharian Branch

Turfanian and Kuchean are recently identified extinct languages once spoken in north west China. Very little is known about this branch as only a few manuscripts dating from 600 AD are in existence. The closest relatives of these languages are the Celtic, Hittite and Latin branches.


Apart from the Indo-European Family, there are others.
A brief description of a few of these other families follows.


The Uralic Family

Not all European languages are Indo-European. There are three European languages that are members of the Uralic Family. The family is named from the Ural mountains. The people speaking these languages originated from the Siberian side of the Urals. Over 1500 years ago they migrated to Europe and have become entirely Europeanised. Their languages tell the story of their migrations.

In the Finnic Branch, Finnish and Estonian are closely related. Languages in the Ugric Branch (like Hungarian) are very different having separated from the Finnic ones around 3000 years ago. Hungarian's closest relatives (Ostyak, Vogul) are found in central Siberia.

The majority of the languages in this family are spoken in Siberia (Mordvin, Komi, Nenets) apart from Sámi which is spoken in Lapland (northern Scandinavia).

Yukaghir (spoken in eastern Siberia) uses a pre-literate form of pictograms similar to those of some native Americans.

The Uralic Languages have many suffixes. Finnish, for example, behaves as if it had 15 noun cases, Hungarian has 17. Country names in Finnish are difficult to recognise. Finland, for example, is Suomi. Mordvin has complex verbs varying for subject and object over four tenses and 7 moods.

The Altaic Family

The Altaic Family is named after the Altai Mountains in Central Asia. These people were nomadic horsemen living in the plains. One group migrated towards Europe, the other group migrated towards the Korean Peninsula and the islands of Japan.

Turkish is the most westerly member of this family as well as the most spoken. Many of the others are spoken in former USSR republics: Azeri (in Azerbaijan), Turkmen (in Turkmenia), Kazakh (in Kazakhstan), Kirghiz (in Kyrgyzstan) and Uzbek (in Uzbekistan, land of Genghis Khan). Uigur is spoken in Western China east of the Pamir Mountains.

Mongolian is found in Mongolia (where it is written in the Cyrillic script) and Northern China (with a script that goes down rather than horizontal). Korean and Japanese are the most easterly Altaic languages.

The scripts used by these languages depend on historical or political factors. Turkish uses a Latin-based script, while the ex-Soviet languages and Mongolian use the Cyrillic alphabet. Korean has its own peculiar script. Korean writing evolved separately from all the other scripts in the world, having been invented six hundred years ago. The language used to be written in Chinese characters.

Japanese is still written with Chinese characters (called Kanji) but there are two other scripts, both syllabic. Hiragana is used to indicate prefixes and suffixes while Katakana is used for foreign words.

The Altaic languages have lots of suffixes and a property called vowel harmony. This means that the vowels are divided into two groups. Words will either have one type of vowel or the other. All the suffixes have two forms, one for each type of vowel. In Turkish, the plural is formed by the addition of LER or LAR. The suffixes themselves can be glued on one after the other. For example, EV is house, EV-LER is houses, EVLER-IMIZ is our houses, EVLERIMIZ-E is to our houses, etc. Languages that behave in this manner are called agglutinating. Turkish is one of the most regular languages in the world. It has one irregular noun (water) and one irregular verb (to be).
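The way vowel harmony picks a suffix form, and the way suffixes are glued on in sequence, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are invented here, the word kitap (book) and the back-vowel suffix forms -ımız and -a are standard Turkish but do not come from the text above, and the sketch only handles lowercase stems ending in a consonant.

```python
# Turkish vowels split into the two harmony groups described above.
FRONT_VOWELS = set("eiöü")
BACK_VOWELS = set("aıou")


def has_front_harmony(word):
    """Return True if the last vowel of a lowercase word is a front vowel."""
    for letter in reversed(word):
        if letter in FRONT_VOWELS:
            return True
        if letter in BACK_VOWELS:
            return False
    raise ValueError(f"no vowel found in {word!r}")


def add_suffix(word, front_form, back_form):
    """Glue on whichever suffix form matches the word's last vowel."""
    return word + (front_form if has_front_harmony(word) else back_form)


def to_our_plural(noun):
    """Build 'to our <noun>s' by adding three suffixes one after the other."""
    plural = add_suffix(noun, "ler", "lar")      # houses
    ours = add_suffix(plural, "imiz", "ımız")    # our houses
    return add_suffix(ours, "e", "a")            # to our houses


print(to_our_plural("ev"))     # evlerimize
print(to_our_plural("kitap"))  # kitaplarımıza
```

Each step looks only at the last vowel of the word built so far, which is why the same three calls give front-vowel suffixes for ev and back-vowel suffixes for kitap.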

Japanese and Korean have highly complex honorific forms for verbs depending on the social level of the speaker and the one spoken to.

All languages are influenced by languages they are in contact with. At the two extremes of the Altaic family, Turkish has many Arabic words while Korean and Japanese have many from Chinese.

Some linguists do not include Korean and Japanese in this family. Others link the Uralic and Altaic families together.

The Sino-Tibetan Family

The Sino-Tibetan Family is an important Asian family of languages. It contains the world's most spoken language, Mandarin.

The languages in this family are monosyllabic tonal languages. Words are made up of single syllables: Mandarin has over 1600. GUO - country, MEN - gate, WO - I, REN - person, AN - peace. The syllables themselves have tones. This means that the voice can be high, low, rising, falling, etc, just like singing. It is like the way many people raise the voice at the end of a question. As an example the syllable, MEN can mean gate or we depending on tone. Mandarin has four tones, Thai has five (MAI can mean not, burn, wood or no depending on tone), Cantonese has nine and Kam-Sui has 15.

The languages in the Sinitic Branch are the various languages of China (Mandarin, Cantonese, Wu, Gan, Min, Hakka, Xiang, Yue). They are all written in Chinese characters. Each syllable has a different character so that the writing is not alphabetic. There are over 50,000 characters, 6000 of which are needed to read a newspaper. Even though the different languages have different pronunciations, the meanings of characters are the same.

The languages in the Tibeto-Burman Branch are spoken in Burma (Burmese, Karen), Thailand and Laos (Lisu, Lahu), Southern China (Chin, Yi), Tibet (Tibetan), Bhutan (Dzongkha), Nepal (Sherpa, Newari), and eastern India (Mizo, Manipuri).

When written, the scripts are derived either from the curly scripts of south India or the angular scripts of north India.

The Tai and Southern Branches are spoken in Thailand and Laos (Thai and Lao written in curly south Indian scripts, and the unwritten Shan) and amongst the tribal people of Southern China (Chuang, Yao, She).

Thai has noun classifiers. These are groups of words that go with certain types of nouns. KHON goes with people nouns (except royalty or sacred people), TUA goes with animals, LEM goes with sharp or pointed objects, KHAN goes with objects with handles.

The language family is thought to have originated in northern China around the Yangtze River valley.

Some linguists consider the Tai Languages to be a separate family.

The Malayo-Polynesian Family

Also known as Austronesian, the Malayo-Polynesian Family is made up of over 1000 languages spread throughout the Indian and Pacific Oceans as well as South East Asia. Although covering a large geographical area, the languages are remarkably uniform in structure.

The most common are Malay and Indonesian (which are actually dialects of a single language). Malay was written in the Arabic script until the 20th Century when the Latin alphabet was adopted.

This family includes the languages of Indonesia: Javanese, Sundanese, Madurese (all from Java), Batak (Sumatra), Balinese (Bali) and Tetun (Timor). It also includes the languages of the Philippines (Tagalog, Ilocano, Visayan) and the many non-Chinese languages of Taiwan (like Amis, Atayal, Paiwan, Tsou). These languages are also found in Indo-China: Cham is spoken in Vietnam. It was the language of a pre-Vietnamese Hindu Champa Empire. The present speakers are Muslim. In the Pacific, the family includes languages like Maori (New Zealand), Fijian, Tahitian, Rapa Nui (Easter Island), Chamorro (Guam) and Hawaiian.

An interesting exception is Malagasy, which is spoken in Madagascar, a large island off the coast of southern Africa. Its nearest linguistic relative is spoken in Borneo. Over 1500 years ago, people from the islands of Indonesia migrated in boats across the Indian Ocean to Madagascar. Here, they picked up African culture, but their language gives away their origins.

These languages have fairly simple noun and verb forms. Malay has no inflections for tense or case. Plurals are made by doubling the word (ANAK - child, ANAK ANAK - children). This is called Reduplication and is commonly used to enhance grammatical meanings. Passive forms of verbs are commonly used (let the guide be followed rather than follow the guide).
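Reduplication as described above is simple enough to show as a one-line rule. The sketch below is only an illustration: the function name and the separator parameter are made up here, and the only word used is the ANAK example from the text.

```python
def reduplicate(word, separator=" "):
    """Form a plural by repeating the whole word, as in ANAK -> ANAK ANAK."""
    return f"{word}{separator}{word}"


print(reduplicate("ANAK"))  # ANAK ANAK
```

The separator is left as a parameter so that the doubled word can be written with a space, as in this essay, or with a hyphen.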

Javanese has a special vocabulary used to and by chiefs. Some peoples have secret languages used only by certain trades, like fishermen and miners. Balinese has three formal registers. The word eat is NAAR in the lowest formality, NEDA in the middle formality, NGADJENGANG in the most formal. In Cham, men and women's speech differs.

The possessive pronouns (my / our) are more complex than the noun forms and have differing forms depending on the item possessed. In some of the Pacific languages, the possessive pronouns have a form for alienable possession (something that is possessed temporarily like a car or book), and a form for inalienable possession (something that is always possessed like body parts).

Some languages have two forms of the personal pronoun, we. One form is used if it includes the person or people addressed (inclusive) and another form if the person addressed is not included (exclusive).

The Pacific languages are characterised by few consonants and vowels. Hawaiian has only 8 consonants (H, K, L, M, N, P, W and the glottal stop) and 5 vowels (A, E, I, O, U). There is a preference for open syllables (like in the names of the islands FI JI and TA HI TI).

Tagalog and Maori have a Verb-Subject-Object word order. Malagasy has the word order Verb-Object-Subject.

The speakers of this language family are thought to have originated in southern China (the Yellow River valleys) and migrated via Taiwan into the islands of the Philippines (about 2500BC), Indonesia and out into the Pacific (about 1000BC).

The Afro-Asiatic Family

The Afro-Asiatic Family is dominated by Arabic, an important modern and classical language. It is the language of the Quran and Islam.

The other languages in the Semitic Branch of this family include Maltese, which is written in the Latin script because the Maltese are Catholic. Hebrew is another important classical language with its own script. It is the language of Judaism and of the Bible. By the 1st Century BC it had become a liturgical language for Judaism. A modern form was revived and is now spoken in Israel where it is called Ivrit.

Amharic is the language of Ethiopia and has its own script. Tigrinya is spoken in the Horn of Africa.

Many important ancient languages belong to this branch. Akkadian (the language of the Assyrian Empire) used the Cuneiform writing system to write pre-Biblical flood and creation stories. Others include Phoenician and its close relatives Ugaritic (for which the alphabet was invented) and Punic (the language of Carthage); Nabatean, an ancestor of Arabic spoken in Petra; and Syriac, a liturgical language of the early Christian church. The most interesting is Aramaic, once the administrative language of the Persian Empire, later the language of Palestine during Roman times. It now survives in small pockets in Syria, Iraq, Turkey and Iran.

The Berber Branch is spoken in the hills of North Africa by the Berbers (Tuareg, Kabyle). Also in the branch was Guanche, spoken on the Canary Islands until becoming extinct in the 16th Century.

People in Ethiopia, Eritrea, Sudan and Somalia speak languages of the Cushitic Branch (Somali, Galla, Beja, Afar).

Hausa, the most important member of the Chadic Branch, is the main language of Nigeria. It was once written in the Arabic script but now uses the Latin alphabet. The Chadic Branch contains 600 languages spoken in Nigeria, Chad and Cameroon.

The Egyptian Branch contains Egyptian, the language of Ancient Egypt written in hieroglyphics. Coptic is the liturgical language of the Egyptian Coptic Church. It uses a Greek based alphabet. It is extinct as a spoken language.

These languages have grammars based on roots of three consonants. For example, in Arabic itself, the consonant triplet KTB has to do with writing. KiTaB is book. Plurals are all irregularly formed and the usual way is to change the vowels. KuTuB is books. Other words with the KTB root have something to do with writing: KaTaBa - to write, KaTtaBa - to make someone write (ie to teach), maKTaB - office, KaaTiB - clerk, maKTaBa - library, miKTaB - typewriter, KuTuBii - bookseller, maKTuuB - letter. The consonants give the root meaning while the vowels, suffixes and prefixes give the grammatical meaning.
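The root-and-pattern idea can be sketched as slotting the three root consonants into a template of vowels, prefixes and suffixes. The sketch below is only an illustration: the digit notation for templates (1, 2 and 3 standing for the first, second and third root consonant) and the function name are made up here, and the templates are read off the KTB examples above rather than taken from a grammar of Arabic.

```python
def apply_pattern(root, template):
    """Slot the three consonants of a root into a template.

    In the template, the digits 1, 2 and 3 stand for the first, second
    and third consonant of the root; every other character is copied.
    """
    if len(root) != 3:
        raise ValueError("a root must have exactly three consonants")
    letters = []
    for char in template:
        if char in "123":
            letters.append(root[int(char) - 1].upper())
        else:
            letters.append(char)
    return "".join(letters)


# Templates read off the KTB words listed above (hypothetical notation).
PATTERNS = {
    "book": "1i2a3",
    "books": "1u2u3",
    "to write": "1a2a3a",
    "office": "ma12a3",
    "clerk": "1aa2i3",
    "letter": "ma12uu3",
}

for meaning, template in PATTERNS.items():
    print(meaning, "->", apply_pattern("KTB", template))
```

The root supplies only the capital letters; everything in lower case comes from the template, which mirrors the split between root meaning and grammatical meaning described above.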

The Arabic alphabet mainly uses consonants because the reader can supply the correct vowels from the context. The first alphabets were invented by speakers of Semitic languages and so had no vowels.

Unusually for this family, Somali has 20 separate vowel sounds. It also has four tones which indicate gender, number and case.

This language family originated in the Sahara area before it became a desert and spread to the Horn of Africa, North Africa and the Middle East. During the 7th Century AD, Arabic spread from the Arabian Peninsula with Islam to cover most of North Africa and the Middle East.

The Caucasian Family

The Caucasian Family is named after the Caucasus Mountains between the Black Sea and the Caspian Sea. This is a very linguistically diverse region.

The languages include Georgian (Georgia), Chechen and Ingush (both found in Chechnya in southern Russia), and Avar (one of the many languages in a region called Dagestan). Urartian (extinct language of the Urartu Empire of Eastern Turkey) also belongs to this family.

Some linguists consider that these languages may actually be three separate families. Basque is sometimes considered to be related to these languages.

The languages are dominated by difficult consonant clusters. Ubykh (an extinct language whose last speaker died in 1992 in eastern Turkey) had 80 separate consonant sounds. Kabardian (spoken in southern Russia) has only three vowels which often disappear in speech.

Many of these languages have a large number of noun cases. Tsez (spoken in a small region between Georgia and Chechnya) has 42.

The languages also have a property called ergativity. This means that the subject of a transitive verb is marked differently from the subject of an intransitive verb. Transitive verbs can take an object (see, hear); intransitive verbs cannot take an object (go, walk).

The Dravidian Family

North India is dominated by languages of the Indo-European Family.

The languages of the Dravidian Family are the very difficult-sounding languages of South India. These include the major languages Tamil (spoken in the Indian state of Tamil Nadu, northern Sri Lanka, Singapore and Malaysia), Malayalam (Kerala state), Kannada (from Karnataka) and Telugu (Andhra Pradesh). Each has its own script which has the curved appearance typical of South Indian writing.

Pockets of these languages are found in central India (Gondi, Kurukh, Kui), western India (Tulu) and in the Indus Valley of southern Pakistan (Brahui).

Elamite, a language known from inscriptions in Western Iran, is now thought to have been Dravidian.

These languages are distinguished by retroflex consonants, which have been borrowed by the Indic Branch of the Indo-European Languages. These consonants give Indian languages their distinctive sound and are formed with the tongue rolled up to the top of the mouth. The languages are agglutinating with up to 8 noun cases.

The languages once covered all of the Indian sub-continent and originated in the Indus Valley (modern Pakistan).

The Austro-Asiatic Family

The Austro-Asiatic Family is a scattered group of languages in Asia. They are found from eastern India to Vietnam. The family once covered a larger area until Tai language speakers migrated south from southern China.

The Viet-Muong Branch includes Vietnamese and Muong (both languages of Vietnam). The former is written in a form of the Latin script.

The Mon-Khmer Branch includes Khmer (the language of Cambodia written in a derivative of South Indian scripts), Mon (once a major language of a Thai empire; now spoken in parts of Burma, Thailand, China and Vietnam), Palaung (a tribal language in the hills of Burma and Thailand), So (Laos and Thailand), Nicobarese and Nancowry (both from the Nicobar Islands of the Indian Ocean).

The so-called Aslian languages are found in the hills of peninsular Malaysia and include Sengoi and Temiar.

The languages of the Munda Branch are found scattered in pockets of north India (Mundari, Santali in the state of Bihar and Khasi in Assam).

These languages are not tonal, apart from Vietnamese, where tones developed recently under Chinese influence. Vietnamese was once thought not to be related to other languages. The branches of this family were originally considered to be separate families.

Niger-Congo Family

The Niger-Congo Family features the many languages of Africa south of the Sahara. The family originated in West Africa. Migrations took the languages to eastern and southern Africa. There are over 900 languages in this family in nine branches.

Africa's borders reflect colonial history rather than linguistic boundaries. For this reason, many of these languages are spoken across national frontiers.

The languages of this family include the west African languages of Fulani (Nigeria, Cameroon, Mali, Guinea, Gambia, Senegal, Mauritania, Niger, Burkina Faso), Malinke (Senegal, Gambia, Guinea, Mali, Ivory Coast), Mende (Sierra Leone), Twi (Ghana), Ewe (Ghana, Togo), Mossi (Burkina Faso), Yoruba (Nigeria), Ibo (Nigeria), Kpelle (Liberia), Wolof (Senegal, Gambia) and Fang (Cameroon, Gabon, Guinea).

In east and southern Africa the languages include Swahili (Tanzania, Kenya, Uganda, Rwanda, Burundi, Zaire - the most spoken language in this family), Kikuyu (Kenya), Ganda (Uganda), Ruanda (Rwanda), Rundi (Burundi), Luba (Zaire), Lingala (Zaire, Congo), Kongo (Zaire, Congo, Angola), Bemba (Zaire, Zambia), Nyanja (Malawi, Zambia), Shona (Zimbabwe), Matebele (Zimbabwe), Tswana (Botswana), Sotho (South Africa, Lesotho), Swazi (Swaziland, South Africa), Xhosa (South Africa) and Zulu (South Africa).

The southern languages have tones which are used partially for meaning but mostly for grammar. Banda (Congo) has three tones. Its speakers use three-tone drums to send formulaic messages. Efik has four tones and uses m and n as vowels.

Most of the Niger-Congo languages have prefixes and suffixes to qualify nouns and verbs as well as words that agree with them. Nouns and verbs never exist on their own. Fulani has 18 suffixed noun qualifiers; Ndebele (Botswana) has 16 prefixed noun qualifiers and a large number of words for kinships: U-BABA (my father), U-YIHLO (your father), U-YISE (his father).

Shona has over 200 words for walking: MBWEMBWER (walk with buttocks shaking), CHAKWAIR (walk squelchily through mud), DONZV (walk with a stick), PANH (walk a long way), RAUK (walk with long steps). Fulani nouns have initial consonants that vary with grammatical meaning: JESO (face), GESE (faces), NGESA (big face).

The languages of the Bantu Branch count in fives. The word for six, for example, is a compound of five and one.
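
As a rough illustration of counting in fives, the sketch below describes the numbers from one to nine as compounds built on five, so that six comes out as five and one. The function name and the English glosses are assumptions made purely for illustration; they are not real words from any Bantu language.

```python
def quinary_gloss(n):
    """Describe n (1 to 9) as a compound built on five, using English glosses."""
    units = ["", "one", "two", "three", "four"]
    if not 1 <= n <= 9:
        raise ValueError("this sketch only covers 1 to 9")
    if n < 5:
        return units[n]
    remainder = n - 5
    # Five stands alone; six to nine are compounds of five and a smaller number.
    return "five" if remainder == 0 else "five and " + units[remainder]


for n in (4, 5, 6, 9):
    print(n, "=", quinary_gloss(n))
```

The same pattern could be extended past nine by treating ten as a new base word, but the text above only describes the compounds built on five.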

Xhosa has 15 click consonants borrowed from the Khoisan Languages of southern Africa.


Other Language Families

There are over 100 language families in the world.

The Nilo-Saharan Family includes languages of North East Africa like Nubian of southern Egypt and Sudan, Dinka of southern Sudan, Masai of Kenya and Tanzania, and Songhai from the Niger River of West Africa. Originally spoken in the mountains of Ethiopia, this language family has remained close to its place of origin for 10,000 years.

* * * * *

In southern Africa there is a small group of languages called the Khoisan Family. Two of its languages are Hottentot and Bushmen, spoken in Namibia and South Africa. These contain clicking consonants borrowed by neighbouring languages. This language family once covered most of central and southern Africa until displaced by migrations of Niger-Congo speakers.

* * * * *

The Eskimo-Aleut Family is spread across Siberia and Alaska (including the Aleutian Islands). The major language is Inuit (the Eskimo language). These languages are ergative. They also have the property of Incorporation where a verb can form a compound with one or more nouns allowing a complex sentence to be expressed as one single compound word.

* * * * *

The Algonkian Family of languages is found in North America and includes Ojibwa, Cree, Blackfoot, Micmac, Cheyenne, and Delaware.

* * * * *

Another North American group is the Athapascan Family which includes Navajo and Apache.

* * * * *

Again in North America there is the Iroquoian Family. Cherokee and Mohawk are examples.

Mohawk marks the subject on the verb by gender so that word order is very free. This is similar to the languages of the Bantu Branch of the Niger-Congo languages in Africa.

* * * * *

Along the Pacific coast of North America is the unusual Mosan Family. The languages include Bella-Coola (a language with several words that lack vowels), Flathead and Okanagan.

These languages have word roots which can be either verbs or nouns. TS'AX can mean a spear or to spear. 'INMA means to suck milk or breast. 'ATH is either night or to become dark. Only the context distinguishes the correct meaning.

Some linguists divide these languages into three families.

* * * * *

Covering North and Central America is the Uto-Aztecan Family, with languages like Hopi and Comanche from the USA and Tarahumara, spoken by the cave-dwelling people of the Copper Canyon in Mexico.

The most important language in this family is Nahuatl, the language of the Aztecs. The TL consonant is typical of the language. Nahuatl counts in fives.

* * * * *

Central Mexico is home to the Oto-Manguean Family which includes the languages Otomi, Mixtec and Zapotec. Chiquihuitlan Mazatec is a tonal language. The word CA can mean I talk, difficult, his hand or he talks depending on the value of one of four tones.

This family's 150 languages are divided into 7 branches.

* * * * *

The Mayan Family of languages is spoken by the descendants of the Mayas in southern Mexico and Guatemala (Quiche, Mam, Tzotzil, Cakchiquel, Yucatec). There are about 30 languages divided into 8 branches. They date from around 800 BC.

* * * * *

A smaller group in Central America is the Macro-Chibchan Family. This includes Miskito (Honduran and Nicaraguan Caribbean coast) and Cuna (Panama).

* * * * *

The Carib Family is found scattered in the rain forests and coasts of northern South America. The languages include Carib (once spoken in the Caribbean islands), Ge, Panoan and Chiquito.

Hixkaryana (spoken by 350 people in the Brazilian rain forest) has the rare word order of Object-Verb-Subject. This word order is unknown outside of South America.

* * * * *

The Andean-Equatorial Family covers large areas of South America. It includes Quechua (the language of the Incas in Peru and Ecuador), Aymara (Bolivia), Guarani (Paraguay), Tupi (Brazil) and Arawak (Caribbean coasts).

* * * * *

Many of the 700 or so languages of Papua New Guinea are still being studied and will probably be classified into six or seven major families (Torricelli, West Papuan, Sepik-Ramu) as well as a number of small families and unrelated languages. Most of the Papuan languages are spoken by a few thousand people and are little known. They include Enga, Motu, Maisin and Orokolo.

One common feature shared by several languages is the Dual Pronoun. This has a different form for we two and the normal we. This form also occurs with you and you two.

Kiwai has one of the most complex verb structures known. Suffixes and prefixes can make a verb into a complete sentence. For example, the verb ODI means string a bow.
RI-MI-BI-DU-MO-I-ODI-AI-AMA-RI-GO means in the remote future, they three will definitely string two bows at a time.

Rotokas has the fewest sounds of any language, 11 (compared to the 44 of standard English). These 11 are made up of 5 vowels and 6 consonants: A, E, I, O, U, B, G, K, P, R, T.

Some linguists consider the languages of the Andaman Islands and of Tasmania to be related to various Papuan languages.

* * * * *

Australia's 250 or so native languages have been tentatively classified into over 23 families. The north of the continent has the most variety with 22 families like Bunaban, Ngaran and Yiwaidjan. These complex languages have a large number of suffixes and prefixes changing the shade of meaning in subtle ways. Kunwinyku has prefixes for masculine, feminine, plant and other types.

The Pama-Nyungan languages of central and southern Australia are the most studied. They often have multiple pronouns. For example, there are four forms of the pronoun we: YUNMI (we two, you and I), MINTUPALA (we two, he and I), MIPALA (we all including you), MELABAT (we all excluding you). Jiwarli has three words for carrying depending on whether an object is carried in the hand, on the head or on the back.
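
The four forms of we listed above can be read as a small table with two choices: whether the group is two people or more, and whether the listener is included. The sketch below encodes exactly that table. The function name, the parameter names and the labels dual, plural, inclusive and exclusive are assumptions added for illustration; the forms and their meanings are those given in the text.

```python
# Forms and glosses exactly as listed in the text above.
WE_FORMS = {
    ("dual", "inclusive"): "YUNMI",       # we two, you and I
    ("dual", "exclusive"): "MINTUPALA",   # we two, he and I
    ("plural", "inclusive"): "MIPALA",    # we all including you
    ("plural", "exclusive"): "MELABAT",   # we all excluding you
}


def we_pronoun(group_size, includes_listener):
    """Pick the form of 'we' for a group of the given size."""
    if group_size < 2:
        raise ValueError("'we' needs at least two people")
    number = "dual" if group_size == 2 else "plural"
    clusivity = "inclusive" if includes_listener else "exclusive"
    return WE_FORMS[(number, clusivity)]


print(we_pronoun(2, True))    # speaker and listener
print(we_pronoun(5, False))   # speaker and four others, listener left out
```

English collapses all four cells of this table into the single word we, which is why each form needs a phrase to translate it.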

One interesting feature is the use of different vocabulary for communicating with different kin members. Adnyamathanha has ten sets of pronouns for use with various relatives. Panyjima-speaking men use a different respectful vocabulary when talking to men who have initiated them into adulthood. Dyirbal has two such forms of speech, and virtually every word differs between them. Most languages have no counting words apart from one, two and many.

These languages have a long oral tradition that goes back 10,000 years. They tell stories of the period when land joined Australia to nearby islands, of extinct animals, and of contact with Europeans (including the massacres of their ancestors).

* * * * *

Some languages are unrelated to any other known language. These are called Independent Languages or Language Isolates. These include Ainu, spoken in isolated areas in Japan and now almost extinct. Basque is an ergative language spoken in the Pyrenees between France and Spain. The language counts in 20s, a property borrowed by French numbers above 60. Porome is spoken by fewer than 1000 people in Papua New Guinea and is without a writing system. Burushaski is another unwritten language spoken in a few valleys in northern Pakistani Kashmir.
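
The counting in 20s mentioned above for Basque and for the higher French numbers can be sketched as a split into a count of twenties plus a remainder. The function name is an assumption made for illustration. French quatre-vingts (80) is literally four twenties, and quatre-vingt-dix (90) is four twenties and ten.

```python
def in_twenties(n):
    """Split n into a count of twenties and a remainder below twenty."""
    if n < 0:
        raise ValueError("n must be non-negative")
    return divmod(n, 20)


for n in (60, 70, 80, 90):
    twenties, rest = in_twenties(n)
    print(f"{n} = {twenties} x 20 + {rest}")
```

This shows only the arithmetic pattern; the actual number words of each language are not modelled here.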

* * * * *

The total number of languages in the world is estimated to be around 6000. Mexico has 52. The old USSR had 100. Nigeria has over 400. Papua New Guinea has over 700, virtually a different one in each valley. India has over 800 languages in several families (Indo-European, Dravidian, Sino-Tibetan, Austro-Asiatic).

Unfortunately, with the onset of mass communications (rapid flights, radio, television, telephone, the internet), many smaller languages are in danger of extinction. With the passing of each one, a unique cultural way of looking at the world disappears.

If this essay encourages the recording or saving of endangered languages, it will have been useful.

© 1997, 2001 Kryss Katsiavriades

