Language Identification Based on the Variations in Intonation Using Discrete Markov-chain Model

Shinjini Ghosh

South Point High School, Kolkata, West Bengal
Speech plays an indispensable role in communication. We speak by using the vocal apparatus in articulatory processes that vary in manner, place of articulation, and intonation or frequency. Apart from the inventory of sounds, we also make extensive use of variations in tone to convey additional meaning. Speaking a language constrains the speaker to the usual tonal transitions of that language or language family.
In this project we use this constraint to study patterns in languages. We develop a discrete Markov-chain model of variations in intonation. Instead of analysing absolute pitch or frequency, we analyse how one tone transitions to the next in speech. We group these sequences of intonation transitions into N-grams, train the model on the N-grams, and compute their probabilities.
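The training step described above can be illustrated with a minimal sketch. The abstract does not specify how pitch is quantised or how probabilities are estimated, so everything here is an assumption made for illustration: the three-symbol alphabet (U for up, D for down, S for steady), the half-semitone threshold, and the hypothetical function names `to_transitions` and `train_ngram_model`.

```python
import math
from collections import Counter, defaultdict


def to_transitions(pitches_hz, threshold_semitones=0.5):
    """Map consecutive pitch values to 'U' (up), 'D' (down) or 'S' (steady).

    Assumption: a change smaller than the threshold counts as steady.
    Only relative movement is kept; absolute pitch is discarded.
    """
    symbols = []
    for prev, curr in zip(pitches_hz, pitches_hz[1:]):
        delta = 12.0 * math.log2(curr / prev)  # interval in semitones
        if delta > threshold_semitones:
            symbols.append("U")
        elif delta < -threshold_semitones:
            symbols.append("D")
        else:
            symbols.append("S")
    return symbols


def train_ngram_model(symbols, order=2):
    """Estimate P(next symbol | previous `order` symbols) by relative frequency."""
    counts = defaultdict(Counter)
    for i in range(order, len(symbols)):
        context = tuple(symbols[i - order:i])
        counts[context][symbols[i]] += 1
    return {
        context: {s: c / sum(nxt.values()) for s, c in nxt.items()}
        for context, nxt in counts.items()
    }


# Toy pitch contour in Hz (made-up values, not data from the study).
contour = [100, 110, 120, 120, 110, 100]
transitions = to_transitions(contour)
model = train_ngram_model(transitions, order=1)
```

Here `order` plays the role of the context size: an order of 1 conditions each transition on the one before it, and higher orders condition on longer histories at the cost of needing more training data.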
We have developed an algorithm that uses the Markov speech model to determine the language spoken in an audio sample. We are analysing the results and the factors that affect its performance, such as training size and model order (context size).
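One plausible way to turn such a model into a language identifier is to score a sample under each language's model and pick the highest-scoring language. The sketch below does this with a log-likelihood score and add-one smoothing. Both choices are assumptions and not necessarily the algorithm used in the study, as are the symbol alphabet and the hypothetical names `train`, `log_likelihood` and `identify`.

```python
import math
from collections import Counter, defaultdict

ALPHABET = ("U", "D", "S")  # assumed symbols: pitch up, down, steady


def train(symbols, order):
    """Count which symbol follows each context of length `order`."""
    counts = defaultdict(Counter)
    for i in range(order, len(symbols)):
        counts[tuple(symbols[i - order:i])][symbols[i]] += 1
    return counts


def log_likelihood(symbols, counts, order, alpha=1.0):
    """Log-probability of a sequence under one model, with additive smoothing.

    Smoothing (an assumption here) keeps unseen transitions from
    driving the probability to zero.
    """
    total = 0.0
    for i in range(order, len(symbols)):
        nxt = counts.get(tuple(symbols[i - order:i]), Counter())
        numerator = nxt[symbols[i]] + alpha
        denominator = sum(nxt.values()) + alpha * len(ALPHABET)
        total += math.log(numerator / denominator)
    return total


def identify(symbols, models, order):
    """Return the language whose model gives the sample the highest score."""
    return max(models, key=lambda lang: log_likelihood(symbols, models[lang], order))


# Toy training sequences for two made-up languages "A" and "B".
order = 1
models = {
    "A": train(list("UDUDUDUDUD"), order),
    "B": train(list("UUUUSSSSDDDD"), order),
}
guess_1 = identify(list("UDUDUD"), models, order)
guess_2 = identify(list("UUSSDD"), models, order)
```

In this form, both factors named above are explicit parameters: training size is the length of the sequences passed to `train`, and `order` is the context size.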
The study indicates that intonation provides key insight into the characteristic identity of a language. The accuracy of identification is expected to depend on the order of the model.