PISIT'S THAI NATURAL LANGUAGE PROCESSING LABORATORY
This lab was formed on August 26, 1998.
e-mail: pisitp@yahoo.com
For C7 members, please check this C7 address list.
KEYWORDS
Thai Natural Language Processing Lab, word segmentation,
dictionaries, algorithms, Thai text-to-speech.
PROJECTS
- Thai text-to-speech system. This project comprises several pieces of work, including Thai word segmentation, sentence intonation, a phoneme inventory, dictionary-based pronunciation, morphological derivation for proper nouns and unknown words, a disambiguation module, and volume, speed, and pitch control.
Hear me! or Thai speech: ฟังผม ..!
To try TTS systems for other languages, such as the Bell Labs TTS or Laureate at British Telecom, please click the links.
- Thai word segmentation algorithm. This project is ongoing, with R&D work on improving the algorithm.
- Thai word segmentation algorithm V1.0. This version uses a sorted linked list as its data structure. The dictionary holds 3,963 high-frequency words. The algorithm performance is O(n).
- Thai word segmentation algorithm V2.0. This version uses a trie as its data structure. The dictionary holds 17,859 general words. The algorithm performance is O(log n).
- Thai word segmentation algorithm V2.1. This version is a callable Thai word segmentation routine. It comes with new test text and also corrects the two-digit word problem in V2.0.
- Thai word segmentation algorithm V2.2. This version improves on the callable Thai word segmentation routine of V2.1. It comes with a new, improved dictionary (19,246 words), improved memory management, and much better performance than V2.1 (the result of improved memory management and of correcting the invalid two-prefix match problem in V2.1).
- Thai electronic dictionary. This project constructs various types of dictionaries and develops data structures to store them.
- Thai Corpus. This project builds an online database of raw Thai data. The data will be used for experiments, testing, and evaluation in other PsTNLP projects.
- [Thai Syllable Parsing Algorithm] and [Approximation Matching Algorithm for Thai] are available for download and testing. Please don't hesitate to email me your comments on them.
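
For readers curious how the dictionary-based word segmentation versions listed above might work, here is a minimal sketch of trie-backed segmentation. It is an illustration only, not the lab's actual code: the greedy longest-match strategy, the three-word toy dictionary, and the names TrieNode, build_trie, longest_match and segment are all assumptions made for this example.

```python
class TrieNode:
    """One node of a character trie (hypothetical helper for this sketch)."""

    def __init__(self):
        self.children = {}    # maps a single character to the next node
        self.is_word = False  # True if the path from the root spells a dictionary word


def build_trie(words):
    """Insert every dictionary word into a fresh trie and return its root."""
    root = TrieNode()
    for word in words:
        node = root
        for ch in word:
            node = node.children.setdefault(ch, TrieNode())
        node.is_word = True
    return root


def longest_match(root, text, start):
    """Return the end index of the longest dictionary word beginning at
    `start`, or `start` itself if no dictionary word begins there."""
    node = root
    best_end = start
    for i in range(start, len(text)):
        node = node.children.get(text[i])
        if node is None:
            break
        if node.is_word:
            best_end = i + 1
    return best_end


def segment(root, text):
    """Greedy longest-match segmentation (an assumed strategy).
    Characters that start no dictionary word become one-character tokens."""
    tokens = []
    pos = 0
    while pos < len(text):
        end = longest_match(root, text, pos)
        if end == pos:
            end = pos + 1  # unknown character: emit it on its own
        tokens.append(text[pos:end])
        pos = end
    return tokens


# Toy dictionary for illustration; the real dictionaries hold thousands of words.
toy_dictionary = ["ฟัง", "ผม", "ผมยาว"]
root = build_trie(toy_dictionary)
print(segment(root, "ฟังผม"))  # ['ฟัง', 'ผม']
```

In this sketch each lookup walks at most one trie node per character of the candidate word, so the work per word does not require scanning the whole dictionary the way a linked-list search does. Greedy longest matching can still pick the wrong split on ambiguous text, which is one reason a real system would need a disambiguation step.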
CONTACT
Email me at pisitp@yahoo.com with your comments and/or discussion.