Multilinguality
The standard distribution includes tools for many languages, including Arabic, Chinese, French, German and Italian. All textual data is stored internally in Unicode, and various other encodings are supported for input and output.
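
The idea of keeping text in Unicode internally and converting only at the input/output boundary can be sketched in a few lines of Python. This is an illustration of the general pattern only, not the toolkit's own code; the helper names read_text and write_text are hypothetical.

```python
# Sketch: decode on input, work in Unicode, encode on output.
# read_text and write_text are hypothetical helpers for illustration.

def read_text(raw: bytes, encoding: str) -> str:
    """Decode bytes in a known input encoding into a Unicode string."""
    return raw.decode(encoding)


def write_text(text: str, encoding: str) -> bytes:
    """Encode a Unicode string into the requested output encoding."""
    return text.encode(encoding)


# A Latin-1 encoded input document.
latin1_input = "café".encode("latin-1")

# Internally the text is plain Unicode, whatever the source encoding was.
internal = read_text(latin1_input, "latin-1")

# The same text can be written back out in a different encoding.
utf8_output = write_text(internal, "utf-8")
```

Because every processing step sees the same Unicode representation, the choice of encoding only matters when reading or writing files.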

Robust
Reference implementations are provided for many basic HLT tools, such as tokenisation, part-of-speech tagging and finite-state information extraction.
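
To give a sense of what the simplest of these tools does, here is a minimal regular-expression tokeniser in Python. It is a sketch under stated assumptions (words are runs of word characters, every other non-space character is its own token) and is not the reference implementation from the distribution; the function name tokenise is hypothetical.

```python
import re

# Assumption: a token is either a run of word characters or a single
# non-space, non-word character such as a punctuation mark.
TOKEN_PATTERN = re.compile(r"\w+|[^\w\s]")


def tokenise(text: str) -> list[str]:
    """Split text into word and punctuation tokens (hypothetical helper)."""
    return TOKEN_PATTERN.findall(text)


tokens = tokenise("Hello, world!")
```

A production tokeniser would also need to handle abbreviations, numbers, clitics and languages written without spaces, which this sketch ignores.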