Project

Terminology Extraction for Semantic Interoperability and Standardization.

Acronym

TExSIS

Code

179J5611

Duration

01 January 2011 → 31 December 2012

Funding

Regional and community funding: IWT/VLAIO

Promotor

Filip De Turck

Research disciplines

Natural sciences
- Applied mathematics in specific fields

Keywords

terminology extraction system, Translation Studies, Language technology

Project description

The TExSIS project aims to build a fully automated terminology extraction system that extracts terms on the fly from monolingual and multilingual documents. The architecture is designed to be largely language independent and will be further elaborated during the project for Dutch, French, German and English. TExSIS is offered in a client-server architecture, supports different input and output formats, and can run either fully automatically or semi-automatically. The project will deliver an open source prototype terminology extractor that software developers can further customize and deploy for end users. It is expected to lead to a reduction in the implementation cost of machine translation, a reduction in manual correction work on machine translation output, the automatic production of monolingual and multilingual dictionaries, automated consistency checks, the automatic production of thesauri for existing archives, the automatic addition of metadata to documents, and faster searching of large databases and archives. This practical utility is demonstrated in the project in two use cases covering a wide range of users: machine translation and information retrieval.