2 |
Compiling terminological data using comparable corpora: from term extraction to dictionaries
In: 34th Annual Conference of the German Linguistic Society (DGfS), Mar 2012, Frankfurt, Germany (2012) ; https://hal.archives-ouvertes.fr/hal-00819590
3 |
Terminology Extraction, Translation Tools and Comparable Corpora: TTC concept, midterm progress and achieved results
In: LREC 2012 Workshop on Creating Cross-language Resources for Disconnected Languages and Styles (CREDISLAS), May 2012, Istanbul, Turkey. 4 p (2012) ; https://hal.archives-ouvertes.fr/hal-00819909
Abstract:
The TTC project (Terminology Extraction, Translation Tools and Comparable Corpora) has contributed to leveraging computer-assisted translation tools, machine translation systems and multilingual content (corpora and terminology) management tools by generating bilingual terminologies automatically from comparable corpora in seven EU languages, as well as Russian and Chinese. This paper presents the main concept of TTC, discusses the scarcity of parallel corpora and the potential of comparable corpora, and briefly describes the TTC terminology extraction workflow. The workflow includes the collection of domain-specific comparable corpora from the web, extraction of monolingual terminology in the two domains of wind energy and mobile technology, and bilingual alignment of the extracted terminology. We also present TTC usage scenarios and the way in which the project deals with under-resourced and disconnected languages, and report on the midterm progress and results achieved during the first two years of the project. Finally, we touch upon the problem of under-resourced languages (for example, Latvian) and disconnected languages (for example, Latvian and Russian) covered by the project.
Keyword:
Computer Science [cs] / Computation and Language [cs.CL]; comparable corpora; computer-assisted translation; disconnected languages; language resources; machine translation; terminology extraction; under-resourced languages
URL: https://hal.archives-ouvertes.fr/hal-00819909/file/TTC_LREC_CREDISLAS_2012.pdf ; https://hal.archives-ouvertes.fr/hal-00819909/document ; https://hal.archives-ouvertes.fr/hal-00819909
4 |
Reference Lists for the Evaluation of Term Extraction Tools
In: Proceedings of the 10th Terminology and Knowledge Engineering Conference (TKE'12), Jun 2012, Madrid, Spain. http://www.oeg-upm.net/tke2012/proceedings (2012) ; https://hal.archives-ouvertes.fr/hal-00816566
5 |
Identifying and Grouping Variants of Technical Terms on the Basis of Text Corpora
In: 33rd Annual Conference of the German Linguistic Society (DGfS), Feb 2011, Göttingen, Germany (2011) ; https://hal.archives-ouvertes.fr/hal-00818647
6 |
Simple methods for dealing with term variation and term alignment
In: 9th International Conference on Terminology and Artificial Intelligence (TIA 2011), Nov 2011, Paris, France. pp.87-93 (2011) ; https://hal.archives-ouvertes.fr/hal-00819376