Technical Implications of Multilingual Corpus Lexicography
Abstract:
Two main areas are covered: the problem of space constraints and the possible machine implementation of the methodology outlined in the other papers in this collection. The notion that electronic dictionaries will remove the space constraints inherent in producing books fails to take account of the vast amount of corpus data required for a thorough analysis of all aspects of a natural language. At the same time, working with a very large corpus makes the lexicographer increasingly dependent on software tools to interrogate the corpus. In time it may be advantageous to hold data in processed form, but at present raw data remains too valuable. The possibility that the methodology proposed by the project could be implemented by machine is discussed, and IBM work on parallel corpora is cited as an example of how a similar project could succeed. The importance of colligational information in addition to simple collocational information is stressed, although the problems inherent in establishing semantic categories are acknowledged. An approach currently being adopted at COBUILD, together with other dictionary-based work, is seen as potentially fruitful.
Keyword:
Articles
URL: http://ijl.oxfordjournals.org/cgi/content/short/9/3/265 https://doi.org/10.1093/ijl/9.3.265