1 |
How much context span is enough? Examining context-related issues for document-level MT
|
|
|
|
In: Castilho, Sheila orcid:0000-0002-8416-6555 (2022) How much context span is enough? Examining context-related issues for document-level MT. In: 13th Language Resources and Evaluation Conference, 21-23 June 2022, Marseille, France. (In Press) (2022)
|
|
BASE
|
|
2 |
An investigation of English-Irish machine translation and associated resources
|
|
Dowling, Meghan. Dublin City University, School of Computing, 2022; Dublin City University, ADAPT, 2022
|
|
In: Dowling, Meghan orcid:0000-0003-1637-4923 (2022) An investigation of English-Irish machine translation and associated resources. PhD thesis, Dublin City University. (2022)
|
|
Abstract:
Because Irish is an official language in both Ireland and the European Union (EU), there is a high demand for English-Irish (EN-GA) translation in public administration. The difficulty that translators currently face in meeting this demand leads to the need for reliable domain-specific user-driven EN-GA machine translation (MT). This landscape provides a timely opportunity to address some research questions surrounding MT for the EN-GA language pair. To this end, we assess the corpora available for training data-driven MT systems, including publicly available data, data collected through EU-supported data collection efforts and web-crawling, showing that, though Irish is a low-resource language, it is possible to increase the corpora available through concerted data collection efforts. We investigate how increased corpora affect domain-specific (public administration) statistical MT (SMT) and neural MT (NMT) systems using automatic metrics. The effect that different SMT and NMT parameters have on these automatic values is also explored, using sentence-level metrics to identify specific areas where output differs greatly between MT systems and providing a linguistic analysis of each. With EN-GA SMT and NMT automatic evaluation scores showing inconclusive results, we investigate the usefulness of EN-GA hybrid MT through the use of monolingual data as a source of artificial data creation via backtranslation. We evaluate these results using automatic metrics and linguistic analysis. Although results indicate that the addition of artificial data did not have a positive impact on EN-GA MT, repeated experiments involving Scottish Gaelic show that the method holds promise, given suitable conditions. Finally, given that the intended use-case of EN-GA MT is in the workflow of a professional translator, we conduct an in-depth human evaluation study for EN-GA SMT and NMT, providing a human-derived assessment of EN-GA MT quality and comparison of EN-GA SMT and NMT.
We include a survey of translator opinions and recommendations surrounding EN-GA SMT and NMT as well as an analysis of data gathered through the post-editing of MT output. We compare these results to those generated automatically and provide recommendations for future work on EN-GA MT, in particular with regard to its use in a professional translation workflow within public administration.
|
|
Keyword:
Artificial intelligence; Computational linguistics; Linguistics; Machine learning; Machine translating; Translating and interpreting
|
|
URL: http://doras.dcu.ie/26574/
|
|
BASE
|
|
3 |
One model for the learning of language.
|
|
|
|
In: Proceedings of the National Academy of Sciences of the United States of America, vol 119, iss 5 (2022)
|
|
BASE
|
|
4 |
Computational Measures of Deceptive Language: Prospects and Issues
|
|
|
|
In: ISSN: 2297-900X ; EISSN: 2297-900X ; Frontiers in Communication ; https://hal.archives-ouvertes.fr/hal-03629780 ; Frontiers in Communication, Frontiers, 2022, 7, pp.792378. ⟨10.3389/fcomm.2022.792378⟩ (2022)
|
|
BASE
|
|
5 |
Animal linguistics in the making: the Urgency Principle and titi monkeys’ alarm system
|
|
|
|
In: ISSN: 0394-9370 ; Ethology Ecology and Evolution ; https://hal.inrae.fr/hal-03518874 ; Ethology Ecology and Evolution, Taylor & Francis, 2022, pp.1-17. ⟨10.1080/03949370.2021.2015452⟩ (2022)
|
|
BASE
|
|
6 |
Finding the best way to put media bias research into practice via an annotation app ...
|
|
|
|
BASE
|
|
7 |
Labour market discrimination and biases in human judgement and Artificial Intelligence ...
|
|
|
|
BASE
|
|
8 |
Movies with imaginary worlds cluster together because of exploration-related terms in plot summaries ...
|
|
|
|
BASE
|
|
9 |
Finding the best way to put media bias research into practice through an annotation app ...
|
|
|
|
BASE
|
|
12 |
Maschinelle Übersetzung (MT) für den Notfall : Ratgeber zum Einsatz von MT Tools für die Kommunikation mit Flüchtlingen aus der Ukraine ...
|
|
|
|
BASE
|
|
13 |
Neural machine translation and language teaching : possible implications for the CEFR ...
|
|
|
|
BASE
|
|
14 |
From bag-of-words towards natural language: adapting topic models to avoid stop word removal ...
|
|
|
|
BASE
|
|
15 |
Linked Open Tafsir - Rekonstruktion der Entstehungsdynamik(en) des Korans mithilfe der Netzwerkmodellierung früher islamischer Überlieferungen ...
|
|
|
|
BASE
|
|
16 |
DiaCollo für GEI-Digital - Ein experimentelles Projekt zur weiteren Erschließung digitalisierter historischer Schulbuchbestände ...
|
|
|
|
BASE
|
|
17 |
DiaCollo für GEI-Digital - Ein experimentelles Projekt zur weiteren Erschließung digitalisierter historischer Schulbuchbestände ...
|
|
|
|
BASE
|
|
18 |
DiaCollo für GEI-Digital - Ein experimentelles Projekt zur weiteren Erschließung digitalisierter historischer Schulbuchbestände ...
|
|
|
|
BASE
|
|
19 |
Linked Open Tafsir - Rekonstruktion der Entstehungsdynamik(en) des Korans mithilfe der Netzwerkmodellierung früher islamischer Überlieferungen ...
|
|
|
|
BASE
|
|
20 |
Eine agentenbasierte Architektur für Programmierung mit gesprochener Sprache ...
|
|
|
|
BASE
|
|