2 |
MassiveSumm: a very large-scale, very multilingual, news summarisation dataset ...
|
|
|
|
Anthology paper link: https://aclanthology.org/2021.emnlp-main.797/
Abstract: Current research in automatic summarisation is unapologetically anglo-centred - a persistent state-of-affairs, which also predates neural net approaches. High-quality automatic summarisation datasets are notoriously expensive to create, posing a challenge for any language. However, with digitalisation, archiving, and social media advertising of newswire articles, recent work has shown how, with careful methodology application, large-scale datasets can now be simply gathered instead of written. In this paper, we present a large-scale multi-lingual summarisation dataset containing articles in 92 languages, spread across 28.8 million articles, in more than 35 writing scripts. This is both the largest, most inclusive, existing automatic summarisation dataset, as well as one of the largest, most inclusive, ever published datasets for any NLP task. We present the first investigation on the efficacy of resource building from news ...
|
|
URL: https://dx.doi.org/10.48448/8thm-zg55 https://underline.io/lecture/37700-massivesumm-a-very-large-scale,-very-multilingual,-news-summarisation-dataset
|
|
BASE
4 |
IAPUCP at SemEval-2021 task 1: Stacking fine-tuned transformers is almost all you need for lexical complexity prediction
|
|
|
|
6 |
Predicting Declension Class from Form and Meaning
|
|
|
|
In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics (2020)
|
|
7 |
The Paradigm Discovery Problem
|
|
|
|
In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics (2020)
|
|
8 |
A Tale of a Probe and a Parser
|
|
|
|
In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics (2020)
|
|
9 |
A Corpus for Large-Scale Phonetic Typology
|
|
|
|
In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics (2020)
|
|
10 |
Information-Theoretic Probing for Linguistic Structure
|
|
|
|
In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics (2020)
|
|
11 |
It’s Easier to Translate out of English than into it: Measuring Neural Translation Difficulty by Cross-Mutual Information
|
|
|
|
In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics (2020)
|
|
12 |
ASSET: A dataset for tuning and evaluation of sentence simplification models with multiple rewriting transformations
|
|
|
|
13 |
Non-linear instance-based cross-lingual mapping for non-isomorphic embedding spaces
|
|
|
|
14 |
Classification-based self-learning for weakly supervised bilingual lexicon induction
|
|
|
|
15 |
On the limitations of cross-lingual encoders as exposed by reference-free machine translation evaluation
|
|
|
|
17 |
Multilingual Projection for Parsing Truly Low-Resource Languages
|
|
|
|
In: Transactions of the Association for Computational Linguistics, The MIT Press (2016) ; EISSN: 2307-387X ; https://hal.inria.fr/hal-01426754
|
|
18 |
Treebank-Based Deep Grammar Acquisition for French Probabilistic Parsing Resources
|
|
Schluter, Natalie. Dublin City University, National Centre for Language Technology (NCLT); Dublin City University, School of Computing, 2011
|
|
In: Schluter, Natalie (2011) Treebank-Based Deep Grammar Acquisition for French Probabilistic Parsing Resources. PhD thesis, Dublin City University. (2011)
|
|
19 |
Dependency parsing resources for French: Converting acquired lexical functional grammar F-Structure annotations and parsing F-Structures directly
|
|
|
|
In: Schluter, Natalie and van Genabith, Josef (ORCID: 0000-0003-1322-7944) (2009) Dependency parsing resources for French: Converting acquired lexical functional grammar F-Structure annotations and parsing F-Structures directly. In: Nodalida 2009 Conference, 14 - 16 May 2009, Odense, Denmark. (2009)
|
|
20 |
Treebank-based acquisition of LFG parsing resources for French
|
|
|
|
In: Schluter, Natalie and van Genabith, Josef (2008) Treebank-based acquisition of LFG parsing resources for French. In: the Sixth International Language Resources and Evaluation Conference (LREC'08), May 28-30, 2008, Marrakech, Morocco. (2008)
|
|