1 |
TopiOCQA: Open-domain Conversational Question Answering with Topic Switching
In: Transactions of the Association for Computational Linguistics, Vol 10, Pp 468-483 (2022)
BASE
2 |
PADA: Example-based Prompt Learning for on-the-fly Adaptation to Unseen Domains
In: Transactions of the Association for Computational Linguistics, Vol 10, Pp 414-433 (2022)
BASE
3 |
VILA: Improving Structured Content Extraction from Scientific PDFs Using Visual Layout Groups
In: Transactions of the Association for Computational Linguistics, Vol 10, Pp 376-392 (2022)
BASE
4 |
LOT: A Story-Centric Benchmark for Evaluating Chinese Long Text Understanding and Generation
In: Transactions of the Association for Computational Linguistics, Vol 10, Pp 434-451 (2022)
Abstract:
Standard multi-task benchmarks are essential for developing pretraining models that can generalize to various downstream tasks. Existing benchmarks for natural language processing (NLP) usually focus only on understanding or generating short texts. However, long text modeling requires many distinct abilities in contrast to short texts, such as the modeling of long-range discourse and commonsense relations, and the coherence and controllability of generation. The lack of standardized benchmarks makes it difficult to assess these abilities of a model and fairly compare different models, especially Chinese models. Therefore, we propose a story-centric benchmark named LOT for evaluating Chinese long text modeling, which aggregates two understanding tasks and two generation tasks. We construct new datasets for these tasks based on human-written Chinese stories with hundreds of words. Furthermore, we release an encoder-decoder-based Chinese long text pretraining model named LongLM with up to 1 billion parameters. We pretrain LongLM on 120G Chinese novels with two generative tasks including text infilling and conditional continuation. Extensive experiments show that LongLM outperforms similar-sized pretraining models substantially on both the understanding and generation tasks in LOT.
Keyword:
Computational linguistics. Natural language processing; P98-98.5
URL: https://doi.org/10.1162/tacl_a_00469 https://doaj.org/article/7bbfb2e6c1604607b4dd50538f6a9550
BASE
5 |
Data-driven Model Generalizability in Crosslinguistic Low-resource Morphological Segmentation
In: Transactions of the Association for Computational Linguistics, Vol 10, Pp 393-413 (2022)
BASE
6 |
Time-Aware Language Models as Temporal Knowledge Bases
In: Transactions of the Association for Computational Linguistics, Vol 10, Pp 257-273 (2022)
BASE
7 |
Neuro-symbolic Natural Logic with Introspective Revision for Natural Language Inference
In: Transactions of the Association for Computational Linguistics, Vol 10, Pp 240-256 (2022)
BASE
8 |
Les débuts de la phraséologie et les premières « phraséologies historiques » italo-françaises
In: Linguistik Online, Vol 113, Iss 1 (2022)
BASE
9 |
Formen und Funktionen des Konjunktivs II in historischen ostoberdeutschen Predigten.
In: Linguistik Online, Vol 114, Iss 2 (2022)
BASE
10 |
Zur Sprachdynamik des Konjunktivs im Bairischen in Österreich
In: Linguistik Online, Vol 114, Iss 2 (2022)
BASE
11 |
Die Konjunktiv-II-Bildung im Kontext von Partikelverben in den Basisdialekten Salzburgs
In: Linguistik Online, Vol 114, Iss 2 (2022)
BASE
12 |
Evaluating Explanations: How Much Do Explanations from the Teacher Aid Students?
In: Transactions of the Association for Computational Linguistics, Vol 10, Pp 359-375 (2022)
BASE
13 |
Informationen zu den Beitragenden/Information about the authors
In: Linguistik Online, Vol 113, Iss 1 (2022)
BASE
14 |
A Coordenação na Gramática Discursivo-Funcional
In: Linguistik Online, Vol 113, Iss 1 (2022)
BASE
15 |
Der Konjunktiv II in den ruralen Basisdialekten Österreichs.
In: Linguistik Online, Vol 114, Iss 2 (2022)
BASE
16 |
Konjunktiv II-Variation im urbanen Sprachgebrauch in Österreich
In: Linguistik Online, Vol 114, Iss 2 (2022)
BASE
17 |
Der Konjunktiv II in Salzburger Varietäten: Grammatik, Gebrauch, soziale Faktoren
In: Linguistik Online, Vol 114, Iss 2 (2022)
BASE
18 |
Informationen zu den Beitragenden/Information about the authors
In: Linguistik Online, Vol 115, Iss 3 (2022)
BASE
19 |
„Frische Brot, lecker Brot, taze Brot“ – Eigene Muster in der Adjektivverwendung auf einem mehrsprachigen Wochenmarkt
In: Linguistik Online, Vol 115, Iss 3 (2022)
BASE
20 |
Ein Modell zur systematischen Erfassung genuenischer Phraseme in Wörterbüchern, illustriert am Beispiel der Forschungsprojekte GEPHRAS und GEPHRAS2
In: Linguistik Online, Vol 115, Iss 3 (2022)
BASE