1. SemEval-2022 Task 2: Multilingual Idiomaticity Detection and Sentence Embedding ...
2. Improving Tokenisation by Alternative Treatment of Spaces ...
4. Investigating alignment interpretability for low-resource NMT
In: ISSN: 0922-6567 ; EISSN: 1573-0573 ; Machine Translation ; https://hal.archives-ouvertes.fr/hal-03139744 ; Machine Translation, Springer Verlag, 2021, ⟨10.1007/s10590-020-09254-w⟩ (2021)
5. AStitchInLanguageModels: Dataset and Methods for the Exploration of Idiomaticity in Pre-Trained Language Models ...
6. Unsupervised Word Segmentation from Discrete Speech Units in Low-Resource Settings ...
7. Assessing the Representations of Idiomaticity in Vector Models with a Noun Compound Dataset Labeled at Type and Token Levels ...
8. The Role of negative information when learning dense word vectors ; O papel da informação negativa na aprendizagem de vetores palavra densos
9. Investigating Language Impact in Bilingual Approaches for Computational Language Documentation
In: Proceedings of the 1st Joint SLTU and CCURL Workshop (SLTU-CCURL 2020) ; SLTU-CCURL workshop, LREC 2020 ; https://hal.archives-ouvertes.fr/hal-02895907 ; SLTU-CCURL workshop, LREC 2020, May 2020, Marseille, France (2020)
10. Annotated corpora and tools of the PARSEME Shared Task on Semi-Supervised Identification of Verbal Multiword Expressions (edition 1.2)
11. Investigating Language Impact in Bilingual Approaches for Computational Language Documentation ...
12. Empirical Evaluation of Sequence-to-Sequence Models for Word Discovery in Low-resource Settings
In: Interspeech 2019 ; https://hal.archives-ouvertes.fr/hal-02193867 ; Interspeech 2019, Sep 2019, Graz, Austria (2019)
13. Unsupervised Compositionality Prediction of Nominal Compounds
In: ISSN: 0891-2017 ; EISSN: 1530-9312 ; Computational Linguistics ; https://hal.archives-ouvertes.fr/hal-02318196 ; Computational Linguistics, Massachusetts Institute of Technology Press (MIT Press), 2019, 45 (1), pp.1-57. ⟨10.1162/coli_a_00341⟩ (2019)
14. How Does Language Influence Documentation Workflow? Unsupervised Word Discovery Using Translations in Multiple Languages
In: Journées Scientifiques du Groupement de Recherche: Linguistique Informatique, Formelle et de Terrain (LIFT). ; https://hal.archives-ouvertes.fr/hal-02895895 ; Journées Scientifiques du Groupement de Recherche: Linguistique Informatique, Formelle et de Terrain (LIFT)., Nov 2019, Orléans, France (2019)
15. How Does Language Influence Documentation Workflow? Unsupervised Word Discovery Using Translations in Multiple Languages ...
16. CogniVal: A Framework for Cognitive Word Embedding Evaluation
In: Proceedings of the 23rd Conference on Computational Natural Language Learning (CoNLL) (2019)
17. Unsupervised Compositionality Prediction of Nominal Compounds
18. A small Griko-Italian speech translation corpus
In: 6th international workshop on spoken language technologies for under-resourced languages (SLTU'18) ; https://hal.archives-ouvertes.fr/hal-01962528 ; 6th international workshop on spoken language technologies for under-resourced languages (SLTU'18), Aug 2018, New Delhi, India (2018)
Abstract:
This paper presents an extension to a very low-resource parallel corpus collected in an endangered language, Griko, making it useful for computational research. The corpus consists of 330 utterances (about 20 minutes of speech) which have been transcribed and translated into Italian, with annotations for word-level speech-to-transcription and speech-to-translation alignments. The corpus also includes morphosyntactic tags and word-level glosses. Pseudo-phones were also generated by applying an automatic unit discovery method. We detail how the corpus was collected, cleaned and processed, and we illustrate its use on zero-resource tasks by presenting some baseline results for the tasks of speech-to-translation alignment and unsupervised word discovery. The dataset is available online, aiming to encourage replicability and diversity in computational language documentation experiments.
Keyword:
[INFO.INFO-CL]Computer Science [cs]/Computation and Language [cs.CL]
URL: https://hal.archives-ouvertes.fr/hal-01962528 https://hal.archives-ouvertes.fr/hal-01962528/document https://hal.archives-ouvertes.fr/hal-01962528/file/sltu2018.pdf
19. Unsupervised Word Segmentation from Speech with Attention
In: Interspeech 2018 ; https://hal.archives-ouvertes.fr/hal-01818092 ; Interspeech 2018, Sep 2018, Hyderabad, India (2018)
20. Language, Cognition, and Computational Models
In: https://hal.archives-ouvertes.fr/hal-01722351 ; Cambridge University Press, 2018 ; https://www.cambridge.org/core/books/language-cognition-and-computational-models/90CC7DBA6CADB1FE361266D311CB4413 (2018)