1 |
Automatic Speech Recognition and Query By Example for Creole Languages Documentation
In: Findings of the Association for Computational Linguistics: ACL 2022 ; https://hal.archives-ouvertes.fr/hal-03625303 ; Findings of the Association for Computational Linguistics: ACL 2022, May 2022, Dublin, Ireland (2022)
BASE
2 |
Cross-Situational Learning Towards Robot Grounding
In: https://hal.archives-ouvertes.fr/hal-03628290 ; 2022 (2022)
Abstract:
How do children acquire language through unsupervised or noisy supervision? How do their brains process language? We take this perspective to machine learning and robotics, where part of the problem is understanding how language models can perform grounded language acquisition through noisy supervision and discussing how they can account for brain learning dynamics. Most prior works have tracked the co-occurrence between single words and referents to model how infants learn word-referent mappings. This paper studies cross-situational learning (CSL) with full sentences: we want to understand brain mechanisms that enable children to learn mappings between words and their meanings from full sentences in early language learning. We investigate the CSL task on a few training examples with two sequence-based models: (i) Echo State Networks (ESN) and (ii) Long Short-Term Memory networks (LSTM). Most importantly, we explore several word representations, including One-Hot, GloVe, pretrained BERT, and fine-tuned BERT representations (last-layer token representations), to perform the CSL task. We apply our approach to three diverse datasets (two grounded language datasets and a robotic dataset) and observe that: (1) One-Hot, GloVe, and pretrained BERT representations are less efficient than representations obtained from fine-tuned BERT; (2) ESN online with final learning (FL) yields superior performance over ESN online continual learning (CL), offline learning, and LSTMs, which indicates that ESNs are the more biologically plausible model of the cognitive process of sentence reading; (3) LSTM with fewer hidden units shows higher performance on small datasets, but LSTM with more hidden units is needed to perform reasonably well on larger corpora; and (4) ESNs generalize better than LSTM models to increasingly large vocabularies.
Overall, these models are able to learn from scratch to link complex relations between words and their corresponding meaning concepts, handling polysemous and synonymous words. Moreover, we argue that such models can be extended to support current human-robot interaction studies on language grounding and to better understand children's developmental language acquisition. We make the code publicly available.
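The abstract contrasts its sentence-level models with prior work that tracks co-occurrence between single words and referents. As a rough illustration of that baseline idea only (not the paper's ESN or LSTM method), the sketch below counts word-referent co-occurrences across ambiguous scenes. The function names and the toy `SITUATIONS` data are hypothetical and do not come from the paper or its code.

```python
from collections import Counter, defaultdict


def count_cooccurrences(situations):
    """Count how often each word is heard while each candidate referent is present."""
    counts = defaultdict(Counter)
    for words, referents in situations:
        for word in set(words):
            for referent in set(referents):
                counts[word][referent] += 1
    return counts


def best_referent(counts, word):
    """Return the referent seen most often with `word`, or None if the word is unseen."""
    if word not in counts:
        return None
    return counts[word].most_common(1)[0][0]


# Hypothetical toy data: each scene pairs the words heard with the objects in view.
# No single scene says which word names which object; the mapping only becomes
# clear once counts are accumulated across scenes.
SITUATIONS = [
    (["ball", "dog"], ["BALL", "DOG"]),
    (["ball", "cup"], ["BALL", "CUP"]),
    (["dog", "cup"], ["DOG", "CUP"]),
]

if __name__ == "__main__":
    counts = count_cooccurrences(SITUATIONS)
    for word in ("ball", "dog", "cup"):
        print(word, "->", best_referent(counts, word))
```

Each word here ends up co-occurring twice with its own referent and once with each distractor, so the highest count picks out the intended mapping. A count-based learner of this kind ignores word order and sentence structure, which is the gap the full-sentence setting described in the abstract is meant to address.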
Keyword:
[INFO.INFO-AI]Computer Science [cs]/Artificial Intelligence [cs.AI]; [INFO.INFO-CL]Computer Science [cs]/Computation and Language [cs.CL]; [INFO.INFO-LG]Computer Science [cs]/Machine Learning [cs.LG]; [INFO.INFO-NE]Computer Science [cs]/Neural and Evolutionary Computing [cs.NE]; [INFO.INFO-RB]Computer Science [cs]/Robotics [cs.RO]; [SDV.NEU]Life Sciences [q-bio]/Neurons and Cognition [q-bio.NC]; BERT; cross-situational learning; echo state networks; grounded language; LSTM
URL: https://hal.archives-ouvertes.fr/hal-03628290 https://hal.archives-ouvertes.fr/hal-03628290v2/file/Journal_of_Social_and_Robotics.pdf https://hal.archives-ouvertes.fr/hal-03628290v2/document
4 |
Emergent Communication for Understanding Human Language Evolution: What's Missing? ...
5 |
Multimodal neural networks better explain multivoxel patterns in the hippocampus ...
6 |
End-to-end speaker segmentation for overlap-aware resegmentation
In: Interspeech 2021 ; https://hal-univ-lemans.archives-ouvertes.fr/hal-03257524 ; Interspeech 2021, Aug 2021, Brno, Czech Republic ; https://www.interspeech2021.org/ (2021)
7 |
High-resolution speaker counting in reverberant rooms using CRNN with Ambisonics features
In: EUSIPCO 2020 - 28th European Signal Processing Conference (EUSIPCO) ; https://hal.archives-ouvertes.fr/hal-03537323 ; EUSIPCO 2020 - 28th European Signal Processing Conference (EUSIPCO), Jan 2021, Amsterdam, Netherlands. pp.71-75, ⟨10.23919/Eusipco47968.2020.9287637⟩ (2021)
8 |
Tackling Morphological Analogies Using Deep Learning -- Extended Version
In: https://hal.inria.fr/hal-03425776 ; 2021 (2021)
9 |
Recognizing lexical units in low-resource language contexts with supervised and unsupervised neural networks
In: https://hal.archives-ouvertes.fr/hal-03429051 ; [Research Report] LACITO (UMR 7107). 2021 (2021)
10 |
What does the Canary Say? Low-Dimensional GAN Applied to Birdsong
In: https://hal.inria.fr/hal-03244723 ; 2021 (2021)
12 |
Artificial Text Detection via Examining the Topology of Attention Maps
In: ACL Anthology ; Empirical Methods in Natural Language Processing ; https://hal.archives-ouvertes.fr/hal-03456191 ; Empirical Methods in Natural Language Processing, ACL (Association for Computational Linguistics), Nov 2021, Punta Cana, Dominican Republic (2021)
13 |
Modeling the neural network responsible for song learning ; Modélisation du réseau neuronal responsable de l'apprentissage du chant chez l'oiseau chanteur
In: https://tel.archives-ouvertes.fr/tel-03217834 ; Modeling and Simulation. Université de Bordeaux, 2021. English. ⟨NNT : 2021BORD0107⟩ (2021)
14 |
Multimodal Coarticulation Modeling : Towards the animation of an intelligible talking head ; Modélisation de la coarticulation multimodale : vers l'animation d'une tête parlante intelligible
In: https://hal.univ-lorraine.fr/tel-03203815 ; Artificial Intelligence [cs.AI]. Université de Lorraine, 2021. French. ⟨NNT : 2021LORR0019⟩ (2021)
15 |
Impact of Segmentation and Annotation in French end-to-end Synthesis
In: Proc. 11th ISCA Speech Synthesis Workshop (SSW 11) ; SSW 11th ISCA Speech Synthesis Workshop ; https://hal.archives-ouvertes.fr/hal-03362000 ; SSW 11th ISCA Speech Synthesis Workshop, Aug 2021, Budapest, Hungary. pp.13-18, ⟨10.21437/SSW.2021-3⟩ ; https://ssw11.hte.hu/ (2021)
16 |
Which Hype for my New Task? Hints and Random Search for Reservoir Computing Hyperparameters
In: ICANN 2021 - 30th International Conference on Artificial Neural Networks ; https://hal.inria.fr/hal-03203318 ; ICANN 2021 - 30th International Conference on Artificial Neural Networks, Sep 2021, Bratislava, Slovakia (2021)
17 |
Canary Song Decoder: Transduction and Implicit Segmentation with ESNs and LTSMs
In: https://hal.inria.fr/hal-03203374 ; 2021 (2021)
18 |
Which Hype for my New Task? Hints and Random Search for Reservoir Computing Hyperparameters
In: https://hal.inria.fr/hal-03203318 ; 2021 (2021)
19 |
Canary Song Decoder: Transduction and Implicit Segmentation with ESNs and LTSMs
In: ICANN 2021 - 30th International Conference on Artificial Neural Networks ; https://hal.inria.fr/hal-03203374 ; ICANN 2021 - 30th International Conference on Artificial Neural Networks, Sep 2021, Bratislava, Slovakia. pp.71-82, ⟨10.1007/978-3-030-86383-8_6⟩ ; https://link.springer.com/chapter/10.1007/978-3-030-86383-8_6 (2021)
20 |
On the use of Self-supervised Pre-trained Acoustic and Linguistic Features for Continuous Speech Emotion Recognition
In: IEEE Spoken Language Technology Workshop ; https://hal.archives-ouvertes.fr/hal-03003469 ; IEEE Spoken Language Technology Workshop, Jan 2021, Virtual, China (2021)