Hits 1 – 4 of 4
1
Evaluating the Potential Gain of Auditory and Audiovisual Speech-Predictive Coding Using Deep Learning
Hueber, Thomas; Tatulli, Eric; Girin, Laurent; Schwartz, Jean-Luc
In: ISSN: 0899-7667 ; EISSN: 1530-888X ; Neural Computation ; https://hal.archives-ouvertes.fr/hal-03016083 ; Neural Computation, Massachusetts Institute of Technology Press (MIT Press), 2020, 32 (3), pp.596-625. ⟨10.1162/neco_a_01264⟩ (2020)
Abstract:
Sensory processing is increasingly conceived in a predictive framework in which neurons would constantly process the error signal resulting from the comparison of expected and observed stimuli. Surprisingly, few data exist on the accuracy of predictions that can be computed in real sensory scenes. Here, we focus on the sensory processing of auditory and audiovisual speech. We propose a set of computational models based on artificial neural networks (mixing deep feedforward and convolutional networks), which are trained to predict future audio observations from present and past audio or audiovisual observations (i.e., including lip movements). Those predictions exploit purely local phonetic regularities with no explicit call to higher linguistic levels. Experiments are conducted on the multispeaker LibriSpeech audio speech database (around 100 hours) and on the NTCD-TIMIT audiovisual speech database (around 7 hours). The predictions appear to be efficient in a short temporal range (25–50 ms), predicting 50% to 75% of the variance of the incoming stimulus, which could result in potentially saving up to three-quarters of the processing power. Then they quickly decrease and almost vanish after 250 ms. Adding information on the lips slightly improves predictions, with a 5% to 10% increase in explained variance. Interestingly, the visual gain vanishes more slowly, and the gain is maximum for a delay of 75 ms between image and predicted sound.
Keyword:
[INFO.INFO-AI]Computer Science [cs]/Artificial Intelligence [cs.AI]; [SCCO.LING]Cognitive science/Linguistics; [SPI.SIGNAL]Engineering Sciences [physics]/Signal and Image processing; [STAT.ML]Statistics [stat]/Machine Learning [stat.ML]
URL:
https://hal.archives-ouvertes.fr/hal-03016083/document
https://hal.archives-ouvertes.fr/hal-03016083/file/Hueber.pdf
https://doi.org/10.1162/neco_a_01264
https://hal.archives-ouvertes.fr/hal-03016083
BASE
2
Deeppredspeech: Computational Models Of Predictive Speech Coding Based On Deep Learning ...
Hueber, Thomas; Tatulli, Eric; Girin, Laurent. - : Zenodo, 2018
BASE
3
DeepPredSpeech: computational models of predictive speech coding based on deep learning ...
Hueber, Thomas; Tatulli, Eric; Girin, Laurent. - : Zenodo, 2018
BASE
4
DeepPredSpeech: computational models of predictive speech coding based on deep learning ...
Hueber, Thomas; Tatulli, Eric; Girin, Laurent. - : Zenodo, 2018
BASE