Hits 1 – 9 of 9

1	Learning emotions latent representation with CVAE for Text-Driven Expressive AudioVisual Speech Synthesis
	Dahmani, Sara; Colotte, Vincent; Girard, Valérian; Ouni, Slim
	In: ISSN: 0893-6080 ; Neural Networks ; https://hal.inria.fr/hal-03204193 ; Neural Networks, Elsevier, 2021, 141, pp.315-329. ⟨10.1016/j.neunet.2021.04.021⟩ (2021)
	Abstract: International audience ; Great improvement has been made in the field of expressive audiovisual Text-to-Speech synthesis (EAVTTS) thanks to deep learning techniques. However, generating realistic speech is still an open issue, and researchers in this area have lately been focusing on controlling speech variability. In this paper, we use different neural architectures to synthesize emotional speech. We study the application of unsupervised learning techniques to emotional speech modeling, as well as methods for restructuring the representation of emotions to make it continuous and more flexible. This manipulation of the emotional representation should allow us to generate new styles of speech by mixing emotions. We first present our expressive audiovisual corpus. We validate the emotional content of this corpus with three perceptual experiments using acoustic-only, visual-only and audiovisual stimuli. After that, we analyze the performance of a fully connected neural network in learning characteristics specific to different emotions for the phone duration aspect and the acoustic and visual modalities. We also study the contribution of joint and separate training of the acoustic and visual modalities to the quality of the generated synthetic speech. In the second part of this paper, we use a conditional variational auto-encoder (CVAE) architecture to learn a latent representation of emotions. We applied this method in an unsupervised manner to generate features of expressive speech. We used a probabilistic metric to compute the degree of overlap between the latent clusters of emotions in order to choose the best parameters for the CVAE. By manipulating the latent vectors, we were able to generate nuances of a given emotion and to generate new emotions that do not exist in our database. For these new emotions, we obtain a coherent articulation. We conducted four perceptual experiments to evaluate our findings.
	Keyword: [MATH.MATH-MG]Mathematics [math]/Metric Geometry [math.MG]; [SCCO.COMP]Cognitive science/Computer science; [SCCO.LING]Cognitive science/Linguistics; [SDV.OT]Life Sciences [q-bio]/Other [q-bio.OT]; [SHS.INFO]Humanities and Social Sciences/Library and information sciences; [SHS.LANGUE]Humanities and Social Sciences/Linguistics; [SPI.SIGNAL]Engineering Sciences [physics]/Signal and Image processing; [STAT.ML]Statistics [stat]/Machine Learning [stat.ML]; bidirectional long short-term memory (BLSTM); conditional variational auto-encoder; deep learning; emotion; Expressive audiovisual speech synthesis; Expressive talking avatar; facial expression
	URL: https://doi.org/10.1016/j.neunet.2021.04.021 https://hal.inria.fr/hal-03204193/document https://hal.inria.fr/hal-03204193 https://hal.inria.fr/hal-03204193/file/neural_networks_journal-8.pdf
	BASE

2	Audio-driven speech animation using recurrent neural network
	Ouni, Slim; Biasutto--Lervat, Théo; Dahmani, Sara
	In: https://hal.inria.fr/hal-03167213 ; United States, Patent n° : WO2021023861. 2021 (2021)
	BASE

3	Some consideration on expressive audiovisual speech corpus acquisition using a multimodal platform [<Journal>]
	Dahmani, Sara [author]; Colotte, Vincent [author]; Ouni, Slim [author]
	DNB Subject Category Language

4	Some consideration on expressive audiovisual speech corpus acquisition using a multimodal platform
	Dahmani, Sara; Colotte, Vincent; Ouni, Slim
	In: ISSN: 1574-020X ; EISSN: 1574-0218 ; Language Resources and Evaluation ; https://hal.archives-ouvertes.fr/hal-02907046 ; Language Resources and Evaluation, Springer Verlag, 2020, ⟨10.1007/s10579-020-09500-w⟩ ; https://link.springer.com/article/10.1007%2Fs10579-020-09500-w (2020)
	BASE

5	Audiovisual synthesis of expressive speech : modeling of emotions with deep learning ; Synthèse audiovisuelle de la parole expressive : modélisation des émotions par apprentissage profond
	Dahmani, Sara. - : HAL CCSD, 2020
	In: https://hal.inria.fr/tel-03079349 ; Informatique [cs]. Université de Lorraine, 2020. Français. ⟨NNT : 2020LORR0137⟩ (2020)
	BASE

6	Modeling Labial Coarticulation with Bidirectional Gated Recurrent Networks and Transfer Learning
	Biasutto--Lervat, Théo; Dahmani, Sara; Ouni, Slim
	In: INTERSPEECH 2019 - 20th Annual Conference of the International Speech Communication Association ; https://hal.inria.fr/hal-02175780 ; INTERSPEECH 2019 - 20th Annual Conference of the International Speech Communication Association, Sep 2019, Graz, Austria (2019)
	BASE

7	Conditional Variational Auto-Encoder for Text-Driven Expressive AudioVisual Speech Synthesis
	Dahmani, Sara; Colotte, Vincent; Girard, Valérian...
	In: INTERSPEECH 2019 - 20th Annual Conference of the International Speech Communication Association ; https://hal.inria.fr/hal-02175776 ; INTERSPEECH 2019 - 20th Annual Conference of the International Speech Communication Association, Sep 2019, Graz, Austria (2019)
	BASE

8	Acoustic and Visual Analysis of Expressive Speech: A Case Study of French Acted Speech
	Ouni, Slim; Colotte, Vincent; Dahmani, Sara...
	In: Interspeech 2016 ; https://hal.inria.fr/hal-01398528 ; Interspeech 2016, ISCA, Nov 2016, San Francisco, United States. pp.580 - 584, ⟨10.21437/Interspeech.2016-730⟩ ; http://www.interspeech2016.org (2016)
	BASE

9	Is markerless acquisition of speech production accurate?
	Ouni, Slim; Dahmani, Sara
	In: ISSN: 0001-4966 ; EISSN: 1520-8524 ; Journal of the Acoustical Society of America ; https://hal.inria.fr/hal-01315579 ; Journal of the Acoustical Society of America, Acoustical Society of America, 2016, EL234, 139 (6), ⟨10.1121/1.4954497⟩ ; http://scitation.aip.org/content/asa/journal/jasael (2016)
	BASE
