1. Speaker Attentive Speech Emotion Recognition
In: Proceedings of Interspeech 2021 ; https://hal.archives-ouvertes.fr/hal-03554368 ; Interspeech 2021, Aug 2021, Brno, Czech Republic. pp.2866-2870, ⟨10.21437/interspeech.2021-573⟩ (2021)

2. Sequence-To-Sequence Voice Conversion using F0 and Time Conditioning and Adversarial Learning
In: https://hal.archives-ouvertes.fr/hal-03569597 ; 2021 (2021)

3. Beyond Voice Identity Conversion: Manipulating Voice Attributes by Adversarial Learning of Structured Disentangled Representations
In: https://hal.archives-ouvertes.fr/hal-03569608 ; 2021 (2021)

4. Beyond Voice Identity Conversion: Manipulating Voice Attributes by Adversarial Learning of Structured Disentangled Representations ...

5. Sequence-To-Sequence Voice Conversion using F0 and Time Conditioning and Adversarial Learning ...

6. Att-HACK: An Expressive Speech Database with Social Attitudes
In: Speech Prosody ; https://hal.archives-ouvertes.fr/hal-02508362 ; Speech Prosody, May 2020, Tokyo, Japan (2020)

7. Sequence-to-Sequence Modelling of F0 for Speech Emotion Conversion
In: IEEE International Conference on Acoustics, Speech, and Signal Processing ; https://hal.sorbonne-universite.fr/hal-02018439 ; IEEE International Conference on Acoustics, Speech, and Signal Processing, May 2019, Brighton, United Kingdom (2019)

8. The annotation of syllabic prominences and disfluencies
In: Rhapsodie: A prosodic and syntactic treebank for spoken French. Amsterdam: Benjamins ; https://hal.archives-ouvertes.fr/hal-03324669 ; in Lacheret, A., Kahane, S. & Pietrandrea, P. (eds). Rhapsodie: A prosodic and syntactic treebank for spoken French. Amsterdam: Benjamins, pp.157-173, 2019 (2019)

9. Automatic Modelling and Labelling of Speech Prosody: What's New with SLAM+?
In: International Congress of Phonetic Sciences (ICPhS) ; https://hal.sorbonne-universite.fr/hal-02119926 ; International Congress of Phonetic Sciences (ICPhS), Aug 2019, Melbourne, Australia (2019)

10. At the Interface of Speech and Music: A Study of Prosody and Musical Prosody in Rap Music
In: Speech Prosody ; https://hal.sorbonne-universite.fr/hal-01722009 ; Speech Prosody, Jun 2018, Poznan, Poland (2018)

11. Score-Informed Syllable Segmentation for Jingju a Cappella Singing Voice with Mel-Frequency Intensity Profiles
In: International Workshop on Folk Music Analysis ; https://hal.sorbonne-universite.fr/hal-01513160 ; International Workshop on Folk Music Analysis, Jun 2017, Malaga, Spain (2017)

12. Score-Informed Syllable Segmentation For Jingju A Cappella Singing Voice With Mel-Frequency Intensity Profiles ...

13. Score-Informed Syllable Segmentation For Jingju A Cappella Singing Voice With Mel-Frequency Intensity Profiles ...

14. Score-Informed Syllable Segmentation For Jingju A Cappella Singing Voice With Mel-Frequency Intensity Profiles ...

15. Vers une modélisation continue de la structure prosodique: le cas des proéminences syllabiques [Towards a continuous modelling of prosodic structure: the case of syllabic prominences]

16. A Source/Filter Model with Adaptive Constraints for NMF-based Speech Separation
In: International Conference on Acoustics, Speech, and Signal Processing ; https://hal.sorbonne-universite.fr/hal-01294681 ; International Conference on Acoustics, Speech, and Signal Processing, Mar 2016, Shanghai, China (2016)

17. Similarity Search of Acted Voices for Automatic Voice Casting
In: ISSN: 2329-9290 ; EISSN: 2329-9304 ; IEEE/ACM Transactions on Audio, Speech and Language Processing ; https://hal.sorbonne-universite.fr/hal-01464715 ; IEEE/ACM Transactions on Audio, Speech and Language Processing, Institute of Electrical and Electronics Engineers, 2016, 24 (9), pp.1642-1651. ⟨10.1109/TASLP.2016.2580302⟩ (2016)

18. Symbolic Modeling of Prosody: From Linguistics to Statistics
In: ISSN: 1558-7916 ; IEEE Transactions on Audio, Speech and Language Processing ; https://hal.archives-ouvertes.fr/hal-01164602 ; IEEE Transactions on Audio, Speech and Language Processing, Institute of Electrical and Electronics Engineers, 2015, 23 (3), pp.588-599. ⟨10.1109/TASLP.2014.2387389⟩ (2015)

19. Exploiting Alternatives for Text-To-Speech Synthesis: From Machine to Human
In: Speech Prosody in Speech Synthesis: Modeling and Generation of Prosody for High Quality and Flexible Speech Synthesis ; https://hal.archives-ouvertes.fr/hal-01164642 ; Springer Berlin Heidelberg. Speech Prosody in Speech Synthesis: Modeling and Generation of Prosody for High Quality and Flexible Speech Synthesis, pp.189-202, 2015, Prosody, Phonology and Phonetics, 978-3-662-45258-5. ⟨10.1007/978-3-662-45258-5_13⟩ (2015)

20. The Role of Glottal Source Parameters for High-Quality Transformation of Perceptual Age
In: International Conference on Acoustics, Speech, and Signal Processing (ICASSP) ; https://hal.archives-ouvertes.fr/hal-01164562 ; International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Apr 2015, Brisbane, Australia (2015)
Abstract:
The intuitive control of voice transformation (e.g., age/sex, emotions) is useful to extend the expressive repertoire of a voice. This paper explores the role of glottal source parameters for the control of voice transformation. First, the SVLN speech synthesizer (Separation of the Vocal-tract with the Liljencrants-Fant model plus Noise) is used to represent the glottal source parameters (and thus, voice quality) during speech analysis and synthesis. Then, a simple statistical method is presented to control speech parameters during voice transformation: a GMM is used to model the speech parameters of a voice, and regressions are then used to adapt the GMM's statistics (mean and variance) to a control parameter (e.g., age/sex, emotions). A subjective experiment conducted on the control of perceptual age proves the importance of the glottal source parameters for the control of voice transformation, and shows the efficiency of the statistical model in controlling voice parameters while preserving the high quality of the voice transformation.
Keywords:
[INFO.INFO-SD]Computer Science [cs]/Sound [cs.SD]; [SHS.LANGUE]Humanities and Social Sciences/Linguistics; [SPI.SIGNAL]Engineering Sciences [physics]/Signal and Image processing; [STAT.ML]Statistics [stat]/Machine Learning [stat.ML]; glottal source and vocal tract; statistical modelling; voice transformation
URL: https://hal.archives-ouvertes.fr/hal-01164562 https://hal.archives-ouvertes.fr/hal-01164562/document https://hal.archives-ouvertes.fr/hal-01164562/file/index.pdf
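
Illustrative sketch (not taken from the paper): the abstract of entry 20 describes adapting Gaussian statistics (mean and variance) to a control parameter by regression. The snippet below shows one way this idea could look for a single one-dimensional Gaussian component. Everything in it is an assumption made for illustration: the function names, the toy numbers, the choice of a linear regression, and the choice to regress the log-variance so that predicted variances stay positive. The paper's actual model, features, and regression form are not given in this listing.

```python
import math

def fit_line(xs, ys):
    """Ordinary least-squares fit of y ~ slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def fit_gaussian_adaptation(controls, means, variances):
    """Regress a component's mean and log-variance on the control parameter."""
    mean_line = fit_line(controls, means)
    # Assumption: regress log-variance so the adapted variance is always > 0.
    logvar_line = fit_line(controls, [math.log(v) for v in variances])
    return mean_line, logvar_line

def adapt_gaussian(control, mean_line, logvar_line):
    """Predict the (mean, variance) of the component for a new control value."""
    mean = mean_line[0] * control + mean_line[1]
    variance = math.exp(logvar_line[0] * control + logvar_line[1])
    return mean, variance

# Hypothetical toy data: one speech parameter observed for four speaker ages.
ages = [20.0, 40.0, 60.0, 80.0]
means = [180.0, 160.0, 140.0, 120.0]
variances = [math.exp(0.01 * a) for a in ages]

model = fit_gaussian_adaptation(ages, means, variances)
mean_50, var_50 = adapt_gaussian(50.0, *model)
```

In a full GMM, the same regression would be repeated per component and per feature dimension; that extension is likewise an assumption, not a description of the authors' implementation.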