1 |
Unsupervised quantification of entity consistency between photos and text in real-world news ...
Müller-Budack, Eric. - : Hannover : Institutionelles Repositorium der Leibniz Universität Hannover, 2022
BASE
2 |
COSMO-Onset: A Neurally-Inspired Computational Model of Spoken Word Recognition, Combining Top-Down Prediction and Bottom-Up Detection of Syllabic Onsets
In: ISSN: 1662-5137 ; Frontiers in Systems Neuroscience ; https://hal.archives-ouvertes.fr/hal-03318691 ; Frontiers in Systems Neuroscience, Frontiers, 2021, 15, pp.653975. ⟨10.3389/fnsys.2021.653975⟩ (2021)
BASE
3 |
Online activation of L1 Danish orthography enhances spoken word recognition of Swedish
In: ISSN: 0332-5865 ; Nordic Journal of Linguistics ; https://hal-amu.archives-ouvertes.fr/hal-03283527 ; Nordic Journal of Linguistics, 2021, pp.1-19. ⟨10.1017/S0332586521000056⟩ (2021)
BASE
4 |
Hand-gesture recognition based on EMG and event-based camera sensor fusion: a benchmark in neuromorphic computing
In: ISSN: 1662-4548 ; EISSN: 1662-453X ; Frontiers in Neuroscience ; https://hal.archives-ouvertes.fr/hal-02617084 ; Frontiers in Neuroscience, Frontiers, 2020, pp.36 ; https://www.frontiersin.org/ (2020)
BASE
7 |
Neuroplasticity in Visual Word Recognition: An Exploration of Learning-Related Behavioural and Neural Changes ...
BASE
8 |
The temporal dynamics of first and second language processing: ERPs to spoken words in Mandarin-English bilinguals
In: Brain and Mind Institute Researchers' Publications (2020)
BASE
9 |
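Traitement neuronal des voix et familiarité : entre reconnaissance et identification du locuteur [Neural processing of voices and familiarity: between speaker recognition and speaker identification]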
BASE
10 |
The Role of Surface and Underlying Forms When Processing Tonal Alternations in Mandarin Chinese: A Mismatch Negativity Study
BASE
11 |
Challenges in Audio Processing of Terrorist-Related Data
In: International Conference on Multimedia Modeling ; https://hal.archives-ouvertes.fr/hal-02415176 ; International Conference on Multimedia Modeling, Springer, Jan 2019, Thessaloniki, Greece (2019)
BASE
12 |
Challenges in Audio Processing of Terrorist-Related Data
In: International Conference on Multimedia Modeling ; https://hal.archives-ouvertes.fr/hal-02387373 ; International Conference on Multimedia Modeling, Springer, Jan 2019, Thessaloniki, Greece (2019)
BASE
13 |
Does the prosodic emphasis of sentential context cause deeper lexical-semantic processing?
In: ISSN: 2327-3798 ; EISSN: 2327-3801 ; Language, Cognition and Neuroscience ; https://hal.univ-lille.fr/hal-01917002 ; Language, Cognition and Neuroscience, Taylor and Francis, 2019, 34, pp.29-42. ⟨10.1080/23273798.2018.1499945⟩ (2019)
BASE
14 |
Acoustic event, spoken keyword and emotional outburst detection
BASE
15 |
Event Structure In Vision And Language
In: Publicly Accessible Penn Dissertations (2019)
BASE
16 |
Investigating the Electrophysiology of Long-Term Priming in Spoken Word Recognition
In: ETD Archive (2018)
BASE
17 |
Non-linguistic Vocalization Recognition Based on Convolutional, Long Short-Term Memory, Deep Neural Networks
Qiu, Liang. - : eScholarship, University of California, 2018
In: Qiu, Liang. (2018). Non-linguistic Vocalization Recognition Based on Convolutional, Long Short-Term Memory, Deep Neural Networks. UCLA: Electrical Engineering 0303. Retrieved from: http://www.escholarship.org/uc/item/1pz29229 (2018)
Abstract:
Non-linguistic vocalization recognition refers to the detection and classification of non-speech vocal sounds such as laughter, sneezing, coughing, crying, and screaming. It can be seen as a subtask of Acoustic Event Detection (AED). Previous research has made great progress in increasing the accuracy of AED. On the front end, many kinds of features have been explored, including Mel-Frequency Cepstral Coefficients (MFCCs), Gammatone Cepstral Coefficients (GTCCs), and other hand-crafted features. On the back end, models and methods such as Gaussian Mixture Models (GMMs), Hidden Markov Models (HMMs), Bags-of-Audio-Words (BoAW), Support Vector Machines (SVMs), and various types of neural networks have been tried. Recent research on Automatic Speech Recognition (ASR) and Acoustic Scene Classification (ASC) shows the advantage of using Convolutional, Long Short-Term Memory, Deep Neural Networks (CLDNNs) for audio processing tasks. In this thesis, I build a non-linguistic vocalization recognition system using CLDNNs. Log Mel-filterbank coefficients are adopted as input features, and data augmentation methods such as random shifting and noise mixture are discussed. The system is evaluated on a custom dataset collected from several sources and tested for real-time application. The performance of CLDNNs for non-linguistic vocalization recognition is also compared with hybrid GMM-SVMs, Convolutional Neural Networks, Long Short-Term Memory networks, and a fully connected Deep Neural Network trained on VGGish embeddings. The results indicate that CLDNNs outperform the other models in classification precision and recall. Visualizations of the CLDNNs are presented to help understand the framework. The model proves accurate and fast enough for real-time applications.
Keyword:
acoustic event detection; Artificial intelligence; CLDNNs; Computer science; Electrical engineering; non-linguistic vocalization recognition
URL: http://www.escholarship.org/uc/item/1pz29229
BASE
20 |
Abstract Concepts and Pictures of Real-World Situations Activate One Another.
In: Psychology Publications (2018)
BASE