1 |
Multimodal Clustering with Role Induced Constraints for Speaker Diarization ...
BASE
2 |
Neural Speech Decoding During Audition, Imagination and Production
In: IEEE (2021)
3 |
Automated Quality Assessment of Cognitive Behavioral Therapy Sessions Through Highly Contextualized Language Representations ...
4 |
End-to-End Neural Systems for Automatic Children Speech Recognition: An Empirical Study ...
5 |
A multispeaker dataset of raw and reconstructed speech production real-time MRI video and 3D volumetric images ...
6 |
Confusion2vec 2.0: Enriching Ambiguous Spoken Language Representations with Subwords ...
7 |
Annotation and Evaluation of Coreference Resolution in Screenplays ...
8 |
An Automated Quality Evaluation Framework of Psychotherapy Conversations with Local Quality Estimates ...
9 |
Deblurring for Spiral Real-Time MRI Using Convolutional Neural Networks ...
10 |
Derivation of Fitts' law from the Task Dynamics model of speech production ...
11 |
Feature Fusion Strategies for End-to-End Evaluation of Cognitive Behavior Therapy Sessions ...
Abstract:
Cognitive Behavioral Therapy (CBT) is a goal-oriented psychotherapy for mental health concerns implemented in a conversational setting with broad empirical support for its effectiveness across a range of presenting problems and client populations. The quality of a CBT session is typically assessed by trained human raters who manually assign pre-defined session-level behavioral codes. In this paper, we develop an end-to-end pipeline that converts speech audio to diarized and transcribed text and extracts linguistic features to code the CBT sessions automatically. We investigate both word-level and utterance-level features and propose feature fusion strategies to combine them. The utterance-level features include dialog act tags as well as behavioral codes drawn from another well-known talk psychotherapy called Motivational Interviewing (MI). We propose a novel method to augment the word-based features with the utterance-level tags for subsequent CBT code estimation. Experiments show that our new fusion ...
Keyword:
Audio and Speech Processing eess.AS; Computation and Language cs.CL; FOS Computer and information sciences; FOS Electrical engineering, electronic engineering, information engineering; Sound cs.SD
URL: https://arxiv.org/abs/2005.07809 https://dx.doi.org/10.48550/arxiv.2005.07809
12 |
Screenplay Quality Assessment: Can We Predict Who Gets Nominated? ...
15 |
An empirical analysis of information encoded in disentangled neural speaker representations ...
16 |
Sentence level estimation of psycholinguistic norms using joint multidimensional annotations ...
17 |
Variability in individual constriction contributions to third formant values in American English /ɹ/
In: J Acoust Soc Am (2020)
18 |
How an aglossic speaker produces an alveolar-like percept without a functional tongue tip
In: J Acoust Soc Am (2020)
19 |
Machine learning and natural language processing in psychotherapy research: Alliance as example use case
In: J Couns Psychol (2020)
20 |
Deblurring for Spiral Real-Time MRI Using Convolutional Neural Networks
In: Magn Reson Med (2020)