1 |
Quality Assurance of Generative Dialog Models in an Evolving Conversational Agent Used for Swedish Language Practice ...
Abstract:
Due to the migration megatrend, efficient and effective second-language acquisition is vital. One proposed solution involves AI-enabled conversational agents for person-centered interactive language practice. We present results from ongoing action research targeting quality assurance of proprietary generative dialog models trained for virtual job interviews. The action team elicited 38 requirements, and we designed automated test cases for the 15 of particular interest to the evolving solution. Our results show that six of the test case designs can detect meaningful differences between candidate models. While quality assurance of natural language processing applications is complex, we provide initial steps toward an automated framework for machine learning model selection in the context of an evolving conversational agent. Future work will focus on model selection in an MLOps setting. ... : Accepted for publication in the Proc. of the 1st International Conference on AI Engineering, 2022 ...
Keywords:
Artificial Intelligence cs.AI; Computation and Language cs.CL; FOS Computer and information sciences; Software Engineering cs.SE
URL: https://dx.doi.org/10.48550/arxiv.2203.15414 https://arxiv.org/abs/2203.15414
BASE
2 |
Slangvolution: A Causal Analysis of Semantic Change and Frequency Dynamics in Slang ...
3 |
Similarity between person roles in a card sorting experiment ...
4 |
SPT-Code: Sequence-to-Sequence Pre-Training for Learning Source Code Representations ...
6 |
Ensemble of Opinion Dynamics Models to Understand the Role of the Undecided in the Vaccination Debate ...
9 |
Generating Authentic Adversarial Examples beyond Meaning-preserving with Doubly Round-trip Translation ...
10 |
Pirá: A Bilingual Portuguese-English Dataset for Question-Answering about the Ocean ...
11 |
A comparative study of several parameterizations for speaker recognition ...
12 |
A Neural Pairwise Ranking Model for Readability Assessment ...
13 |
A bilingual approach to specialised adjectives through word embeddings in the karstology domain ...
14 |
Speaker verification in mismatch training and testing conditions ...
15 |
Universal Conditional Masked Language Pre-training for Neural Machine Translation ...
16 |
SMDT: Selective Memory-Augmented Neural Document Translation ...
17 |
Learning How to Translate North Korean through South Korean ...
18 |
When do Contrastive Word Alignments Improve Many-to-many Neural Machine Translation? ...
19 |
Conditional Bilingual Mutual Information Based Adaptive Training for Neural Machine Translation ...